Arnt Gulbrandsen

Compatibility break number two (of n?)

A while ago I spent a day and a half fretting over a missing checkcast in a Java/JDK file before I finally solved it. Before I finally mostly solved it.

It didn't take long until checkcast returned to hit me again from another angle.

The Java List<E> includes a method called toArray(), which returns the contents of the list as an array. toArray() is older than Java's generics, so it returns Object[] rather than an E[]. This isn't a problem on its own, because implementations of List<E> are free to return an E[].
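To make the two flavours of toArray() concrete, here is a minimal sketch. The class and method names are made up for illustration; only List.toArray() and List.toArray(T[]) come from the JDK.

```java
import java.util.ArrayList;
import java.util.List;

public class ToArrayDemo {
    // The pre-generics method: its declared return type is Object[],
    // whatever E happens to be.
    static Object[] plain(List<String> list) {
        return list.toArray();
    }

    // The overload that takes an array: the caller supplies the element type,
    // so the result really is a String[].
    static String[] typed(List<String> list) {
        return list.toArray(new String[0]);
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        list.add("b");
        System.out.println(plain(list).length);
        System.out.println(typed(list).getClass().getSimpleName());
    }
}
```

The second form sidesteps the whole question, since the caller names the array type explicitly.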

The next part of the puzzle is ArrayList, which implements toArray() and returns an Object[]. It doesn't have to do that, but the source code uses Object[] for storage rather than E[]. The constructors could have called new E[], but they call new Object[] instead.

ArrayList.toArray() calls Arrays.copyOf(), so the object it returns actually is an Object[]. The third, and critical, part of the puzzle is that ArrayList is used all over the JDK. Code equivalent to String[] a = new ArrayList<String>().toArray() occurs in many places, and of course works with Sunracle's JVMs.
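A small sketch of how such code can get by on a stock JVM, assuming the JDK code in question sits behind generics (that is an assumption on my part). Under erasure, a cast to T[] is checked against Object[] only, while a direct cast of the same array to String[] is checked for real. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ErasedCastDemo {
    // Unchecked cast: T erases to Object, so at run time this is at most
    // a check against Object[], which an Object[] passes.
    @SuppressWarnings("unchecked")
    static <T> Class<?> classAfterUncheckedCast(List<T> list) {
        T[] a = (T[]) list.toArray();
        return a.getClass();
    }

    // A direct cast to String[] is a real check against String[].
    // Returns true if that check rejects the array.
    static boolean directCastFails(List<String> list) {
        Object[] raw = list.toArray();
        try {
            String[] s = (String[]) raw;
            return s == null; // not reached if the cast throws
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>();
        list.add("a");
        System.out.println(classAfterUncheckedCast(list).getSimpleName());
        System.out.println(directCastFails(list));
    }
}
```

On a standard JVM the first method reports Object[] and the second cast throws, which is why a compiler that keeps the real element type around sees a problem that erasure hides.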

Sometimes the JDK code includes a checkcast (but no exception handling), sometimes there's no checkcast. Either way it works… except for me.

What to do? On one hand this detail is very, very difficult for me, because it's a minor side effect of extremely important invariants in my compiler. On the other, I cannot very well refuse to compile the JDK and its two thousand instances of ArrayList, and right now this issue is making every application fail during startup.

I decided (after writing this and pacing the corridor for a good long while) to make checkcast copy the array, if the array's contents are acceptable to the new type. That seems to be the least bad option I have, but of course it means that == behaves differently from real JVMs, because the cast actually returns a copy.
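The idea can be sketched in plain Java. This is only an illustration, not the compiler's actual code: copyingCast is a hypothetical helper, and the choice to skip the copy when the array already has the target type is my assumption. Arrays.copyOf() with a target class does the element check, throwing ArrayStoreException for an element that does not fit.

```java
import java.util.Arrays;

public class CopyingCastDemo {
    // Hypothetical stand-in for a copying checkcast: returns a new array of
    // the target type if every element fits, otherwise fails like a cast.
    static <T> T[] copyingCast(Object[] source, Class<T[]> target) {
        if (target.isInstance(source)) {
            return target.cast(source); // assumption: no copy when types already match
        }
        try {
            return Arrays.copyOf(source, source.length, target);
        } catch (ArrayStoreException e) {
            throw new ClassCastException(
                "array contents not acceptable to " + target.getSimpleName());
        }
    }

    public static void main(String[] args) {
        Object[] original = { "a", "b" };
        String[] cast = copyingCast(original, String[].class);
        System.out.println(Arrays.equals(original, cast)); // same contents
        System.out.println(original == cast);              // but not the same object
    }
}
```

The second line printed is the visible cost: after the cast, the old and new references no longer compare equal with ==, and a write through one is not seen through the other.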

At present the Object[] and the copy share the same hashCode(). I'm not sure whether that's a good idea.