October 14, 2013

Java Generic Array Creation Error

Filed under: Technical — James Bunton @ 6:26 pm

Java Generics are a neat hack to save some typing. They have a bunch of problems though, most of which are due to the lack of runtime type information. Today a friend asked why you can’t do:

new ArrayList<String>[10]

The compiler rejects this with the message “error: generic array creation”.

What follows is a short explanation of what generics can and cannot do in Java.

Reify My Generics!

Neal Gafter’s blog: Reified Generics for Java is a good article on the problem and a possible solution.

Firstly, what do I mean by lack of runtime type information? Imagine you have a simple class:

public class MyWrapper<T> {
    private T thing;

    public MyWrapper() {
        thing = new T();
        // This is not allowed. To do this we'd need to be able to look up
        // the class 'T' at runtime to call the constructor.
        // Java doesn't store this information, so this cannot compile.
    }

    public T getThing() {
        return thing;
    }
}
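
One common way around this is to have the caller pass in something that knows how to build a T. The sketch below assumes Java 8 or later and uses java.util.function.Supplier; the names SupplierDemo and SuppliedWrapper are made up for illustration and are not part of the original class.

```java
import java.util.function.Supplier;

class SupplierDemo {
    // Hypothetical variant of MyWrapper: the caller provides the constructor,
    // so nothing about T has to be looked up at runtime.
    static class SuppliedWrapper<T> {
        private final T thing;

        SuppliedWrapper(Supplier<T> factory) {
            thing = factory.get();
        }

        T getThing() {
            return thing;
        }
    }

    public static void main(String[] args) {
        // StringBuilder::new stands in for the 'new T()' we could not write.
        SuppliedWrapper<StringBuilder> w = new SuppliedWrapper<>(StringBuilder::new);
        w.getThing().append("hello");
        System.out.println(w.getThing()); // prints "hello"
    }
}
```

The cost is that every call site has to say how to construct the value, which is exactly the information the erased T cannot provide.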

As an aside, you can do this in C++ because for each different T that you use with MyWrapper the compiler will generate a completely separate class. It effectively copy/pastes your code for you and then binds to the appropriately duplicated constructor at compile time. This has advantages and disadvantages of course. Suffice to say I think C++ templates are both far more awesome and far more terrible than Java generics :)

What Good are They Then?

So what do generics do for you then? They save you from explicitly casting, and in some cases stop you from introducing bugs that would cause a ClassCastException to be thrown later in your code.

ArrayList a = new ArrayList();
a.add(new Foo());

// now in another part of the software we accidentally insert the wrong type
a.add("hello world");

// now much later in a completely different module of the software we may do this
for (int i = 0; i < a.size(); ++i) {
    Foo x = (Foo)a.get(i); // oh noes! ClassCastException!
    x.doThing();
}

With Java Generics:

ArrayList<Foo> a = new ArrayList<Foo>();
a.add(new Foo());
a.add("hello world"); // bam! Compile error!

Foo x = a.get(0); // slightly less typing, that's always nice.

So we lose by typing a little more when declaring our variables*. We gain by having compile-time type safety on insertion into a list and by not having to type the explicit cast when reading an element. Overall it’s an improvement.

* Note that from Java 7 onwards you can use the diamond syntax:

ArrayList<Map<String, Foo>> a = new ArrayList<>();

So Why Can’t I Have an Array with Generic Types?

import java.util.ArrayList;


// Simple class hierarchy

class Animal { }
class Dog extends Animal { }


class Shape<T> { }
class Square<T> extends Shape<T> { }


public class Foo {
    public static void main(String[] args) {
        // Arrays allow subclasses to be stored inside them.
        Animal[] a = new Animal[10];
        a[0] = new Animal();
        a[1] = new Dog();


        // The array will type-check this at runtime. This is necessary
        // because 'oa' and 'a' refer to the same object. Somebody with
        // a reference to 'a' will not expect to see Strings in their array.
        //
        // Note that the declared type on the LHS (Object[]) is what the
        // compiler checks, but the actual array type created on the RHS
        // (Animal[]) is what gets checked at runtime.
        Object[] oa = a;
        oa[2] = "foo"; // compiles, but throws ArrayStoreException at runtime


        // Generics require exact type matches. However this is checked 
        // only at compile time!
        Shape<Animal> b = new Shape<Animal>();
        b = new Square<Animal>(); // generic type matches, this is ok
        b = new Shape<Dog>();  // not allowed, parameterised type is different
        b = new Square<Dog>(); // not allowed, parameterised type is different


        // This is perfectly safe because 'ob' is a copy of the reference
        // to 'b'. I can change 'ob' without affecting 'b' at all.
        Object ob = b;
        ob = "foo";


        // Java does not allow 'new Shape<Animal>[10]'. Pretend that it did.
        Shape<Animal>[] c = new Shape<Animal>[10];
        c[0] = new Shape<Animal>();

        Object[] oc = c;
        oc[1] = new Shape<Animal>(); // this is fine.
        oc[2] = "foo"; // this would throw ArrayStoreException, no problems
        oc[3] = new Shape<Dog>(); // should throw ArrayStoreException, but can't

        // If the oc[3] case above was allowed then you could be surprised
        // if you accessed c[3] and found a Shape<Dog> which shouldn't be there.
        // The compiler cannot check it at insert-time because Object[] is
        // allowed to contain anything.
        // The runtime cannot check it at insert-time because there's no
        // way to distinguish Shape<Animal> from Shape<Dog> at runtime.
        // So you're left with a possible class-cast exception if you try
        // to use your Shape<Dog> as if it were a Shape<Animal>.
    }
}
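
The “pretend that it did” scenario can actually be reproduced today by sneaking a generic array in through an unchecked cast. The following is a minimal sketch of that; the class and method names are invented for the example, and it uses List<String> and List<Integer> instead of Shape so that it is self-contained.

```java
import java.util.ArrayList;
import java.util.List;

class HeapPollutionDemo {
    // Returns true if reading through the typed array reference fails.
    @SuppressWarnings("unchecked")
    static boolean readFailsWithClassCast() {
        // Creating a raw array and casting it compiles, with an unchecked warning.
        List<String>[] c = (List<String>[]) new List[2];

        Object[] oc = c;
        List<Integer> ints = new ArrayList<>();
        ints.add(42);
        // No ArrayStoreException: at runtime the array only knows it holds Lists.
        oc[0] = ints;

        try {
            // The compiler-inserted cast to String is where things finally blow up.
            String s = c[0].get(0);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readFailsWithClassCast()); // prints "true"
    }
}
```

The failure shows up at the read, far away from the bad store, which is the surprise that the ban on generic array creation is meant to prevent.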

A Few Other Surprises

ClassCastException by Calling ArrayList.get()

The whole point of generics is that you can’t get a ClassCastException at runtime. Of course you still can if you try hard enough.

ArrayList<Animal> d = new ArrayList<Animal>();
ArrayList od = d; // raw type: the element type is no longer checked
od.add("hello world"); // only an unchecked warning at compile time
Animal d0 = d.get(0); // will throw ClassCastException
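
If you would rather fail at the insertion than at the read, java.util.Collections.checkedList wraps a list with a runtime type check. This is only a sketch of that idea: it uses Integer in place of Animal so that it stands alone, and the class and method names are made up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CheckedListDemo {
    // Returns true if the bad insert is rejected immediately.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static boolean insertIsRejected() {
        // The Class token gives the wrapper the runtime type that erasure removed.
        List<Integer> d = Collections.checkedList(new ArrayList<Integer>(), Integer.class);
        List od = d; // raw reference, as in the example above
        try {
            od.add("hello world");
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(insertIsRejected()); // prints "true"
    }
}
```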

No Safe Cast to List<Object>

I know this looks like a contrived case, but it came up in real life when writing code to marshal and unmarshal objects sent over the network. So it was actually realistic to have to handle arbitrary types.

public void doStringThings(Object x) {
    if (x instanceof String) {
        String y = (String)x; // ok
        System.out.println(y.substring(2));
    }
    else if (x instanceof List) {
        // Note that you cannot write 'x instanceof List<Object>' above
        // because Java cannot check the type parameter at runtime.

        // This cast is not safe because generic types must match exactly
        // and nothing verifies that at runtime. It always gives an
        // unchecked cast warning.
        List<Object> y = (List<Object>)x;

        // Wildcard generics do work though.
        List<? extends Object> z = (List<? extends Object>)x;
    }
}
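
Since only the elements themselves carry a runtime type, one way to stay warning-free is to treat the list as a List<?> and check each element as it comes out. This is a sketch under that assumption; stringsIn is a hypothetical helper, not part of the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListCastDemo {
    // Hypothetical helper: collects the Strings found in x, if x is a list.
    static List<String> stringsIn(Object x) {
        List<String> result = new ArrayList<>();
        // 'instanceof List<?>' is allowed because no type parameter needs checking.
        if (x instanceof List<?>) {
            for (Object element : (List<?>) x) {
                // Each element is checked individually, so this cast is safe.
                if (element instanceof String) {
                    result.add((String) element);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(stringsIn(Arrays.asList("a", 1, "b"))); // prints "[a, b]"
    }
}
```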

No Safe Cast From Object

Say I have a Map with Class as the key and Object as the value. There’s no way to get things from this map with a plain cast that the compiler will accept as safe.

private Map<Class<?>, Object> map;

public <T> T get1(Class<T> klass) {
    Object obj = map.get(klass);
    return (T)obj; // unchecked cast warning
}

public <T> T get2(Class<T> klass) {
    Object obj = map.get(klass);
    if (obj instanceof T) { // not valid because T is unavailable at runtime
        return (T)obj;
    }
    throw new RuntimeException("Failed to cast class " + klass.getName());
}

public <T> T get3(Class<T> klass) {
    Object obj = map.get(klass);
    // isInstance is false for null, so a missing key falls through to the throw
    if (klass.isInstance(obj)) {
        return (T)obj; // perfectly safe, but the compiler doesn't agree
    }
    throw new RuntimeException("Failed to cast class " + klass.getName());
}
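
For what it’s worth, java.lang.Class has a cast method that performs the same runtime check that get3 does by hand, and it compiles without a warning. The sketch below assumes the map is only ever filled through a matching put method; TypedMapDemo and its methods are invented names.

```java
import java.util.HashMap;
import java.util.Map;

class TypedMapDemo {
    private final Map<Class<?>, Object> map = new HashMap<>();

    // Tying the key and value types together here keeps the map consistent.
    <T> void put(Class<T> klass, T value) {
        map.put(klass, value);
    }

    <T> T get(Class<T> klass) {
        // Class.cast throws ClassCastException on a mismatch and lets null through.
        return klass.cast(map.get(klass));
    }

    public static void main(String[] args) {
        TypedMapDemo m = new TypedMapDemo();
        m.put(String.class, "hi");
        String s = m.get(String.class);
        System.out.println(s); // prints "hi"
    }
}
```

Note that this only helps when a Class object for the exact type exists; there is no Class object for List<String>, so parameterised values are still out of reach.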

Overloading Doesn’t Work

public class Foo {
    public void doThing(List x) { }
    public void doThing(List<String> x) { }
    public void doThing(List<Integer> x) { }
}

That’s not allowed because after type erasure all three methods have the same signature, doThing(List). The compiled class would have no way to tell them apart, so the compiler disallows it.

This is a big deal because it means you can’t do things like this:


public class Foo { }

public class Bar { }

public interface MessageListener<T> {
    public void handleMessage(T msg);
}

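// Not allowed: after erasure both are just MessageListener, so a class
// cannot implement the same generic interface with two different type arguments.
public class Impl implements MessageListener<Foo>, MessageListener<Bar> {
    public void handleMessage(Foo msg) { }

    public void handleMessage(Bar msg) { }
}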
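
A workaround that usually does the job is to give Impl one listener object per message type instead of implementing the interface twice. This is a sketch assuming Java 8 lambdas; the log field and the wrapper class ListenerDemo exist only to make the example observable and self-contained.

```java
import java.util.ArrayList;
import java.util.List;

class ListenerDemo {
    static class Foo { }
    static class Bar { }

    interface MessageListener<T> {
        void handleMessage(T msg);
    }

    // Hypothetical replacement for Impl: composition instead of two 'implements'.
    static class Impl {
        final List<String> log = new ArrayList<>();
        final MessageListener<Foo> fooListener = msg -> log.add("foo");
        final MessageListener<Bar> barListener = msg -> log.add("bar");
    }

    public static void main(String[] args) {
        Impl impl = new Impl();
        impl.fooListener.handleMessage(new Foo());
        impl.barListener.handleMessage(new Bar());
        System.out.println(impl.log); // prints "[foo, bar]"
    }
}
```

Whoever registers the listeners then passes impl.fooListener or impl.barListener explicitly, so the choice of handler is made at compile time rather than by overloading.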