DISCOVERY

May 13th, 2018

Complexities of Generics and Arrays in Java

Java

Generics

Inheritance

Polymorphism

Usually when I work in Java generics are easy to reason about. They enforce types on a collection or a class I created. When declaring a class with a generic, an instance of that class can be parameterized with one (or more) element types. A simple example is an ArrayList, which is implemented with a generic parameter like so:

public class ArrayList<E> extends AbstractList<E> implements List<E> { ... }

Note that ArrayList extends AbstractList and implements List. AbstractList actually implements List itself, so the definition of implements <List> is not needed. It is only there for clarity and readability.

You can create an instance of an ArrayList with a type:

List<Number> numberList = new ArrayList<Number>(); // Java 7 introduced the diamond operator, allowing the // compiler to infer the type based on the definition List<Number> numberList2 = new ArrayList<>();

Easy right? But what happens if I change the generic assignment to Integer? My first thought was it should work, since Integer is a subclass of Number:

List<Number> n2 = new ArrayList<Integer>(); // Incompatible Types

This code actually fails, saying that Number and Integer are incompatible types. This is because of a unique implementation detail of generics. Generics are invariant, meaning that two generic type parameters are neither supertypes or subtypes of one another. That means List<Integer> is not a subtype of List<Number>, they are simply not equal. Therefore, the above assignment statement is invalid.

February 3rd, 2019

While generics are invariant by default in C#, they can also be covariant and contravariant. I wrote an article about Variance with C# Generics if you want to learn more.

It is important to note that object assignments still enable polymorphism when using generics. In the example above an ArrayList was still assigned to a List.

Covariant

When a programming language type system allows variables to be assigned any type less than (subtype) or equal to the variables declared type1. In Java, obviously a type List can be assigned to a variable defined as type List such as List<Object> list = new List<>();. This follows the second part of the rule when the types are equal. You can also assign any subtype of List, such as ArrayList. This follows the first part of the rule. Arrays in Java are also covariant such that the following assignment is legal: Number[] array = new Integer[1];

Contravariant

When a programming language type system allows variables to be assigned any type greater than (supertype) or equal to the variables declared type. Java is not contravariant and the compiler throws an error when a supertype is assigned to a declaration.

Invariant

When a type in a programming language is neither a supertype or subtype of any other type. In Java generics follow invariance, so a generic type assignment can only correspond to a generic type declaration of the same type. For example the following assignment is valid: List<Number> numberList = new ArrayList<Number>();. The work around for generics invariance is generic wildcards.

Unlike generics, arrays are covariant. This sounds nice but can lead to issues around type safety. Let's look at one of the issues with covariant arrays in Java:

// Will Compile and Run, but throws a java.lang.ArrayStoreException Number[] array = new Integer[1]; array[0] = 1.6;

What happens here is you can assign an array a value of any subtype. However, when the array was initialized it was given the type Integer. While 1.6 is a valid number, it is not a valid integer. Therefore we get an unfortunate exception at runtime. Note that the code above compiles cleanly - the error is only noticeable when running the program. At that point its too late!

Generics were designed (unlike arrays) to spot errors at compile time. Therefore type safety is ensured before running a program. The big difference between generics and arrays is that generics implement erasure and are non-reifiable while arrays are reified. Because of these differences arrays and generics do not mix well. In fact, if you try to mix them it will not work.

/* You can't mix generics and arrays - the following are illegal */ List<Number>[] arr1 = new ArrayList<>()[10]; List<?>[] array = new ArrayList<String>[10];

Reified

Something abstract that is made more concrete or real. In terms of programming languages a reified variable is one that knows its type at runtime and doesn't lose any type information at runtime. The type is 'real' while the program is executing. In Java, variables that are reified know their type and also enforce it at runtime. Arrays are reified - they enforce their type at runtime and don't complain at compile time.

Non-Reifiable

The opposite of reified - something abstract which is not made concrete. In a programming language a non-reifiable variable is one that does not know or enforce its type at runtime (its type is abstract, not concrete)2. It only knows it's type at compile time. Therefore, it looses information about itself when it is compiled. Although we don't know a generics type at runtime, we know its type safe because it passed a compile time check.

Erasure

The process of removing type constraints when code is compiled. When a variable is implemented with erasure, its type is checked at compile time and not at runtime3. At runtime it does not know what type it is. This is how generics are implemented in Java.

The fact that generics are non-reifiable means only type safe code can compile. It is impossible to check the type safety of generics at runtime! The sooner a developer knows something is wrong with their code the better. Generics follow this philosophy elegantly.

When dealing with generic wildcards a lot of interesting behavior occurs. There are two types of wildcards - unbounded and bounded. Unbounded wildcards are of any type, while bounded wildcards are of any type within a specified class hierarchy.

The problem with assigning generic instances to an unbounded wildcard type is its impossible to know the type at runtime (it could be an instance of any class!). This uncertainty means a lot of operations are not allowed since they are not type safe. Luckily errors for these scenarios occur at compile time. Take an example of unbounded wildcard lists:

List<?> uwList = new ArrayList<>(); List<?> uwList2 = new ArrayList<Integer>(); /* Invalid operation */ uwList.add("hi"); /* Invalid operation */ uwList2.add(1); List<?> uwList3 = new ArrayList<String>(List.of("hi", "there")); /* Invalid operation */ String hi = uwList3.get(0); /* Valid operations */ var hi = uwList3.get(0); Object there = uwList3.get(1);

In the previous code both adding to lists and assigning a list element to a type variable are disallowed. Adding to an unbounded generic list is illegal because its impossible to know if a value of the correct type is added. Similarly there is no guarantee that a declared variable type is equal to the type of an unbounded generic list. Therefore both these operations are disallowed.

However you can assign a list element to a variable of type Object or use Java 10's var keyword.

Interestingly unbounded wildcards are reified (unlike all other generics) in the sense that they don't lose any type information at runtime. However this does not mean they know their type at runtime. The truth is unbounded wildcards never know their type (even at compile time), so there is no type information to lose in the first place!4

Interesting behavior spills over into bounded wildcards:

List<? extends Number> n4 = new ArrayList<Number>(); List<? extends Number> n5 = new ArrayList<Integer>(); List<? extends Number> n6 = new ArrayList<Object>(); // ERROR: Incompatible Types List<? extends Number> numbers = new ArrayList<>(List.of(1,4,6)); /* Invalid operation */ numbers.add(1); /* Valid operation */ Number number = numbers.get(1);

The syntax List<? extends Number> defines a generic type of Number or any subtype of Number, such as Integer. In the code above assigning the subtype Integer is valid but assigning the supertype Object is not.

The confusing part is what comes next. Adding another integer to the list fails. This is because its impossible to guarantee that numbers is actually a list of Integer, even though it was assigned Integer values. If you think about it this makes sense. The list can be a Number or any subclass. This means that is could be an integer, but it also could be a double, float, or some other complex number type. There is simply no way of guaranteeing type safety, and once the code is compiled all type information is lost.

However, it is known that the value of a list element is of type Number or any of its subtypes. Therefore it is valid to assign a list element to type Number.

Another kind of unbounded wildcard works with supertypes:

List<? super Number> n7 = new ArrayList<Number>(); List<? super Number> n8 = new ArrayList<Integer>(); // ERROR: Incompatible Types List<? super Number> n8 = new ArrayList<Object>(); List<? super Number> nums = new ArrayList<>(List.of(5, 6, 7)); /* Valid operation */ nums.add(8); /* Invalid operation */ Number num = nums.get(1);

The syntax List<? super Number> defines a generic type of Number or any supertype of Number, such as Object. In the code above using the supertype Object for the assignment is valid but using the subtype Integer is not.

Adding to the list works without error because an Integer is always a Number or any of its supertypes (such as Object). Thanks to polymorphism, an Integer is a Number and an Object.

The type in the list still isn't guaranteed, and for that reason a list item can't be assigned to type Number. For example, if the list contains Object values the assignment statement is invalid - an Object is not a Number5.

There are a lot of different complexities with generics and how they differ from arrays. I never fully understood these quirks with generics, so I am happy I did a deep dive on the topic. All the code from this discovery is on GitHub.

[1] "Covariance and contravariance (computer science)", https://en.wikipedia.org/wiki/Covariance_and_contravariance_(computer_science)

[2] Joshua Bloch, Effective Java, 3rd ed (Boston, MA: Pearson, 2018), 127

[3] Bloch., 126

[4] "Java Generics - What's really in a unbounded wildcard?", https://stackoverflow.com/a/7796793

[5] "How can I add to List<? extends Number> data structures?", https://stackoverflow.com/a/2777297