This page shows the source for this entry, with WebCore formatting language tags and attributes highlighted.

Title

Wildcard Generics

Description

<n>This article was originally published on the <a href="http://blogs.encodo.ch/news/view_article.php?id=6"><b>Encodo Blogs</b></a>. Browse on over to see more!</n>

<hr>

As of version 1.5, Java has blessed its developers with <i>generics</i>, which increase expressiveness through improved static typing. With generics, Java programmers should be able to get away from the "casting orgy" to which Java programming heretofore typically devolved. The implementation in 1.5 does not affect the JVM at all and is restricted to syntactic sugar wherein the compiler simply performs the casts for you.

Let's build a class hierarchy and see how much casting Java saves us. Assume that you have defined a generic hierarchy using the following class:

<code>
import java.util.ArrayList;
import java.util.List;

public class DataObject {
  private String name;
  private List<DataObject> subObjects = new ArrayList<DataObject>();

  public String getName() {
    return name;
  }

  public List<DataObject> getSubObjects() {
    return subObjects;
  }
}
</code>

Well, now that's an improvement! The class can express its intent in a relatively clear syntax without creating a specialized list class for the private field and result type. Assume further that there are various sub-classes of this <c>DataObject</c>, which want to provide type-specific helper functions for their sub-lists. For example:

<code>
public class A extends DataObject {
}

public class B extends DataObject {
  // Intended to return the sub-objects typed as a list of A
  public List<A> getAs() {
    return getSubObjects();
  }
}
</code>

Though this is exactly what we would like, it won't compile. The compiler instead reports the error:

<div class="error">Type mismatch: Cannot convert from <c>List<DataObject></c> to <c>List<A></c></div>

In the next section, we'll find out why.

<h>Covariance and Catcalls</h>

For some reason, <c>List<A></c> does not conform to <c>List<DataObject></c>, even though <c>A</c> inherits from <c>DataObject</c>.
The <a href="http://java.sun.com/j2se/1.5/pdf/generics-tutorial.pdf">Generics Tutorial</a> (PDF) Section 3 explains:

<bq>In general, if Foo is a subtype (subclass or subinterface) of Bar, and G is some generic type declaration, it is not the case that G<Foo> is a subtype of G<Bar>. This is probably the hardest thing you need to learn about generics, because it goes against our deeply held intuitions.</bq>

Indeed it is hard to learn and indeed it does go against intuitions. Is there a more specific reason why generics are implemented in this way in Java? Java's competitor, C#, is limited in exactly the same way, and the <a href="http://download.microsoft.com/download/8/1/6/81682478-4018-48fe-9e5e-f87a44af3db9/SpecificationVer2.doc">C# Version 2.0 Specification</a> (DOC) or the <a href="http://209.85.129.104/search?q=cache:tD7gg3n5xkAJ:download.microsoft.com/download/8/1/6/81682478-4018-48fe-9e5e-f87a44af3db9/SpecificationVer2.doc">Google HTML version</a> offers the following explanation:

<bq>No special conversions exist between constructed reference types other than those described in §6. In particular, unlike array types, constructed reference types do not exhibit "covariant" conversions. This means that a type List<B> has no conversion (either implicit or explicit) to List<A> even if B is derived from A. Likewise, no conversion exists from List<B> to List<object>. The rationale for this is simple: if a conversion to List<A> is permitted, then apparently one can store values of type A into the list. But this would break the invariant that every object in a list of type List<B> is always a value of type B, or else unexpected failures may occur when assigning into collection classes.</bq>

The key word here is <i>covariance</i>. Neither Java nor C# supports it (except for return types, where there are no dangers involved) because of function calls that, in the <a href="http://eiffel.com">Eiffel</a> world, have long been called "catcalls".
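
The failure that both specifications guard against can be made concrete with Java arrays, which, like the C# arrays mentioned in the quote, do allow covariant conversions. What follows is only a minimal sketch and not part of the original article: the class name <c>CovariantArrayDemo</c> and the method <c>storeRejected</c> are hypothetical, and <c>A</c> and <c>B</c> are stand-ins for the two <c>DataObject</c> sub-classes defined earlier.

```java
public class CovariantArrayDemo {
    // Stand-ins for the article's hierarchy (hypothetical, for illustration only)
    public static class DataObject {}
    public static class A extends DataObject {}
    public static class B extends DataObject {}

    // Returns true if the JVM rejects storing a B into an array of A
    // that is being viewed through a DataObject[] reference.
    public static boolean storeRejected() {
        A[] as = new A[1];
        DataObject[] objects = as;   // allowed: arrays are covariant
        try {
            objects[0] = new B();    // compiles, but is checked at run time
            return false;
        } catch (ArrayStoreException e) {
            return true;             // the "unexpected failure" from the quote
        }
    }

    public static void main(String[] args) {
        System.out.println(storeRejected());
    }
}
```

The assignment compiles because the static type of the array is <c>DataObject[]</c>; only the run-time store check notices that the array really holds <c>A</c>s. Generic lists have no such run-time check, which is presumably why the conversion is refused at compile time instead.
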
Suffice it to say that both Java and C# have elected to limit expressiveness and legibility in order to prevent this type of error from happening.<fn>

<h>Making it work in Java</h>

Since Java has clearly stated that it neither condones nor supports what we would like to do, we can choose one of two options:

<ol>
Be happy with the <c>List<DataObject></c> and just go back to casting to get the desired type when needed
Figure out a way of getting Java to return the desired <c>List<A></c> without complaining
</ol>

Since we're stubborn, we'll go with (2) above and dig a little deeper into generics. One solution is to create the list on-the-fly and transfer all the elements over to it.

<code>
public List<A> getAs() {
  // Copy every sub-object into a new list, casting each one to A
  List<A> result = new ArrayList<A>();
  for (DataObject obj : getSubObjects()) {
    result.add((A) obj);
  }
  return result;
}
</code>

Mmmmm...lovely. It does the soul good and makes the heart swell with pride to write code like this. So clear and understandable---and such a lovely mix of new-style iteration with old-style casting! Methinks we'll try again. In the first attempt, we returned <c>List<DataObject></c> from <c>getSubObjects()</c>. Is there another result type we could use?

<h>Wildcards Explained</h>

Java's generics include something called <i>wildcards</i>, which allow a restricted form of covariance, in which the character ? acts as a placeholder for any class type at all. Wildcards are especially useful for function arguments, where they allow any list of elements to be passed. Imagine we wanted to pass in a list of <c>DataObjects</c> to a function to be printed. Using wildcards, we can write the following:

<code>
public void printCollection(Collection<?> _objects) {
  for (Object o : _objects) {
    System.out.println(o);
  }
}
</code>

The example above takes any collection at all and prints all of its elements. It only works because the compiler knows that any class that replaces ?
<i>must</i> inherit from <c>java.lang.Object</c>, so it can access any methods of that class from within the function. This is extremely limited since we can't access any <c>DataObject</c>-specific functions, so Java also includes <i>bounded wildcards</i>, which allow a wildcard to restrict the types of objects that may be used as the generic argument. Let's rewrite <c>printCollection</c> so that we can access <c>DataObject</c>'s members without casting:

<code>
public void printCollection(List<<span class="highlight">? extends DataObject</span>> _objects) {
  for (<span class="highlight">DataObject</span> o : _objects) {
    System.out.println(o<span class="highlight">.getName()</span>);
  }
}
</code>

Whereas this mechanism suffices for the example above, wildcards exact a hidden price: they do not conform to anything. That is, though <c>List<A></c> conforms to the formal parameter, <c>List<? extends DataObject></c>, you cannot then call <c>add()</c> on it. That is, the following code doesn't work:

<code>
public void extendCollection(List<? extends DataObject> _objects) {
  _objects.add(new DataObject());
}
</code>

The parameter of <c>_objects.add()</c> is of type <c>? extends DataObject</c>, which is completely unknown to the Java compiler. Therefore, nothing (other than <c>null</c>) conforms to it ... not even <c>DataObject</c> itself!

Using the example above, we can recap the different approaches to using generics in Java:

<ul>
Using <c>List<DataObject></c> as the formal argument doesn't allow us to pass a <c>List<A></c>.
Using <c>List<?></c> as the formal argument allows us to use only those functions defined in <c>java.lang.Object</c> on elements of the list.
Using <c>List<? extends DataObject></c> allows us to pass any list of elements whose type conforms to <c>DataObject</c>, but limits the methods that can be called on it.
</ul>

<h>Making It Work</h>

Let's return now to our original example and see if we can't apply our new-found knowledge to find a solution.
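
Before going on, the three cases from the recap above can be pinned down in one small compilable sketch. It is an illustration only: the class <c>WildcardRecap</c> and the methods <c>countExact</c>, <c>countAny</c> and <c>joinNames</c> are hypothetical names, and <c>DataObject</c> is given a constructor here purely so that the example has names to show.

```java
import java.util.ArrayList;
import java.util.List;

public class WildcardRecap {
    // Simplified stand-ins for the article's classes (hypothetical)
    public static class DataObject {
        private final String name;
        public DataObject(String name) { this.name = name; }
        public String getName() { return name; }
    }
    public static class A extends DataObject {
        public A(String name) { super(name); }
    }

    // Case 1: accepts exactly List<DataObject>; a List<A> is rejected
    public static int countExact(List<DataObject> objects) {
        return objects.size();
    }

    // Case 2: accepts any list, but elements are only known to be Objects
    public static int countAny(List<?> objects) {
        int count = 0;
        for (Object o : objects) {
            count++;
        }
        return count;
    }

    // Case 3: accepts List<A>, and elements can be read as DataObjects
    public static String joinNames(List<? extends DataObject> objects) {
        StringBuilder names = new StringBuilder();
        for (DataObject o : objects) {
            if (names.length() > 0) {
                names.append(",");
            }
            names.append(o.getName());
        }
        // objects.add(new DataObject("x")); // does not compile
        return names.toString();
    }

    public static void main(String[] args) {
        List<A> as = new ArrayList<A>();
        as.add(new A("first"));
        as.add(new A("second"));
        // countExact(as); // does not compile: List<A> is not a List<DataObject>
        System.out.println(countAny(as));
        System.out.println(joinNames(as));
    }
}
```

The two commented-out lines are the ones the compiler refuses; removing the comment markers reproduces the errors described above.
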
Let's redefine the result type of the <c>getSubObjects()</c> function to use a wildcard, while leaving the result type of the <c>getAs()</c> function, defined in <c>B</c>, as it was.

<code>
public List<<span class="highlight">? extends </span>DataObject> getSubObjects() {
  return subObjects;
}
</code>

However, as we saw in the third case above, this return type uses an unknown (unknowable) generic type, so callers can no longer add elements to the list using <c>add()</c>. Not exactly what we were looking for. Let's instead put it back the way it was and concentrate on using our newfound knowledge to <i>cast</i> (Yay! Casting! I knew you'd be back!) our result to the correct type. Here's a naive attempt:

<code>
public List<A> getAs() {
  return <span class="highlight">(List<A>)</span> getSubObjects();
}
</code>

Ok. From the discussion above, it's clear this won't work and the compiler rewards us with the following error message:

<div class="error">Cannot cast from <c>List<DataObject></c> to <c>List<A></c></div>

Fine, let's try again, this time throwing a wildcard into the mix:

<code>
public List<A> getAs() {
  return (List<A>) <span class="highlight">(List<? extends DataObject>)</span> getSubObjects();
}
</code>

Sweet! It compiles! We're definitely on the home stretch now, but there's still a warning from the compiler:

<div class="caution">Type safety: the cast from <c>List<capture-of ? extends DataObject></c> to <c>List<A></c> is actually checking against the erased type list.</div>

This is Java's way of saying that you have done a complete end-run around its type-checking. The <iq>erased type list</iq> is actually <c>List</c> because the compiler uses a strategy called <i>erasure</i><fn> to resolve generic references. The double cast in the example above compiles (and will run), but cannot be statically checked. At this point, there's nothing more we can do, so we admit defeat the Java way and slap a <c>SuppressWarnings</c> annotation on the function and continue on our way.
<code>
<span class="highlight">@SuppressWarnings("unchecked")</span>
public List<A> getAs() {
  return (List<A>) (List<? extends DataObject>) getSubObjects();
}
</code>

It's clear that the decision to avoid covariance at all costs has cost the language dearly in terms of expressiveness (and, as a result, type-safety, as evidenced by the casting in the final example). It takes rather a lot of illegible code to express what, at the beginning of the article, seemed a rather simple concept.

<hr>

<ft>Since the Pascal days, it seems that popular, mainstream languages almost always opt for compiler simplicity over programmer expressiveness. <a href="http://earthli.com/news/view_article.php?id=820">Static-typing for languages with covariant parameters</a> offers a more in-depth example of covariance. For more information on this issue and other ways of addressing it---without putting the burden on the programmer---see the paper, <a href="http://www.inf.ethz.ch/~meyer/ongoing/covariance/recast.pdf" title="Type-safe covariance: Competent compilers can catch all cat-calls">Type-safe covariance</a>, which offers both an in-depth look at the "problem" of covariance and a concrete solution (which has since been implemented in Eiffel).</ft>

<ft>This quick overview of <a href="http://java.sun.com/docs/books/tutorial/java/generics/erasure.html" source="Sun">Type Erasure</a> explains the concept. The reason for this relatively naive implementation of generics is---as almost always in Java---backwards compatibility: <iq>so that new code may continue to interface with legacy code</iq></ft>

<n>Using Java 1.5</n>
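
As a closing illustration of what the suppressed warning in the final <c>getAs()</c> leaves unchecked, the following sketch puts a non-<c>A</c> element into the sub-object list and then reads it back through <c>getAs()</c>. It is an assumption-laden example rather than part of the original article: the wrapper class <c>UncheckedCastDemo</c> and the method <c>readFailsLate</c> are hypothetical, and <c>DataObject</c> is trimmed down to the one field the example needs.

```java
import java.util.ArrayList;
import java.util.List;

public class UncheckedCastDemo {
    // Trimmed-down version of the article's DataObject (hypothetical)
    public static class DataObject {
        private final List<DataObject> subObjects = new ArrayList<DataObject>();
        public List<DataObject> getSubObjects() { return subObjects; }
    }
    public static class A extends DataObject {}
    public static class B extends DataObject {
        // The double cast from the article; nothing is verified at run time here
        @SuppressWarnings("unchecked")
        public List<A> getAs() {
            return (List<A>) (List<? extends DataObject>) getSubObjects();
        }
    }

    // Returns true if reading a non-A element through getAs() fails
    // only at the point where the element is used as an A.
    public static boolean readFailsLate() {
        B b = new B();
        b.getSubObjects().add(new B());  // legal: a B is a DataObject
        List<A> as = b.getAs();          // succeeds; the list is not inspected
        try {
            A first = as.get(0);         // compiler-inserted cast to A happens here
            return first == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readFailsLate());
    }
}
```

The cast inside <c>getAs()</c> itself never fails, because after erasure it only checks for a plain <c>List</c>. The error surfaces later, wherever an element is first treated as an <c>A</c>, which is the practical cost of the type-safety loss mentioned above.
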