Title
Waiting for C# 4.0: A casting problem in C# 3.5
Description
C# 3.5 has a limitation where generic classes don't necessarily conform to each other in the way that one would expect. This problem manifests itself classically in the following way:
<code>
class D { }
class E : D { }
class F : D { }
class Program
{
void ProcessListOfD(IList<D> list) { }
void ProcessListOfE(IList<E> list) { }
void ProcessSequenceOfD(IEnumerable<D> sequence) { }
void ProcessSequenceOfE(IEnumerable<E> sequence) { }
void Main()
{
var eList = new List<E>();
var dList = new List<D>();
ProcessListOfD(dList); // OK
ProcessListOfE(dList); // Compiler error, as expected
ProcessSequenceOfD(dList); // OK
ProcessSequenceOfE(dList); // Compiler error, as expected
ProcessListOfD(eList); // Compiler error, <hl>unexpected!</hl>
ProcessListOfE(eList); // OK
ProcessSequenceOfD(eList); // Compiler error, <hl>unexpected!</hl>
ProcessSequenceOfE(eList); // OK
}
}
</code>
Why are those two compiler errors unexpected? Why shouldn't a program be able to provide an <c>IList<E></c> where an <c>IList<D></c> is expected? Well, that's where things get a little bit complicated. At first, there seems to be no downside to allowing the assignment, since <c>E</c> can do everything expected of <c>D</c>. Further investigation, however, reveals a potential source of runtime errors.
Expanding on the example above, suppose <c>ProcessListOfD()</c> were to have the following implementation:
<code>
void ProcessListOfD(IList<D> list)
{
if (SomeCondition(list))
{
list.Add(new F());
}
}
</code>
With such an implementation, the call to <c>ProcessListOfD(eList)</c>, which passes an <c>IList<E></c>, would cause a runtime error if <c>SomeCondition()</c> were to return <c>true</c>, because an <c>F</c> cannot be added to a list of <c>E</c>. So, the dilemma is that allowing covariance and contravariance <i>may</i> result in runtime errors.
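C# arrays already behave this way, which makes them a convenient way to see the hazard in action. The sketch below is only an illustration: the <c>ArrayCovarianceDemo</c> class and its <c>StoreFails()</c> method are names invented here, and the example relies on the fact that arrays of reference types are covariant and checked at runtime.

```csharp
using System;

class D { }
class E : D { }
class F : D { }

// Hypothetical demo class, not part of the original example.
static class ArrayCovarianceDemo
{
    // Returns true if storing an F into an E[] viewed as a D[] is rejected.
    public static bool StoreFails()
    {
        D[] items = new E[1];   // allowed: arrays of reference types are covariant
        try
        {
            items[0] = new F(); // compiles, but the array really holds E elements
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;        // the runtime check catches the bad store
        }
    }

    static void Main()
    {
        Console.WriteLine(StoreFails());
    }
}
```

The assignment compiles because the static type is <c>D[]</c>; only the runtime knows that the array can hold nothing but <c>E</c> objects.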
A language design includes a balance of features that permit <i>good</i> expressiveness while restricting <i>bad</i> expressiveness. C# has implicit conversions, but requires potentially dangerous conversions to be made explicit with casts. Similarly, the seemingly obvious type-compatibility outlined in the first example is forbidden and requires a call to the <c>System.Linq.Enumerable.Cast<T>(this IEnumerable)</c> method instead. Other languages, most notably <a href="http://www.eiffel.com/">Eiffel</a>, have always allowed the logical conformance between generic types, at the risk of runtime errors.<fn>
Some of these limitations will be addressed in C# 4.0 with the introduction of covariance. See <a href="http://msdn.microsoft.com/en-us/library/dd233054(VS.100).aspx" source="MSDN">Covariance and Contravariance (C# and Visual Basic)</a> and <a href="http://blogs.msdn.com/charlie/archive/2008/10/28/linq-farm-covariance-and-contravariance-in-visual-studio-2010.aspx">LINQ Farm: Covariance and Contravariance in C# 4.0</a> for more information.
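As a rough sketch of what that looks like, assuming a compiler and framework at version 4.0 or later: <c>IEnumerable<T></c> is declared covariant there, so a sequence of <c>E</c> can be used as a sequence of <c>D</c>, while <c>IList<T></c> stays invariant because it also accepts <c>T</c> as input. The <c>VarianceDemo</c> class and its methods are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class D { }
class E : D { }

// Hypothetical demo class; requires C# 4.0 / .NET 4.0 or later.
static class VarianceDemo
{
    public static int CountSequenceOfD(IEnumerable<D> sequence)
    {
        return sequence.Count();
    }

    public static int Run()
    {
        var eList = new List<E> { new E(), new E() };

        // IEnumerable<out T> is covariant, so this assignment compiles.
        IEnumerable<D> sequence = eList;

        // IList<T> is still invariant, so this remains a compiler error:
        // IList<D> list = eList;

        return CountSequenceOfD(sequence);
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```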
<h>A (Partial) Solution for C# 3.5</h>
Until then, there's the aforementioned <c>System.Linq.Enumerable.Cast<T>(this IEnumerable)</c> method in the system library. However, that method, while very convenient, makes no effort to statically verify that the input and output types are compatible with one another. That is, a call such as the following is perfectly legal:
<code>
var numbers = new [] { 1, 2, 3, 4, 5 };
var objects = numbers.Cast< object>(); // OK
var strings = numbers.Cast< string>(); // <hl>runtime error when enumerated!</hl>
</code>
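Because <c>Cast()</c> is evaluated lazily, the failure only appears once the sequence is enumerated. The following sketch makes that visible; the <c>CastDemo</c> class and its method name are invented for the illustration.

```csharp
using System;
using System.Linq;

// Hypothetical demo class, not part of the original example.
static class CastDemo
{
    // Returns true if casting a sequence of int to string fails on enumeration.
    public static bool CastToStringFails()
    {
        var numbers = new[] { 1, 2, 3, 4, 5 };

        // No exception here: the casts have not been performed yet.
        var strings = numbers.Cast<string>();

        try
        {
            strings.ToList(); // enumerating performs the casts
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(CastToStringFails());
    }
}
```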
Instead of an unchecked cast, a method with a generic constraint on the input and output types would be much more appropriate in those situations where the program is simply avoiding the generic-typing limitation described in detail in the first section. The method below does the trick:
<code>
public static IEnumerable<TOutput> Convert<TInput, TOutput>(this IEnumerable<TInput> input)
where TInput : TOutput
{
if (input == null) { throw new ArgumentNullException("input"); }
if (input is IList<TOutput>) { return (IList<TOutput>)input; }
return input.Select(obj => (TOutput)(object)obj);
}
</code>
While it's entirely possible that the <c>Cast()</c> function from the LINQ library is more highly optimized, it's not as safe as the method above. A check with Redgate's <a href="http://www.red-gate.com/products/reflector/">Reflector</a> would probably reveal just how that method actually works. Correctness comes before performance, but YMMV.<fn>
The initial examples can now be rewritten to compile without casting:
<code>
ProcessListOfD(<hl>eList.Convert<E, D>().ToList()</hl>); // OK, passes a copy as a List<D>
ProcessListOfE(eList); // OK
ProcessSequenceOfD(<hl>eList.Convert<E, D>()</hl>); // OK
ProcessSequenceOfE(eList); // OK
</code>
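For reference, here is a minimal, self-contained sketch of how the pieces might fit together. The class names <c>EnumerableExtensions</c> and <c>ConvertDemo</c> are assumptions made for the example, and the extension method is reduced to its essential path. Note that both type arguments have to be written out, since the compiler cannot infer <c>TOutput</c>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class D { }
class E : D { }

// Hypothetical container class for the extension method.
static class EnumerableExtensions
{
    public static IEnumerable<TOutput> Convert<TInput, TOutput>(this IEnumerable<TInput> input)
        where TInput : TOutput
    {
        if (input == null) { throw new ArgumentNullException("input"); }
        return input.Select(obj => (TOutput)(object)obj);
    }
}

// Hypothetical demo class.
static class ConvertDemo
{
    public static int CountSequenceOfD(IEnumerable<D> sequence)
    {
        return sequence.Count();
    }

    public static int Run()
    {
        var eList = new List<E> { new E(), new E(), new E() };

        // Accepted: the constraint "E : D" is checked at compile time.
        return CountSequenceOfD(eList.Convert<E, D>());

        // Rejected by the compiler, unlike Cast<string>():
        // new[] { 1, 2 }.Convert<int, string>();
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```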
<h>One More Little Snag</h>
Unlike the <c>Enumerable.Cast()</c> method, which has no restrictions and can be used on any <c>IEnumerable</c>, there will be places where the compiler will not allow an application to use <c>Convert()</c>. This is because the generic constraint (<c>TInput</c> must conform to <c>TOutput</c>) is, in some cases, not statically provable, i.e. not provable at compile time. A concrete example is shown below:
<code>
abstract class A
{
public abstract IEnumerable<TResult> GetObject<TResult>();
}
class B<T> : A
{
public override IEnumerable<TResult> GetObject<TResult>()
{
return _objects.Convert<T, TResult>(); // <hl>Compile error!</hl>
}
private IList<T> _objects;
}
</code>
The example above does not compile because <c>T</c> does not provably conform to <c>TResult</c>. A generic constraint relating <c>TResult</c> to <c>T</c> cannot be applied because it would have to be applied to the original, abstract function, which knows nothing of <c>T</c>. In these cases, the application will be forced to use the <c>System.Linq.Enumerable.Cast<T>(this IEnumerable)</c> method instead.
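A sketch of that fallback might look as follows. The constructor and the <c>CastFallbackDemo</c> class are additions made only so that the example can run; the conversion is unchecked, so an incompatible <c>TResult</c> would only fail when the result is enumerated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

abstract class A
{
    public abstract IEnumerable<TResult> GetObject<TResult>();
}

class B<T> : A
{
    // Constructor added for the demo; not part of the original example.
    public B(IList<T> objects) { _objects = objects; }

    public override IEnumerable<TResult> GetObject<TResult>()
    {
        // Unchecked: the compiler cannot verify that T conforms to TResult.
        return _objects.Cast<TResult>();
    }

    private readonly IList<T> _objects;
}

// Hypothetical demo class.
static class CastFallbackDemo
{
    public static int Run()
    {
        A source = new B<string>(new List<string> { "a", "b" });
        return source.GetObject<object>().Count();
    }

    static void Main()
    {
        Console.WriteLine(Run());
    }
}
```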
<hr>
<ft>I've addressed this issue before in <a href="{app}view_article.php?id=820">Static-typing for languages with covariant parameters</a>, which reviewed the paper, <a href="http://data.earthli.com/news/attachments/entry/820/recast.pdf">Type-safe covariance: Competent compilers can catch all catcalls</a>, a proposal for statically identifying potential runtime errors and requiring them to be addressed with a <c>recast</c> definition. Similarly, another runtime plague---<c>null</c>-references---is also addressed in Eiffel, a feature extensively documented in the paper, <a href="http://se.ethz.ch/~meyer/publications/lncs/attached.pdf">Attached types and their application to three open problems of object-oriented programming</a>.</ft>
<ft>YMMV = "Your Mileage May Vary", but remember, Donald Knuth famously said: <iq>We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.</iq></ft>