Waiting for C# 4.0: A casting problem in C# 3.5

Published by marco on

Updated by marco on

C# 3.5 has a limitation where generic classes don’t necessarily conform to each other in the way that one would expect. This problem manifests itself classically in the following way:

class D { }
class E : D { }
class F : D { }

class Program
{
  void ProcessListOfD(IList<D> list) { }
  void ProcessListOfE(IList<E> list) { }
  void ProcessSequenceOfD(IEnumerable<D> sequence) { }
  void ProcessSequenceOfE(IEnumerable<E> sequence) { }

  void Main()
  {
    var eList = new List<E>();
    var dList = new List<D>();

    ProcessListOfD(dList); // OK
    ProcessListOfE(dList); // Compiler error, as expected
    ProcessSequenceOfD(dList); // OK
    ProcessSequenceOfE(dList); // Compiler error, as expected

    ProcessListOfD(eList); // Compiler error, unexpected!
    ProcessListOfE(eList); // OK
    ProcessSequenceOfD(eList); // Compiler error, unexpected!
    ProcessSequenceOfE(eList); // OK
  }
}

Why are those two compiler errors unexpected? Why shouldn’t a program be able to provide an IList<E> where an IList<D> is expected? Well, that’s where things get a little bit complicated. At first, it seems that there’s no downside to allowing the assignment (E can do everything expected of D, after all), but further investigation reveals a potential source of runtime errors.

Expanding on the example above, suppose ProcessListOfD() were to have the following implementation:

void ProcessListOfD(IList<D> list)
{
  if (SomeCondition(list))
  {
    list.Add(new F());
  }
}

With such an implementation, the call to ProcessListOfD(eList), which passes an IList<E>, would cause a runtime error if SomeCondition() were to return true, because an F would be added to a list that may hold only E objects. So, the dilemma is that allowing co- and contravariance without restriction may result in runtime errors.
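C# already has one place where this kind of conformance is allowed and checked only at runtime: arrays of reference types. The sketch below is not from the original example; the class name ArrayCovarianceDemo and the method StoreFails() are invented for illustration. It shows the same failure as the hypothetical ProcessListOfD() above, using an array instead of a list.

```csharp
using System;

class D { }
class E : D { }
class F : D { }

// Hypothetical demo class; the names are illustrative only.
static class ArrayCovarianceDemo
{
    // Returns true if storing an F in an E[] (viewed as a D[]) fails at runtime.
    public static bool StoreFails()
    {
        D[] items = new E[1];   // allowed: arrays of reference types are covariant
        try
        {
            items[0] = new F(); // compiles, but the runtime checks the element type
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;        // the store was rejected at runtime
        }
    }
}
```

Calling ArrayCovarianceDemo.StoreFails() returns true: the compiler accepts the assignment, and the error only surfaces when the program runs. This is the trade-off that the generic collection interfaces avoid by rejecting the conversion at compile time.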

A language design includes a balance of features that permit good expressiveness while restricting bad expressiveness. C# has implicit conversions, but requires potentially dangerous conversions to be made explicit with casts. Similarly, the obvious type-compatibility outlined in the first example is forbidden and requires a call to the System.Linq.Enumerable.Cast<T>(this IEnumerable) method instead. Other languages—most notably Eiffel—have always allowed the logical conformance between generic types, at the risk of runtime errors.[1]

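Some of these limitations will be addressed in C# 4.0 with the introduction of covariance and contravariance for generic interfaces and delegates. Read-only interfaces such as IEnumerable<T> become covariant, whereas IList<T>, which both reads and writes elements, remains invariant. See Covariance and Contravariance (C# and Visual Basic) (MSDN) and LINQ Farm: Covariance and Contravariance in C# 4.0 for more information.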
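As a rough sketch of what that change means for the first example, the following should compile with C# 4.0 targeting .NET 4, where IEnumerable<T> is declared as covariant. The class Covariance40Demo and its methods are made up for illustration and do not appear in the original example.

```csharp
using System.Collections.Generic;
using System.Linq;

class D { }
class E : D { }

// Hypothetical demo class; requires C# 4.0 and .NET 4 or later.
static class Covariance40Demo
{
    static int CountSequenceOfD(IEnumerable<D> sequence)
    {
        return sequence.Count();
    }

    public static int Run()
    {
        var eList = new List<E> { new E(), new E() };

        // IList<D> list = eList; // still a compile error: IList<T> remains invariant

        // Accepted because IEnumerable<out T> is covariant in T.
        return CountSequenceOfD(eList);
    }
}
```

Covariance40Demo.Run() returns 2. The list case stays forbidden because a caller could still add an F through an IList<D> reference.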

A (Partial) Solution for C# 3.5

Until then, there’s the aforementioned System.Linq.Enumerable.Cast<T>(this IEnumerable) method in the system library. However, that method, while very convenient, makes no effort to statically verify that the input and output types are compatible with one another. That is, a call such as the following is perfectly legal:

var numbers = new[] { 1, 2, 3, 4, 5 };
var objects = numbers.Cast<object>(); // OK
var strings = numbers.Cast<string>(); // compiles, but throws InvalidCastException when enumerated
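Note that Cast<T>() is evaluated lazily, so the bad cast is not detected on the line that calls it. The sketch below (CastDemo and CastToStringFails() are names invented here) shows one way to observe when the error actually occurs.

```csharp
using System;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
static class CastDemo
{
    // Returns true if enumerating the result of Cast<string>() throws.
    public static bool CastToStringFails()
    {
        var numbers = new[] { 1, 2, 3, 4, 5 };
        var strings = numbers.Cast<string>(); // no exception yet: execution is deferred

        try
        {
            strings.ToList(); // enumeration casts each element and fails on the first one
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }
}
```

CastDemo.CastToStringFails() returns true, which means the mistake may surface far from the call that introduced it.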

Instead of an unchecked cast, a method with a generic constraint on the input and output types would be much more appropriate in those situations where the program is simply avoiding the generic-typing limitation described in detail in the first section. The method below does the trick:

public static IEnumerable<TOutput> Convert<TInput, TOutput>(this IEnumerable<TInput> input)
  where TInput : TOutput
{
  if (input == null) { throw new ArgumentNullException("input"); }

  if (input is IList<TOutput>) { return (IList<TOutput>)input; }

  return input.Select(obj => (TOutput)(object)obj);
}

While it’s entirely possible that the Cast() function from the LINQ library is more highly optimized, it’s not as safe as the method above. A check with Redgate’s Reflector would probably reveal just how that method actually works. Correctness comes before performance, but YMMV.[2]

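The initial examples can now be rewritten to compile without casting. Because Convert() returns an IEnumerable<TOutput>, the call that expects an IList<D> needs a ToList() as well, which hands the method a copy of the list:

ProcessListOfD(eList.Convert<E, D>().ToList()); // OK, but passes a copy
ProcessListOfE(eList); // OK
ProcessSequenceOfD(eList.Convert<E, D>()); // OK
ProcessSequenceOfE(eList); // OK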
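For completeness, here is a minimal, self-contained sketch of how the pieces might fit together. Extension methods must live in a static class; the names EnumerableExtensions, ConvertDemo and CountSequenceOfD are assumptions made for this sketch, and the IList<TOutput> shortcut from the method above is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class D { }
class E : D { }

// Hypothetical container class for the extension method.
static class EnumerableExtensions
{
    public static IEnumerable<TOutput> Convert<TInput, TOutput>(this IEnumerable<TInput> input)
        where TInput : TOutput
    {
        if (input == null) { throw new ArgumentNullException("input"); }

        // The constraint guarantees that this cast cannot fail.
        return input.Select(obj => (TOutput)(object)obj);
    }
}

// Hypothetical demo class; the names are illustrative only.
static class ConvertDemo
{
    static int CountSequenceOfD(IEnumerable<D> sequence)
    {
        return sequence.Count();
    }

    public static int Run()
    {
        var eList = new List<E> { new E(), new E(), new E() };

        // var bad = new[] { 1, 2 }.Convert<int, string>(); // compile error: int does not conform to string

        return CountSequenceOfD(eList.Convert<E, D>());
    }
}
```

ConvertDemo.Run() returns 3. The commented-out line is the interesting part: the mistake that Cast<string>() only reports at runtime is rejected here by the compiler.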

One More Little Snag

Unlike the Enumerable.Cast<TResult>() method, which has no restrictions and can be used on any IEnumerable, there will be places where the compiler will not allow an application to use Convert<TInput, TOutput>(). This is because the generic constraint requires TInput to conform to TOutput, and that relationship is, in some cases, not statically provable (i.e. at compile time). A concrete example is shown below:

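abstract class A
{
  // Abstract members cannot be private, so this one is declared public
  public abstract IEnumerable<TResult> GetObject<TResult>();
}

class B<T> : A
{
  public override IEnumerable<TResult> GetObject<TResult>()
  {
    // Convert() requires T : TResult, which cannot be proven here
    return _objects.Convert<T, TResult>(); // Compile error!
  }

  private IList<T> _objects;
}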

The example above does not compile because T does not provably conform to TResult, which is what the constraint on Convert() (TInput : TOutput) requires. The necessary generic constraint cannot be added to the override because it would have to be declared on the original, abstract function, which knows nothing of T. In these cases, the application is forced to use System.Linq.Enumerable.Cast<T>(this IEnumerable) instead.
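A sketch of that fallback is shown below. It assumes the method returns IEnumerable<TResult> (the return type of Cast<T>()) and adds a constructor so that the example can be run. Both are assumptions made for illustration rather than details of the original classes.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

abstract class A
{
    public abstract IEnumerable<TResult> GetObject<TResult>();
}

class B<T> : A
{
    private readonly IList<T> _objects;

    // Hypothetical constructor, added so the sketch can be exercised.
    public B(IList<T> objects)
    {
        _objects = objects;
    }

    public override IEnumerable<TResult> GetObject<TResult>()
    {
        // Compiles for any TResult; the relationship between T and TResult
        // is checked only when the result is enumerated.
        return _objects.Cast<TResult>();
    }
}
```

With this version, new B<string>(new List<string> { "a", "b" }).GetObject<object>() yields both strings, whereas enumerating GetObject<int>() on the same instance throws an InvalidCastException. The static safety of Convert() is lost, but the code compiles.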

[1] I’ve addressed this issue before in Static-typing for languages with covariant parameters, which reviewed the paper, Type-safe covariance: Competent compilers can catch all catcalls, a proposal for statically identifying potential runtime errors and requiring them to be addressed with a recast definition. Similarly, another runtime plague, null references, is also addressed in Eiffel, a feature extensively documented in the paper, Attached types and their application to three open problems of object-oriented programming.

[2] YMMV = “Your Mileage May Vary”, but remember, Donald Knuth famously said: “We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.”