ABD: Refactoring and refining an API

Published by marco on

Updated by marco on

We’ve been doing more internal training at Encodo lately and one topic that we’ve started to tackle is design for architecture/APIs. Even if you’re not officially a software architect—designing and building entire systems from scratch—every developer designs code, on some level.

[A]lways
[B]e
[D]esigning

There are broad guidelines about how to format and style code, about how many lines to put in a method, about how many parameters to use, and so on. We strive for Clean Code™.

But the most important part of code is to think about how you’re writing it and what you’re building. You shouldn’t write a single line without thinking of the myriad ways in which it must fit into existing code and the established patterns and practices.

We’ve written about this before, in the two-part series called “Questions to consider when designing APIs” (Part I and Part II). Those two articles comprise a long list of aspects of a design to consider.

First make a good design, then compromise to fit project constraints.

Your project defines the constraints under which you can design. That is, we should still have our designer caps on, but the options available are much more strictly limited.

But, frustrating as that might be, it doesn’t mean you should stop thinking. A good designer figures out what would be optimal, then adjusts the solution to fit the constraints. Otherwise, you’ll forget what you were compromising from—and your design skills either erode or never get better.

We’ve been calling this concept ABD—Always Be Designing.[1] Let’s take a closer, concrete look, using a recent issue in the schema migration for Quino. Hopefully, this example illustrates how even the tiniest detail is important.[2]

A bug in the schema migrator

We detected the problem when the schema migration generated an invalid SQL statement.

ALTER TABLE "punchclock__timeentry" ALTER COLUMN "personid" SET DEFAULT ;

As you can see, the default value is missing. It seems that there are situations where the code that generates this SQL is unable to correctly determine that a default value could not be calculated.

The code that calculates the default value is below.

result = Builder.GetExpressionPayload(
  null,
  CommandFormatHints.DefaultValue,
  new ExpressionContext(prop),
  prop.DefaultValueGenerator
);

To translate, there is a Builder that produces a payload. We’re using that builder to get the payload (SQL, in this case) that corresponds to the DefaultValueGenerator expression for a given property, prop.

This method is an extension method of the IDataCommandBuilder, reproduced below in full, with additional line-breaks for formatting:

public static string GetExpressionPayload<TCommand>(
  this IDataCommandBuilder<TCommand> builder,
  [CanBeNull] TCommand command,
  CommandFormatHints hints, 
  IExpressionContext context,
  params IExpression[] expressions)
{
  if (builder == null) { throw new ArgumentNullException("builder"); }
  if (context == null) { throw new ArgumentNullException("context"); }
  if (expressions == null) { throw new ArgumentNullException("expressions"); }

  return builder.GetExpressionPayload(
    command,
    hints,
    context,
    expressions.Select(
      e => new ExecutableQueryItem<IExecutableExpression>(new ExecutableExpression(e))
    )
  );
}

This method does no more than to package each item in the expressions parameter in an ExecutableQueryItem and call the interface method.

The problem isn’t immediately obvious. It stems from the fact that each ExecutableQueryItem can be marked as Handled. The extension method ignores this feature, and always returns a result. The caller is unaware that the result may correspond to an only partially handled expression.

Is there a quick fix?

Our first instinct is, naturally, to try to figure out how we can fix the problem.[3] In the code above, we could keep a reference to the executable items and then check if any of them were unhandled, like so:

var executableItems = expressions.Select(
  e => new ExecutableQueryItem<IExecutableExpression>(new ExecutableExpression(e))
);
var result = builder.GetExpressionPayload(command, hints, context, executableItems);

if (executableItems.Unhandled().Any())
{
  // Now what?
}

return result;
}

We can detect if at least one of the input expressions could not be mapped to SQL. But we don’t know what to do with that information.

Do we throw an exception? No, we can’t just do that. None of the callers are expecting an exception, so that’s an API change.[4]
Do we return null? What can we return to indicate that the input expressions could not be mapped? Here we have the same problem as with throwing an exception: all callers assume that the result can be mapped.

So there’s no quick fix. We have to change an API. We have to design.

Part of the result is missing

As with most bugs, the challenge lies not in knowing how to fix the bug, but in how to fix the underlying design problem that led to the bug. The problem is actually not in the extension method, but in the method signature of the interface method.

Instead of a single result, there are actually two results for this method call:

Can the given expressions be mapped to a string (the target representation)?
If so, what is that text?

Instead of a Get method, this is a classic TryGet method.

How to Introduce the Change

If this code is already in production, then you have to figure out how to introduce the bug fix without breaking existing code. If you already have consumers of your API, you can’t just change the signature and cause a compile error when they upgrade. You have to decorate the existing method with [Obsolete] and make a new interface method.

So we don’t change the existing method and instead add the method TryGetExpressionPayload() to IDataCommandBuilder.

What are the parameters?

Now, let’s figure out what the parameters are going to be.

The method called by the extension method above has a slightly different signature.[5]

string GetExpressionPayload(
  [CanBeNull] TCommand command, 
  CommandFormatHints hints,
  [NotNull] IExpressionContext context,
  [NotNull] IEnumerable<ExecutableQueryItem<IExecutableExpression>> expressions
);

That last parameter is a bit of a bear. What does it even mean? The signature of the extension method deals with simple IExpression objects—I know what those are. But what are ExecutableQueryItems and IExecutableExpressions?

As an author and maintainer of the data driver, I know that these objects are part of the internal representation of a query as it is processed. But as a caller of this method, I’m almost never going to have a list of these objects, am I?

Let’s find out.

Me: Hey, ReSharper, how many callers of that method are there in the entire Quino source?
ReSharper: Just one, Dave.[6]

So, we defined an API with a signature that’s so hairy no-one calls it except through an extension method that makes the signature more palatable. And it introduces a bug. Lovely.

We’ve now figured out that our new method should accept a sequence of IExpression objects instead of ExecutableQueryItem objects.

How’s the signature looking so far?

bool TryGetExpressionPayload(
  [CanBeNull] TCommand command, 
  CommandFormatHints hints,
  [NotNull] IExpressionContext context,
  [NotNull] IEnumerable<IExpression> expressions,
  out string payload
);

Are We Done?

Not quite. There are two things that are still wrong with this signature, both important.

Fix the Result Type

One problem is that the rest of the IDataCommandBuilder<TCommand> deals with a generic payload type and this method only works for builders where the target representation is a string. The Mongo driver, for example, uses MongoStorePayload and MongoRetrievePayload objects instead of strings and throws a NotSupportedException for this API.

That’s not very elegant, but the Mongo driver was forced into that corner by the signature. Can we do better? The API would currently require Mongo to always return false because our Mongo driver doesn’t know how to map anything to a string. But it could map to one of the aforementioned object representations.

If we change the out parameter type from a string to an object, then any driver, regardless of payload representation, has at least the possibility of implementing this API correctly.

Fix parameters

Another problem is that the order of parameters does not conform to the code style for Encodo.

We prefer to place all non-nullable parameters first. Otherwise, a call that passes null as the first parameter looks strange. The command can be null, so it should move after the two non-nullable parameters. If we move it all the way to the end, we can even make it optional.
Also, primitives should come after the references. (So hints should be third.)
Also, semantically, the call is getting the payload for the expressions not the context. The first parameter should be the target of the method; the rest of the parameters provide context for that input.
The original method accepted params IExpression[]. Using params allows a caller to provide zero or more expressions, but it’s only allowed on the terminal parameter. Instead, we’ll accept an IEnumerable<IExpression>, which is more standard for the Quino library anyway.

The final method signature is below.

bool TryGetExpressionPayload(
  [NotNull] IEnumerable<IExpression> expressions,
  [NotNull] IExpressionContext context,
  CommandFormatHints hints,
  out object payload,
  [CanBeNull] TCommand command = default(TCommand)
);

Our API in Action

The schema migration called the original API like this:

result = Builder.GetExpressionPayload(
  null,
  CommandFormatHints.DefaultValue,
  new ExpressionContext(prop),
  prop.DefaultValueGenerator
);

return true;

The call with the new API—and with the bug fixed—is shown below. The only non-functional addition is that we have to call ToSequence() on the first parameter (highlighted). Happily, though, we’ve fixed the bug and only include a default value in the field definition if one can actually be calculated.

object payload;
if (Builder.TryGetExpressionPayload(
  prop.DefaultValueGenerator.ToSequence(),
  new ExpressionContext(prop),
  CommandFormatHints.DefaultValue,
  out payload)
)
{
  result = payload as string ?? payload.ToString();

  return true;
}

One More Design Decision…

A good rule of thumb is that if you find yourself explaining something in detail, it might still be too complicated. In that light, the call to ToSequence() is a little distracting.[7] It would be nice to be able to map a single expression without having to pack it into a sequence.

So we have one more design decision to make: where do we add that method call? Directly to the interface, right? But the method for a single expression can easily be expressed in terms of the method we already have (as we saw above). It would be a shame if every implementor of the interface was forced to produce this boilerplate.

Since we’re using C#, we can instead extend the interface with a static method, as shown below (again, with more line breaks for this article):

public static bool TryGetExpressionPayload<TCommand>(
  [NotNull] this IDataCommandBuilder<TCommand> builder, // Extend the builder
  [NotNull] IExpression expression,
  [NotNull] IExpressionContext context,
  CommandFormatHints hints,
  out object payload,
  [CanBeNull] TCommand command = default(TCommand)
)
{
  return builder.TryGetExpressionPayload(
    expression.ToSequence(),
    context,
    hints,
    out payload,
    command
  );
}

We not only avoided cluttering the interface with another method, but now a caller with a single expression doesn’t have to create a sequence for it[8], as shown in the final version of the call below.

object payload;
if (Builder.TryGetExpressionPayload(
  prop.DefaultValueGenerator,
  new ExpressionContext(prop),
  CommandFormatHints.DefaultValue,
  out payload)
)
{
  result = payload as string ?? payload.ToString();

  return true;
}

Conclusion

We saw in this post how we always have our designer/architect cap on, even when only fixing bugs. We took a look at a quick-fix and then backed out and realized that we were designing a new solution. Then we covered, in nigh-excruciating detail, our thought process as we came up with a new solution.

Many thanks to Dani for the original design and Sebastian for the review!

[1] This is a bit of a riff on ABC—Always Be Closing—as popularized by Alec Baldwin in the movie Glengarry Glen Ross.↩

[2] Also, understand that it took much longer to write this blog post and itemize each individual step of how we thought about the issue. In reality, we took only a couple of minutes to work through this chain of reasoning and come up with the solution we wanted. It was only after we’d finished designing that I realized that this was a good example of ABD.↩

[3] Actually, our first instinct is to make sure that there is a failing test for this bug. But, this article deals with how to analyze problems and design fixes, not how to make sure that the code you write is tested. That’s super-important, too, though, just so you know. Essential, even.↩

[4] Even though C# doesn’t include the exceptions thrown in the signature of a method, as Java does. Where the Java version is fraught with issues, see the “Recoverable Errors: Type-Directed Exceptions” chapter of Midori: The Error Model by Joe Duffy for a really nice proposal/implementation of a language feature that includes expected exceptions in the signature of a method.↩

[5] Which is why we defined the extension method in the first place.↩

[6] I’m fully aware that my name isn’t Dave. It’s just what ReSharper calls me. Old-school reference.↩

[7] This was pointed out, by the way, by a reviewer of this blog post and escaped the notice of both designers and the code-reviewer. API design is neither easy nor is it done on the first try. It’s only finished after multiple developers have tried it out. Then, you’ll probably be able to live with it.↩

[8] Most developers would have used new [] { expression }, which I think is kind of ugly.↩

Comments

#1 − Nice post!

Marc –