The annotation web standard

Published by marco on

The Web Annotation Working Group has published [t]hree recommendations to enable annotations on the web.

What does that mean for you, as a consumer of the Internet?

This standard will bring commenting and conversation and annotation into the 21st century. It will possibly bring order to the myriad systems and accounts and formats currently in place that allow for interaction and discussion.

The diagram Web Annotation Architecture (W3C) (SVG) is interactive and steps you through the whole concept. It’s quite easy to follow and worth your while if you participate in conversations on-line.

 Annotation Guide (SVG)

What this annotation standard doesn’t do is address the problem that most people are without facts, have little to no reasoning capacity, have big, dumb opinions, and even bigger mouths. No-one can fix that.

Separating annotations from content

When you read an article online, you currently see comments only from that site’s annotation system. While the site owner can moderate the comments, you cannot. If there are 4000 garbage-filled comments, you have to plow through them if you want to be involved in the discussion. There are various mechanisms for voting and rating to try to get useful content more attention, but it’s still often a burden.

The standard aims to put more power into the hands of the consumer by allowing users to display annotations that come from other sources. The idea is not to have a centralized provider (as it is now) but to allow a user to choose which annotation providers they want to enable on their browsers or on certain sites.
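To make the idea of user-chosen providers concrete, here is a minimal sketch of how a client might track which annotation providers are enabled, either everywhere or only on certain sites. Nothing in it comes from the standard itself: `ProviderSettings`, `visible_annotations`, the `provider` field and the provider names are all hypothetical, picked only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ProviderSettings:
    """Hypothetical per-user settings: providers enabled globally or per site."""

    global_providers: set[str] = field(default_factory=set)
    per_site: dict[str, set[str]] = field(default_factory=dict)

    def enabled_for(self, host: str) -> set[str]:
        # A provider is active on a host if it is enabled everywhere
        # or enabled specifically for that host.
        return self.global_providers | self.per_site.get(host, set())


def visible_annotations(annotations: list[dict], settings: ProviderSettings, host: str) -> list[dict]:
    """Keep only annotations that come from a provider the user has enabled."""
    enabled = settings.enabled_for(host)
    return [a for a in annotations if a["provider"] in enabled]


# Assumed example data; the provider names are made up.
settings = ProviderSettings(
    global_providers={"my-circle"},
    per_site={"youtube.com": {"reddit-truefilm"}},
)
annotations = [
    {"id": "a1", "provider": "my-circle", "text": "Worth reading."},
    {"id": "a2", "provider": "reddit-truefilm", "text": "Great cinematography."},
    {"id": "a3", "provider": "site-comments", "text": "first!!!"},
]
shown = visible_annotations(annotations, settings, "youtube.com")
```

The point of the sketch is only that the filter lives with the reader rather than with the site: disabling a provider is a change to the user's own settings.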

Under Your Control

Imagine if you had an annotation provider of just you and a circle of trusted confederates. When you went to a site, instead of seeing thousands of annotations from the unmoderated horde, you could elect to see annotations made only by members you trust.

Not only that, but the disparate and ad-hoc annotation systems, each with different authentication and authorization requirements as well as formats, could now be consolidated by the user (at least to some degree).

You could maybe even pay a few bucks a month to house your own annotations and re-publish them to other systems, like Facebook or Twitter. If those sites also provided annotations support, then you’d have your own copy of your annotations, but other people would see your annotations if they used the more popular providers from Facebook or Twitter. You’d have the best of both worlds.

If you use a Kindle reader or Instapaper (or both, like me), then you already have two annotation systems available. The Kindle stores to a text file, while Instapaper stores to a proprietary store online. Now imagine if your annotations for these systems were standardized and stored separately from both sites. You would make notes on the Kindle and they would be synced to the same place as notes made while visiting the page online or while reading in Instapaper. This sounds very good to me, at any rate, and probably to anyone with an academic bent.

Available Separately

The nice thing about this standard is that there is no need for buy-in on the part of the content providers.

Although a content provider can provide integration, a user is free to enable one or more external annotation systems. The user is free to do that today but the new standard provides a way of interaction between them that would allow this whole market to gain momentum.

If this idea were to spread, then annotations would become a separate cottage industry, with its own market and business model. Online news providers would no longer be obligated to waste time moderating and upgrading and protecting their own annotation/commenting systems. They could focus their time, effort and resources on journalism—or whatever passes for journalism at the New York Times.

With this separation, a user could pick content separately from the quality of the annotations. There would no longer be a notion of a site’s content being good, but the community bad. If you don’t like the community, switch annotation providers and seek one out with better moderation. Imagine if, instead of having YouTube commenters annotating videos, you could enable (for example) the annotations from /r/truefilm or /r/movies on Reddit instead. If Reddit provided annotations support to this standard, then this could happen automatically whenever you browsed to a YouTube video for which there was a Reddit thread on one of those sub-reddits.

A common UI is easier to use and manage

Another issue to address is that the annotations show up only at the end of the article, with no way of referencing a location in the content. Some annotation systems don’t even have threading, so comments are just stacked with no relation to one another. A common system would provide the same powerful UI regardless of what the content provider was willing to include in the web-site software.

There is the issue of scale: what if the original text is overwhelmed by annotations? That shouldn’t be a problem as you can turn annotations off at any time. If you have multiple sources, you can toggle them individually. Also, most annotations are replies to other annotations, so the number of root-level annotations is much smaller than the total and shouldn’t ever crowd out the content.

The threaded nature of annotations combined with control over annotation sources should ameliorate these issues. It’s certainly more power than you have today.[1]
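The scale argument can be sketched in a few lines: filter by the sources that are switched on, then count only the root-level annotations, since replies hang off of those. The `provider` and `reply_to` fields are assumptions made for this example, not names from the standard, and `summarize` is a hypothetical helper.

```python
def summarize(annotations, enabled_providers):
    """Return (root_count, total_count) for annotations from enabled providers.

    Each annotation is a dict with a hypothetical 'reply_to' field that is
    None for root-level annotations and the parent's id for replies.
    A reply whose parent comes from a disabled provider still counts toward
    the total but is not promoted to a root.
    """
    visible = [a for a in annotations if a["provider"] in enabled_providers]
    roots = [a for a in visible if a["reply_to"] is None]
    return len(roots), len(visible)


# Assumed example data: one short thread from a trusted circle, one public root.
annotations = [
    {"id": "a1", "provider": "my-circle", "reply_to": None},
    {"id": "a2", "provider": "my-circle", "reply_to": "a1"},
    {"id": "a3", "provider": "my-circle", "reply_to": "a2"},
    {"id": "b1", "provider": "public", "reply_to": None},
]

only_circle = summarize(annotations, {"my-circle"})
everything = summarize(annotations, {"my-circle", "public"})
nothing = summarize(annotations, set())
```

With every source toggled off, nothing is shown at all; with everything on, only the roots need to appear next to the text until a thread is expanded.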

Technical details

The working group has now officially been replaced by the Open Annotation Community Group. With this announcement, the data model, protocol and vocabulary have been approved by all stakeholders.

  • Data model: “JSON format for ease of creation and consumption of annotations based on the conceptual model that accommodates these use cases”
  • Protocol: “describes the transport mechanisms for creating and managing annotations in a method that is consistent with the Web Architecture and REST best practices”
  • Vocabulary: “the set of RDF classes, predicates and named entities that are used by the Web Annotation Data Model”
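As a rough illustration of the first two pieces, the sketch below builds a small annotation in the JSON shape described by the data model and wraps it in a request of the kind the protocol describes for creating an annotation. The property names (`type`, `motivation`, `body`, `target`, `TextQuoteSelector`) and the media type are taken from my reading of the recommendations; the container URL, the page URL and the helper names are hypothetical, and nothing is actually sent over the network.

```python
import json
from urllib.request import Request

# JSON-LD context published alongside the Web Annotation Data Model.
ANNO_CONTEXT = "http://www.w3.org/ns/anno.jsonld"


def make_annotation(target_url, quote, comment):
    """Build a comment on a quoted passage of a page (hypothetical helper)."""
    return {
        "@context": ANNO_CONTEXT,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {"type": "TextualBody", "value": comment, "format": "text/plain"},
        "target": {
            "source": target_url,
            # Identify the passage by its text rather than by page layout.
            "selector": {"type": "TextQuoteSelector", "exact": quote},
        },
    }


def build_create_request(container_url, annotation):
    """Prepare (but do not send) a POST that would create the annotation."""
    payload = json.dumps(annotation).encode("utf-8")
    headers = {"Content-Type": f'application/ld+json; profile="{ANNO_CONTEXT}"'}
    return Request(container_url, data=payload, headers=headers, method="POST")


# Both URLs below are placeholders, not real services.
annotation = make_annotation(
    "https://example.org/article",
    "annotations live separately from documents",
    "This is the key idea.",
)
request = build_create_request("https://annotations.example.org/my-notes/", annotation)
```

The vocabulary is what gives names like `Annotation` and `TextualBody` a shared meaning, which is why two providers that have never heard of each other could still read this same JSON.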

That this concept has been standardized is a very big deal if we can actually make it happen. The standard would allow not only vendors to develop their own systems but also browsers to provide native and fast implementations that use a common UI language in the browser itself.

A working implementation

If you click through to the article Annotation is now a web standard (Hypothesis), you can see a non-standard annotation system in action. It looks quite nice. The article describes the advantages of consolidation as well, cited below.

“While many applications, from PDF readers and Google Docs to the Kindle, support some kind of annotation functionality, what the W3C formalized yesterday is fundamentally different. The W3C architecture provides for a model where annotations live separately from documents and are reunited and reanchored in real-time whenever the relevant document is present. The benefit of this is that annotations now come under the control and election of the user, rather than at the sole discretion of the publisher.”
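The "reunited and reanchored" part is easiest to picture with a text-quote approach: the annotation remembers the quoted passage plus a bit of surrounding text, and the client looks for that passage again whenever the document is loaded. The sketch below is a deliberately naive version under that assumption. `anchor_text_quote` is a hypothetical helper, exact string matching stands in for whatever fuzzy matching a real client would use, and I am not claiming this is how the Hypothesis plugin does it.

```python
def anchor_text_quote(document_text, exact, prefix="", suffix=""):
    """Find the quoted passage in the current document text.

    Returns (start, end) offsets of the first occurrence of `exact` that is
    immediately preceded by `prefix` and followed by `suffix`, or None if
    the passage can no longer be found (for example, after the text changed).
    """
    start = document_text.find(exact)
    while start != -1:
        end = start + len(exact)
        if document_text[:start].endswith(prefix) and document_text[end:].startswith(suffix):
            return start, end
        # Context did not match; try the next occurrence of the quote.
        start = document_text.find(exact, start + 1)
    return None


text = "The cat sat. The cat ran."
# The suffix disambiguates between the two occurrences of "cat".
match = anchor_text_quote(text, "cat", suffix=" ran")
missing = anchor_text_quote(text, "dog")
```

When no match is found, the annotation is orphaned rather than lost: it still exists with its provider, it just has nowhere to attach on the current version of the page.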

The plugin in the linked article already exists and looks very nice and subdued. The diagram below shows the initial view.

 Embedding Annotations

Annotated text is highlighted, but you can toggle its display (with the “eye” icon on the right) for distraction-free reading. Controls on the side allow you to toggle the annotations panel and also show you a preview of the number of annotations available so far.

Scrolling down shows more annotations, below.

 Annotations Visualization

As you scroll, you can see how many annotations are available above and below the current point. These indicators help you locate areas that are being heavily discussed or annotated. Of course, using these to navigate without reading the whole article leaves you susceptible to manipulation, but it can still be a good feature if used wisely.[2]

If you select an annotation in the text or the scrollbar or by clicking the panel-toggle, the panel opens on the right.

 Annotations Panel

Having annotations to the right (or left) is a standard practice in much professional review or annotation software or printing (see Microsoft Word) and makes much more sense than below an article or paper.

Annotations are generally made in reference to a specific location in the text and should be in-line, so readers have context. This implementation does that and shows how browser vendors could build on this standard to improve the commenting experience considerably. The version shown above is clean and simple and powerful and enhances the article, as annotations should.

As content grows, the system grows with it, as shown below.

 Threaded Replies

Threaded replies are not shown by default, but can be shown with a click. Each annotation has its own unique identifier and can be shared and perma-linked. A system can use color hints—or maybe icons or gravatars or subtle name-tags—to allow the reader to differentiate annotations by user.

I think this is a great advancement and look forward to its adoption and expansion.


[1] See Yahoo Answers, YouTube or any of the dozens of other examples of comment-cesspools that are all but useless today. It can be rough out there, for both participants and site moderators. As the article Ten Years After by Scott H. Greenfield (Simple Justice) put it,

“It’s not nearly as much fun to write about the law when readers are nuts. It’s even less fun fending off the insane comments, here and by the geniuses on other social media, from reddit to the twitters, as if it’s my duty to explain why they suffer from paranoid delusions and pathological narcissism.”
[2] Nothing was stopping a troll from skipping the entire text and commenting without reading before. That is unchanged.