From September 2014

Web API Alphabet Soup

Cooking With Acronyms

Drupal 8 offers unprecedented support for creating RESTful APIs. This was one of the major goals of the Web Services and Context Core Initiative (WSCCI), and one we've delivered on pretty well. However, as with most things that are worth doing, just because Drupal core “supports” it doesn't mean you'll see good results without an understanding of what's going on. In this article, we'll explore some of these principles, so that when it comes time to design with those systems, you'll know how to think about the problem.


Drupal 8 ships with support for encoding and representing its entities (and other objects) via the Hypertext Application Language (HAL) specification. HAL can currently be expressed in JSON or XML, and is a specification for describing resources. As the specification says, HAL is “a bit like HTML for machines.”

What that means is that a HAL API can provide enough data for a machine agent to visit the root ("/") of a website, then navigate its way through the remainder of the system exclusively by following the links provided in responses. Humans do the exact same thing by visiting a page and clicking on links. The notion that machines might also want to do this is a relatively obvious idea, but one that has, until recently, rarely been followed on the web.

Still, though, it's pretty abstract. To really understand why HAL is powerful – and what it does for us in Drupal – it's necessary to go back to the basic constraints and capabilities of the problem space it operates in: HTTP and REST. The crucial documents there are RFC 2616 and Roy Fielding's thesis, both well worth [re-]reading. But a more easily digestible version comes in the form of the Richardson Maturity Model, first laid out by Leonard Richardson in 2008, and since revisited by Martin Fowler and Steve Klabnik.


The Richardson Maturity Model helpfully suggests a set of four “maturity” levels into which HTTP APIs fall:

  • Level 0, The Swamp of POX: Your service is nothing more than a few RPC passthrough endpoints; the use of HTTP is incidental.
  • Level 1, Resources: The passthrough endpoints are broken down into unique URIs that identify resources.
  • Level 2, HTTP Verbs: The interaction with resources is further refined through the correct use of the full set of HTTP’s verbs, rather than simply GET and POST.
  • Level 3, Hypermedia Controls: Media types capture a resource’s representational form(s), and hyperlinks are included for navigation.

These ideas have already been generally explored quite well in the previously-linked articles, and I don't want to just duplicate that work. Instead, I'll aim for a brief, illustrative treatment, so that we can spend more time focusing on the implications for Drupal.

Level 0 – The Swamp of POX

At this level, HTTP is being used essentially as a tunnel for an RPC service. Many calls are made to individual endpoints, with purely application-defined restrictions on what sort of work that endpoint can perform. POX refers to “Plain Old XML,” reflecting that there are essentially no boundaries on what sort of data is either sent into, or returned from, such a system.

In short, it's the Wild West.
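
To make that concrete, here is a minimal sketch of a Level 0 service in Python. Everything in it is hypothetical (the `ARTICLES` data, the `rpc_endpoint` function, and the method names are invented for illustration); the point is only that a single endpoint receives every call, and the real instruction is buried in the request body.

```python
import json

# Hypothetical application data; the names are illustrative only.
ARTICLES = {1: {"title": "Hello"}, 2: {"title": "World"}}


def rpc_endpoint(body: str) -> str:
    """Stand-in for one URL (imagine POST /service) that handles every call.

    HTTP contributes nothing here: which operation runs is decided
    entirely by application-defined fields inside the payload.
    """
    call = json.loads(body)
    method = call.get("method")
    params = call.get("params", {})
    if method == "getArticle":
        result = ARTICLES[params["id"]]
    elif method == "deleteArticle":
        result = ARTICLES.pop(params["id"])
    else:
        result = {"error": "unknown method"}
    return json.dumps(result)


print(rpc_endpoint('{"method": "getArticle", "params": {"id": 1}}'))
```

Nothing about the URL or the HTTP method tells an intermediary, a cache, or a generic client what just happened.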

Level 1 – Resources

The most glaring issue with our generic service endpoint is that it lumps together a bunch of different, potentially unrelated things under a single endpoint. In non-HTTP APIs, this may be less of an issue, but HTTP and the web are grounded in the idea of Uniform Resource Identifiers (URIs), which identify individual resources. Thus, the requirement for the first level is representing our system as a set of resources, and assigning a unique URI to each of them.

Of course, that raises the question, what is a “resource”?

Well, there's some wiggle room. As we'll see later, Drupal 8’s HAL output is primarily built around representing individual entities as resources, and that correlation generally works well. But what about lists of entities – what about a View?

That’s where things get interesting. I just described resources as “individual things,” which would seem to suggest that a View can't be a resource. Not so. Lists are still resources, but their relevant properties (not including the list items themselves) are things like, say, sort order. If sort order seems trivial or incidental, consider its role for a site like Reddit, where sort order is how the social apparatus is expressed.
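
As a rough sketch of that idea, the Python below gives every individual item its own URI and treats each sorted list as a resource in its own right. The URI layout (`/articles/1`, `/articles/top`, `/articles/new`) and the data are assumptions made up for this example, not anything Drupal defines.

```python
# Hypothetical data and URI layout, for illustration only.
ARTICLES = {
    1: {"title": "Hello", "score": 3, "created": 100},
    2: {"title": "World", "score": 7, "created": 50},
}

# Each sort order names a distinct list resource.
SORT_KEYS = {"top": "score", "new": "created"}


def resolve(uri):
    """Return the resource identified by a URI, or None if nothing lives there."""
    parts = [p for p in uri.split("/") if p]
    if len(parts) != 2 or parts[0] != "articles":
        return None
    if parts[1] in SORT_KEYS:
        # A list resource: its defining property is the ordering,
        # and its members are referenced by their own URIs.
        key = SORT_KEYS[parts[1]]
        ranked = sorted(ARTICLES, key=lambda i: ARTICLES[i][key], reverse=True)
        return ["/articles/{}".format(i) for i in ranked]
    if parts[1].isdigit():
        # An individual resource.
        return ARTICLES.get(int(parts[1]))
    return None


print(resolve("/articles/top"))
print(resolve("/articles/1"))
```

The two list URIs contain exactly the same items, yet they identify different resources because the ordering differs.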

All of this is pretty old hat; most systems – Drupal included – have been using URIs for a long time. Level 2 may be slightly less familiar.

Level 2 – HTTP Verbs

Having URIs for all our resources is great, but there's an immediate problem: how do I interact with those resources in different ways, for example to perform Create/Read/Update/Delete (CRUD) operations? This is the domain of HTTP methods, often called verbs.

Much as resources were a refinement of POX, verbs refine resources. Instead of having to create separate URIs for a resource in order to perform different actions on it, we access it using the same URL, but with a different HTTP verb to describe our intent. To get a representation of the resource, we use GET; to delete it, we use DELETE, and to update or create it, we use PATCH, PUT, or POST. (The differences between those last three are important, but outside the scope of this article.)

Verbs are an effective way to encode more information onto a single request, allowing us to better honor the original guideline: resources should have a single URI. There is ongoing debate as to the efficacy of using all the verbs “correctly,” but it's still worth understanding the intent.
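
Here is a small Python sketch of one URI answering to several verbs. The `store` dictionary and the `handle` function are hypothetical stand-ins for a real server, and the PUT and PATCH behavior shown (replace versus merge) is a simplification of the differences mentioned above.

```python
from http import HTTPStatus

# Hypothetical in-memory store keyed by URI.
store = {"/articles/1": {"title": "Hello"}}


def handle(method, uri, body=None):
    """Dispatch on the HTTP verb; the URI stays the same for every operation."""
    if method == "GET":
        if uri in store:
            return HTTPStatus.OK, store[uri]
        return HTTPStatus.NOT_FOUND, None
    if method == "PUT":
        # Simplified: PUT replaces the whole resource, creating it if absent.
        created = uri not in store
        store[uri] = body
        return (HTTPStatus.CREATED if created else HTTPStatus.OK), body
    if method == "PATCH":
        # Simplified: PATCH merges the supplied fields into the resource.
        if uri not in store:
            return HTTPStatus.NOT_FOUND, None
        store[uri] = {**store[uri], **body}
        return HTTPStatus.OK, store[uri]
    if method == "DELETE":
        if store.pop(uri, None) is None:
            return HTTPStatus.NOT_FOUND, None
        return HTTPStatus.NO_CONTENT, None
    return HTTPStatus.METHOD_NOT_ALLOWED, None


print(handle("GET", "/articles/1"))
```

Compare this with a design that needs `/articles/1/edit` and `/articles/1/delete` just to say what the verb already says.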

Level 3 – Hypermedia Controls

The fourth level encompasses two ideas: different representations of resources via media types, and linking between resources. The web generally adheres decently enough to the first two levels, but it's pretty awful on this one.

Media types are another refinement on the notion of a resource. They allow the client to request – and/or the server to respond with – varying representations of a resource. A simple example is HTML vs. JSON. The client can request a particular content type using the Accept HTTP request header, and the server indicates the type of its response with the Content-Type HTTP response header. HTML is text/html and JSON is application/json.
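
A deliberately simplistic Python sketch of that exchange follows. The `respond` function and the `RENDERERS` table are hypothetical, and the negotiation ignores q-values and wildcards, which a real server would have to honor.

```python
import json
from html import escape

# One resource, two possible representations.
RESOURCE = {"title": "It's Jamal's Birthday!", "body": "Happy birthday, Jamal!"}


def to_html(resource):
    return "<title>{}</title>\n<p>{}</p>".format(
        escape(resource["title"]), escape(resource["body"])
    )


# Hypothetical table mapping media types to renderers.
RENDERERS = {"text/html": to_html, "application/json": json.dumps}


def respond(accept):
    """Pick the first media type in the Accept header that we can produce.

    Simplification: q-values and wildcards such as */* are ignored.
    """
    for item in accept.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in RENDERERS:
            body = RENDERERS[media_type](RESOURCE)
            return 200, {"Content-Type": media_type}, body
    # 406 Not Acceptable: none of the requested types is available.
    return 406, {}, ""


status, headers, body = respond("application/json")
print(status, headers["Content-Type"])
```

The URI never changes; only the representation sent back does.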

But JSON is really just a raw data serialization format, not a document type like HTML. It can't be turned into something meaningful without additional context that specifies how to interpret it. Not all media types are created equal. An excellent example of how this issue crops up is the other major component of Level 3 – hyperlinks.

Links are easy to understand; we've been using them forever in HTML. They're a pointer from the current resource to another resource. Without them, you'd have to input URLs by hand into the address bar, copying them from a separate document. Ridiculous! And yet, that's exactly what we expect API implementers to do: interact with our REST APIs by copying in URIs from a specifications document.

But if the server instructs REST clients on how to construct URIs for the system by providing links to other resources in its responses, then, as with human users and HTML, clients can navigate their own way through. So long as all resources are linked together somehow, clients will be able to reach them all.

In fact, actually providing the links is a bit of a trick. The referent itself is easy – it's just a URI. The question is, how does the REST client know to interpret those URIs as links?

For example, here's text/html, which knows how to handle links just fine:

  <title>It's Jamal's Birthday!</title>
    <p>Happy birthday, <a href="">Jamal!</a>

And here's a representation of roughly the same resource in application/json:

    {
        "title": "It's Jamal's Birthday!",
        "body": "Happy birthday, Jamal!",
        "????": ""
    }

What named key should the client look for to know that its values are hyperlinks?

Sure, you can make one up for your own purposes, but then it's application-specific. And besides, most of us are in the business of creating applications, not defining what a link is. Which brings us back to HAL.

HAL is a generic spec for hypermedia applications. It provides an answer for this link-property-naming question (and many others). If your HTTP API produces compliant HAL data and reports it with the media type application/hal+json (or application/hal+xml), then it's immediately navigable by any generic HAL client. In fact, hal-browser provides a web UI for browsing HAL trees. (You can experiment with one online.)
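
HAL's answer to the naming question is the reserved `_links` property, whose keys are link relations and whose values are link objects carrying an `href`. The Python sketch below shows a generic client pulling links out of such a document; the URIs and the `person` relation are hypothetical, chosen only to echo the birthday example.

```python
import json

# A HAL-style document. The URIs and the "person" relation are made up.
hal_document = json.loads("""
{
  "_links": {
    "self": {"href": "/birthdays/jamal"},
    "person": {"href": "/people/jamal"}
  },
  "title": "It's Jamal's Birthday!",
  "body": "Happy birthday, Jamal!"
}
""")


def links(doc):
    """Return {relation: [href, ...]} for every link in a HAL document."""
    found = {}
    for rel, value in doc.get("_links", {}).items():
        # A relation may hold a single link object or an array of them.
        objects = value if isinstance(value, list) else [value]
        found[rel] = [obj["href"] for obj in objects]
    return found


print(links(hal_document))
```

Because the client only relies on `_links` and `href`, the same function works against any HAL API, not just this one.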

This also makes HAL APIs largely self-describing: there's less need for API docs when clients can just walk the API. HAL even defines a standard way to link to a resource's documentation!


So, we've reached Level 3 – now what? HATEOAS, that's what. HATEOAS stands for Hypermedia As The Engine Of Application State. Pronounce it like Cheerios on a bad day.

Essentially, HATEOAS is the functional result of achieving Level 3 maturity. If you have good, clean resources at unique URIs that fully represent the entire state of your system, and they can be manipulated through a sufficient suite of HTTP methods, and are variously representable through appropriate media types, then it's possible to drive the application (i.e., its state transitions) entirely through hypertext interactions. Which is to say, every state transition – roughly speaking, CRUD operation – can be driven through a series of steps that looks something like this:

  1. Start from either the root or a known URI.
  2. Walk through the links provided in responses to GET requests.
  3. Once you arrive at the entity you'd like to manipulate, follow a link – possibly a self-link – to the appropriate URI and send a request using the appropriate method (PUT, PATCH, DELETE, or POST) for your desired state change.

All of this should be familiar; it's how we use a browser.
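
The three steps can be sketched in Python against a pretend site. The `SITE` dictionary, the `request` function, and all of the URIs are hypothetical stand-ins for real HTTP round trips; the only thing the client knows in advance is the root URI and the names of the link relations it cares about.

```python
# Hypothetical in-memory "site": URI -> HAL-style document.
SITE = {
    "/": {"_links": {"self": {"href": "/"}, "articles": {"href": "/articles"}}},
    "/articles": {
        "_links": {
            "self": {"href": "/articles"},
            "item": [{"href": "/articles/1"}, {"href": "/articles/2"}],
        }
    },
    "/articles/1": {"_links": {"self": {"href": "/articles/1"}}, "title": "Hello"},
    "/articles/2": {"_links": {"self": {"href": "/articles/2"}}, "title": "World"},
}


def request(method, uri):
    """Stand-in for an HTTP round trip against the pretend site."""
    if method == "GET":
        return SITE[uri]
    if method == "DELETE":
        return SITE.pop(uri)
    raise ValueError("unsupported method: " + method)


def follow(doc, rel, index=0):
    """Return the href for a link relation, never a hand-built URI."""
    value = doc["_links"][rel]
    link = value[index] if isinstance(value, list) else value
    return link["href"]


# 1. Start from the root.
root = request("GET", "/")
# 2. Walk the links provided in responses to GET requests.
listing = request("GET", follow(root, "articles"))
article = request("GET", follow(listing, "item", index=1))
# 3. Follow the self-link and send the method for the desired state change.
request("DELETE", follow(article, "self"))

print(sorted(SITE))
```

If the server later moves its articles to a different path, this client keeps working as long as the link relations stay the same.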

Achieving HATEOAS for your entire application means you're pretty much 100% RESTful. And that’s a lovely goal...right?

Well, probably.

REST and the resource model are not necessarily appropriate for all types of applications. And, after all, one could design an API that's probably just as capable using a POXy, Level 0 approach. Certainly, plenty of other RPC endpoint-style protocols work. So, why care at all?

For me, it's an issue of design. REST was designed in a way that fits very nicely into the constraints of HTTP. It scales better into complexity – something that Drupal projects often desperately need. If your application is appropriate for REST, then not aspiring to HATEOAS for your HTTP API is sort of like sending Morse code over video chat by toggling your camera on and off: you're not using the medium as intended, and your utility will suffer for it.

HAL, HATEOAS, and Drupal 8

Drupal 8 ships with three modules that, together, seek to allow Drupal to act as a hypermedia API via HAL:

  1. serialization: The serialization module is built atop Symfony's serialization component. It provides interfaces for (surprise!) serializing different types of data into strings, and deserializing them back into data.
  2. rest: The REST module provides a framework for expressing Drupal's data as web resources.
  3. hal: Building on the previous two, the HAL module facilitates HAL encoding of Drupal's data types to create application/hal+json output for HTTP responses.

These modules are not enabled by default, but if you turn them on, Drupal will start serving HAL for entities. By default, these are only available to authenticated users. You can either enable the basic authentication module to allow clients to authenticate as part of their request, or simply open up the permissions. Note that currently, there is only support for relaying content entities (as opposed to, say, config, menu, or user entities) via REST.
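
From the client's side, such a request might be assembled roughly as in the Python sketch below. The host, the `/node/1` path, and the credentials are placeholders, and the sketch assumes the site negotiates on the Accept header and has basic authentication enabled; check both against your own Drupal 8 setup. The request is only built here, not sent.

```python
import base64
import urllib.request


def build_hal_request(url, username, password):
    """Build (but do not send) a GET request asking for a HAL+JSON representation."""
    # HTTP Basic authentication: base64 of "username:password".
    token = base64.b64encode(
        "{}:{}".format(username, password).encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        url,
        headers={
            "Accept": "application/hal+json",
            "Authorization": "Basic " + token,
        },
        method="GET",
    )


# Placeholder URL and credentials, assumed for illustration.
req = build_hal_request("http://example.com/node/1", "editor", "secret")
print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual round trip against a live site.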

Now the basic pieces are in place, but there's still work to be done. For example, Views can generate a HAL representation of their output, but it's not automatic in quite the way entities are; you have to configure a separate display. This starts to get into murky territory with respect to REST's ideal expectations, as a separate display likely entails a separate resource (though not necessarily). Media types are only supposed to be a way of providing a different representation of the same resource, not a maybe-the-same-but-maybe-not resource.

A similar but more intractable problem exists with entities: if you enable field-level access controls, it becomes impossible to guarantee the same resource at a given URI, as it is now user-specific. To remain compliant with the ideals of REST, your response would have to encode per-user information into the media type, as in:

    Content-Type: application/hal+json; user=<username>

This is a classic Drupal challenge: it doesn't prevent you from making standards-noncompliant and possibly poor decisions. That’s why an understanding of the underlying principles at work is so important; without it, you won't even know the tradeoffs you're facing, and it's easy to find yourself off the path, wandering the woods of pseudo-maintainability.

In a way, though, this is “by design.” Roy Fielding believes that:

“A REST API should spend almost all of its descriptive effort in defining the media type(s) used for representing resources and driving application state, or in defining extended relation names and/or hypertext-enabled mark-up for existing standard media types.”

While HAL relieves us of much of the basic drudgery around specifying a media type, it cannot solve the whole problem. Determining what links should exist between resources, and the types of those links, is a crucial design choice that remains in the hands of the implementer – as it should. That is the design half of the challenge. The other half will be learning what possibilities Drupal 8 allows that you should not use in order to remain as standards-friendly as possible.