General musings about RESTful design that I'd like to write down. Feel free to comment.
For the love of ponies, stop trying to change HTTP to suit your world view.
The uniform request semantics of HTTP are generic for a reason -- it's hard to be uniform if you're not generic. If you want to describe or interpret something more specific, you'll need a media type to help you out.
The biggest problem with applying the Web Architecture to enterprise systems is that we have no broad community (to my knowledge) that's experimenting with media types to help solve problems that are common inside enterprises. That, I think, is a cultural and psychological limitation of both the REST community and enterprise developers -- and not due to the Web Architecture being inappropriate for enterprise systems-of-systems interaction.
The main problem when you jump into the enterprise with the Web Architecture is the semantic interoperability problem. This is the same problem other architectures have, but with REST's benefits of easier interoperability, it becomes more important. We might be able to specify media types for infrastructure-level topics, like identity, authentication, syndication, or content publishing, but we have much less success with, say, "Job Applications" or "Order Management", because those domains change faster than a media type should, and the semantics themselves are subject to disagreement across trust boundaries.
The takeaway is not "and thus REST is useless in the enterprise". The point is that we have a lot of work to do to build media types and tools that bridge the gap between the communication actions that are uniform and the business actions that are non-uniform. See below for more on this.
REST is not CRUD.
PUT means "replace" but doesn't imply a byte-for-byte guarantee that what you send will be what you GET later. It means "here is my desired state for this URI, please replace it". It cannot be used to replace POST, which means "process this please, impacting whatever resources you think make sense".
Furthermore, DELETE does not guarantee you will get a 404 Not Found or 410 Gone at the URI afterwards. There may still be a representation at that URI later!
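To make the PUT, POST, and DELETE points concrete, here is a minimal sketch of a toy in-memory origin server. Everything in it (the ToyOrigin class, the URIs, the field names, the normalization rule, the tombstone) is invented for illustration; it is one way a server could legitimately behave, not a claim about how any real server does.

```python
import json


class ToyOrigin:
    """In-memory stand-in for an origin server (hypothetical, illustration only)."""

    def __init__(self):
        self.resources = {"/inventory/widget": {"on_hand": 10}}
        self.order_count = 0

    def get(self, uri):
        if uri not in self.resources:
            return 404, None
        return 200, json.dumps(self.resources[uri], sort_keys=True)

    def put(self, uri, body):
        # PUT: "here is my desired state for this URI, please replace it".
        # The server is free to normalize what it stores, so a later GET
        # is not guaranteed to return the bytes that were sent.
        state = json.loads(body)
        state["name"] = state["name"].strip().title()
        self.resources[uri] = state
        return 204

    def post(self, uri, body):
        # POST: "process this please"; the server decides which resources change.
        request = json.loads(body)
        self.resources[f"/inventory/{request['sku']}"]["on_hand"] -= request["qty"]
        self.order_count += 1
        order_uri = f"{uri}/{self.order_count}"
        self.resources[order_uri] = {"sku": request["sku"], "qty": request["qty"]}
        return 201, order_uri

    def delete(self, uri):
        # DELETE: the server honours the intent but keeps a tombstone, so the
        # URI still has a representation afterwards.
        self.resources[uri] = {"status": "withdrawn"}
        return 204


origin = ToyOrigin()
sent = '{"name": "  ada lovelace "}'
origin.put("/people/1", sent)
status, body = origin.get("/people/1")
print(body == sent)  # False: the stored state was normalized
```

Note that a single POST to the hypothetical /orders URI touches two resources (a new order and the inventory count), which is exactly the kind of side effect the method name alone cannot tell a client about.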
HTTP is an abstract interface about describing agent intent. That's it. All of the intended specific side effects and preconditions need to be described in hypermedia. You can't infer too much from the methods.
State transitions are a useful way to describe the lifecycle of both resources and representations; but note that they're not necessarily the same thing.
A little-known fact about UML 2 statecharts: there are two types. The "protocol statechart" describes the lifecycle from the perspective of interaction, but says nothing about internals. An "object statechart", on the other hand, describes the internal states of an object.
A hypermedia representation type describes the former -- it's how an agent knows which hyperlinks are supplements to the current representation, which links are a sensing action to a separate resource, and which links are a state-transitioning action. Perhaps these categories are wrong-headed; maybe there are other sorts of links. But those three came to mind based on what I know of HTML inline links vs. anchors vs. forms. If we're to build more RESTful agents, we need to work on coming up with a general sense of how different sorts of links impact the knowledge of the agent.
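As a rough illustration of those three categories, here is a sketch that sorts the links in an HTML page the way an agent might. The mapping (img as supplement, anchor as sensing action, POST form as state transition) is my own assumption for the example, and the category names and LinkClassifier class are made up; only html.parser is real.

```python
from html.parser import HTMLParser

# Hypothetical category names for the three kinds of link discussed above.
SUPPLEMENT, SENSE, TRANSITION = "supplement", "sense", "transition"


class LinkClassifier(HTMLParser):
    """Collects (category, method, target) for each link-bearing tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and "src" in attrs:
            # Inline link: fetched to complete the current representation.
            self.links.append((SUPPLEMENT, "GET", attrs["src"]))
        elif tag == "a" and "href" in attrs:
            # Anchor: a safe look at a separate resource.
            self.links.append((SENSE, "GET", attrs["href"]))
        elif tag == "form" and "action" in attrs:
            # Form: only a POST form is treated as a state transition here.
            method = (attrs.get("method") or "get").upper()
            kind = TRANSITION if method == "POST" else SENSE
            self.links.append((kind, method, attrs["action"]))


page = (
    '<img src="/logo.png">'
    '<a href="/orders/7">order 7</a>'
    '<form action="/orders/7/cancel" method="post"></form>'
)
classifier = LinkClassifier()
classifier.feed(page)
for link in classifier.links:
    print(link)
```

The agent only knows how to do this because the HTML media type defines what img, a, and form mean; nothing in the HTTP methods themselves carries that knowledge.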
BTW, only the origin server knows the internal object lifecycle, and how it corresponds to the protocol state transitions. They may be very dissimilar in practice. That's the whole point of abstraction.
Requesting domain-specific action on a resource requires the use of POST.
State transitions in the Web are modeled with PUT if you modify the representation of the state directly, or POST if you're specifying a more domain-specific (aka "business") action. Anyone who says POST is anathema to the Web is making shit up. POST is essential to the Web Architecture. If you want to remove it in your world, fine, but it's not the Web, and won't help the Web.
The problem with POST arises when we abuse it to perform things that are better expressed by one of the other methods, GET being the obvious one, since it needs no hypermedia description. For the other methods, a good design guideline is that you MUST NOT break the general contract of the HTTP method you choose -- but you SHOULD describe the specific intent of that method in hypermedia.
For example, POST in HTML means "process this form". POST in AtomPub means "create this entry". They're very different intents. The meaning of these actions isn't inherent in POST; it is inherent in the media types (specifically the hyperlinks in an HTML form tag or an AtomPub collection tag). This is no different if you wanted to design an "order" action for an Order Processing media type.
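Here is a small sketch of what that could look like for an "order" action. The media type name, the link relation, the JSON layout, and the find_action helper are all hypothetical; the point is only that the client learns what POST means from the link in the representation, not from the method.

```python
import json

# Hypothetical media type and cart representation, invented for this sketch.
MEDIA_TYPE = "application/vnd.example.cart+json"
CART_DOC = json.dumps({
    "items": [{"sku": "widget", "qty": 3}],
    "links": [
        {"rel": "self", "method": "GET", "href": "/carts/7"},
        # The media type would define "order" as: submit this cart for
        # processing. POST alone says none of that.
        {"rel": "order", "method": "POST", "href": "/carts/7/order"},
    ],
})


def find_action(document, rel):
    """Return (method, href) for the link with the given relation, or None."""
    for link in json.loads(document)["links"]:
        if link["rel"] == rel:
            return link["method"], link["href"]
    return None


print(find_action(CART_DOC, "order"))
```

A client written this way hardcodes the relation name it understands, and picks up the method and URI at runtime from whatever the server hands it.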
I note that the RESTful Web Services book doesn't like POST, but they did invent their own REST-influenced architecture, ROA, which you can feel free to agree or disagree with. Much of their advice is well intentioned and right, but this one case is, in my opinion, very wrong.
Encoding a business domain in a media type
If you're using a media type to describe domain-specific data, then you have two choices:
a) use a domain-specific media type, and deal with the versioning issues,
b) use a more generic media type that describes actions and their consequences.
(a) is not aligned with the goals of REST because media types aren't meant to change very often. Contracts that change are supposed to be described (and discovered) through hypermedia. Unfortunately, we don't have a media type that does this. (On the other hand, most other approaches don't describe contracts all that well either.)
But wait -- one may say, "what about RDF?" Indeed, RDF (and RDFa) does cover a good chunk of this problem. But besides issues of RDF usability, we don't really have a well-understood way of using RDF to describe *contracts* of action: successor-state axioms (i.e. state transition postconditions). It probably could do this; we just haven't really used it that way, to my knowledge. If Linked Data is to be a success, we need RDF (or some other description framework) to help describe what a POST or a PUT means for a particular hyperlink. I have my own ideas on how to do it, but I doubt I'm the only one who has thought about the problem; I just haven't seen too many solutions yet.
On using colloquial XML to describe business domains
One could use colloquial XML (i.e. XML elements that correspond to domain entities and relationships) as your media type and build forward compatibility into the squishy areas; this isn't much different from how people do things in the enterprise today with XML messaging. Just make sure you get your change management straight. Plan for forward compatibility in the XML schema, backward compatibility in the origin server, and versioning either in the URI space or the schema, depending on how incompatible the version is. Contrary to rumor, RESTful versioning really isn't all that different from versioning with plain XML messaging, and it's arguably much smoother for forward compatibility than the alternatives.
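On the client side, forward compatibility mostly comes down to ignoring what you don't understand. A minimal sketch, assuming a made-up order document with made-up element names (the read_order helper is mine too):

```python
import xml.etree.ElementTree as ET

# Element names this (hypothetical) client was written against.
UNDERSTOOD = {"id", "status"}


def read_order(xml_text):
    """Read only the elements this client understands; skip everything else."""
    root = ET.fromstring(xml_text)
    return {
        child.tag: (child.text or "").strip()
        for child in root
        if child.tag in UNDERSTOOD
    }


# The same order as an older and a newer server might serialize it.
V1 = "<order><id>7</id><status>open</status></order>"
V2 = "<order><id>7</id><status>open</status><giftWrap>true</giftWrap></order>"

print(read_order(V1) == read_order(V2))  # True: the new element is ignored
```

This only covers additive changes. Renaming or removing status would break this reader, which is the point where a new version in the URI space or the schema becomes the safer choice.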
When using colloquial XML in this way, you're not really getting all of the benefits of REST, since, by definition, your media type is way more specific than it should be. The main benefit of building such interactions on the Web architecture is the pervasive use of URIs instead of just embedding primary keys into your documents, a practice that's very common and a big problem with data management in SOA. URIs give the benefit of at least some well-understood semantics (e.g. GET and response codes) without requiring a whole lot of media type description.
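A tiny sketch of the difference, with invented hosts, paths, and field names: a document that embeds a primary key forces the client to know out of band how to turn 42 into something fetchable, while a document that embeds a URI only needs standard reference resolution.

```python
from urllib.parse import urljoin

# Hypothetical URI the order representation was fetched from.
ORDER_URI = "https://orders.example.com/orders/7"

# Primary-key style: the client must already know where customers live.
order_with_key = {"customer_id": 42}

# URI style: the document itself says where to look.
order_with_uri = {"customer": "/customers/42"}


def customer_uri(order_uri, order):
    """Resolve the embedded customer reference against the order's own URI."""
    return urljoin(order_uri, order["customer"])


print(customer_uri(ORDER_URI, order_with_uri))
# https://orders.example.com/customers/42
```

If the customer data later moves to another system, the server can start emitting an absolute URI on a different host and this client keeps working unchanged.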
That's enough for now. I have more to come...
