Ever wonder how a sophisticated Web site works? Take Facebook, for example. You can view the source and you can hardly pick out any recognizable HTML, let alone divine how the wizards back at Facebook HQ get the site to work. Now, try viewing the source at a simpler Web site, like ZapThink’s. Sure enough, there’s HTML under the covers, but you still can’t tell from the file the Web server sends to your browser what’s going on behind the scenes (we use WordPress, in case you were wondering).
Put into RESTful terms, there is a separation between the resource (e.g., the program running on the server) and the representation (e.g., the Web page it sends to your browser). In fact, this separation is a fundamental REST constraint, and it is what allows the resource to be opaque.
When people talk about opacity in the REST context, they are usually referring to Uniform Resource Identifiers (URIs). You should be able to construct URIs however you like, the theory goes, and it’s up to the resource to figure out how to respond appropriately. In other words, it’s not up to the client to know how to provide specific instructions to the server, other than by clicking the hyperlinks the resource has previously provided to the client.
But there’s more to the opacity story than opaque URIs. Fundamentally, the client has no way of knowing anything at all about what’s really going on behind the scenes. The resource might be a file, a script, a container, an object, or some complicated combination of these and other kinds of things. There are two important lessons for the techies behind the curtain: first, don’t assume resources come in one flavor, and second, it’s important to understand the full breadth of capabilities and patterns that you can leverage when architecting or building resources. After all, anything you can give a URI to can be a resource.
Exploring the Power of Opacity
Let’s begin our exploration of opacity with HTTP’s POST method. Of the four primary HTTP methods (GET, POST, PUT, and DELETE), POST is the only one that’s not idempotent: in other words, not only does it change the state of the resource, but calling it twice has a different effect than calling it once. In the RESTful context, you should use POST to initialize a resource. According to the HTTP spec, one of the things POST can do is create a subordinate resource, as the figure below illustrates:
In the interaction above, the client POSTs to the cart resource, which initializes a cart instance, names it “abcde,” and returns to the client a hyperlink to that new subordinate resource. In this context, subordinate means that abcde comes after cart and a slash in the URI http://example.com/cart/abcde.
Here’s the essential question: just what do cart and abcde represent on the server? cart looks like a directory and abcde looks like a file, given the pathlike structure of the URI. But we know that guess probably isn’t right, because POSTing to the cart resource actually created the abcde resource, which represents the cart instance. So could abcde be an object instance? Perhaps. The bottom line is you can’t tell, because as far as the client is concerned, it doesn’t matter. What matters is that the client now has one (or more) hyperlinks to its own cart that it can interact with via a uniform interface.
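To make the POST interaction concrete, here is a minimal Python sketch of a cart resource. Everything in it is an assumption for illustration: the class name CartCollection, the in-memory dictionary, and the randomly generated instance names are ours, not part of any real framework. The 201 status and the Location header are the conventional HTTP way to announce a newly created resource.

```python
import secrets


class CartCollection:
    """Hypothetical 'cart' resource. Clients never see how it stores instances."""

    def __init__(self, base_uri="http://example.com/cart"):
        self.base_uri = base_uri
        # An in-memory dict here, but it could just as well be files,
        # database rows, or objects; the client cannot tell the difference.
        self._instances = {}

    def post(self):
        """Create a subordinate cart; return (status, headers, body)."""
        cart_id = secrets.token_hex(4)  # opaque name, like "abcde" in the text
        self._instances[cart_id] = {"items": []}
        uri = f"{self.base_uri}/{cart_id}"
        # The body carries the hyperlink the client should follow next.
        return 201, {"Location": uri}, {"links": {"self": uri}}

    def get(self, uri):
        """Return a representation of one cart instance, or 404."""
        cart_id = uri.rsplit("/", 1)[-1]
        cart = self._instances.get(cart_id)
        if cart is None:
            return 404, {}, None
        return 200, {}, {"items": list(cart["items"]), "links": {"self": uri}}


if __name__ == "__main__":
    carts = CartCollection()
    status, headers, body = carts.post()
    print(status, headers["Location"])
    print(carts.get(headers["Location"]))
```

Calling post() twice yields two different carts, which is exactly why POST is not idempotent, while repeating get() on the same URI returns the same representation. Swapping the dictionary for any other storage mechanism would not change a single byte of what the client sees.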
One way or the other, however, POST changes the state of the abcde cart instance, which requires a relatively onerous level of processing on the server. To lighten the future load on the server, thus improving its scalability, we may want to cache the representation the resource provides. Fortunately, REST explicitly supports cacheability, as the figure below illustrates:
In the pattern above, a gateway intermediary passes along the POST to the server, fetching a static representation it puts in its cache. As long as clients make requests that aren’t intended to change the state of the resource (namely, GETs), then serving up the cached copy is as good as passing along the request to the underlying resource, until the representation expires from the cache.
Opacity plays a critical role in this example as well: the client cannot tell whether a representation came from the cache or directly from the resource, and it has no need to know. As a result, the gateway is entirely transparent to the client, serving in the role of server in interactions with the client but in the role of client in interactions with the underlying server.
The limitation of the example above, of course, is the static nature of the cache. If the client wants to change the state of the resource (via PUT or another POST), then such a request would necessarily expire the cache, requiring the intermediary to pass the request along to the underlying server. In situations where the resource state changes frequently, therefore, caching is of limited value.
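The caching pattern and its limitation can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a real HTTP cache: the Origin and CachingGateway classes, the handle() signature, and the fixed time-to-live are all hypothetical, and real caches follow considerably more detailed freshness rules.

```python
import time


class Origin:
    """Stand-in for the underlying server; counts how often it does real work."""

    def __init__(self):
        self.state = {}
        self.hits = 0

    def handle(self, method, uri, body=None):
        self.hits += 1
        if method in ("PUT", "POST"):
            self.state[uri] = body
        return self.state.get(uri)


class CachingGateway:
    """Hypothetical gateway: acts as server to clients, as client to the origin."""

    def __init__(self, origin, ttl_seconds=60.0, clock=time.monotonic):
        self.origin = origin
        self.ttl = ttl_seconds
        self.clock = clock
        self._cache = {}  # uri -> (expires_at, representation)

    def _store(self, uri, representation):
        self._cache[uri] = (self.clock() + self.ttl, representation)

    def handle(self, method, uri, body=None):
        if method == "GET":
            entry = self._cache.get(uri)
            if entry is not None and entry[0] > self.clock():
                return entry[1]  # fresh cached copy; the origin is not contacted
            representation = self.origin.handle("GET", uri)
            self._store(uri, representation)
            return representation
        # State-changing request: the old copy is no longer valid, so pass the
        # request along and cache whatever representation comes back.
        self._cache.pop(uri, None)
        representation = self.origin.handle(method, uri, body)
        self._store(uri, representation)
        return representation
```

Every PUT or POST reaches the origin, so a resource whose state changes on nearly every request gets almost no benefit from the gateway, while a resource that is mostly read is served almost entirely from the cache.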
Opacity and RESTful Clouds
We can extend the pattern above to provide greater capabilities on the intermediary. In the example below, the intermediary is a full-fledged server in its own right, and the underlying server returns executable server scripts for the intermediary to execute on behalf of the underlying server. In other words, the intermediary caches representations that are themselves server programs (e.g., PHP scripts). Furthermore, these server scripts are prepopulated with any initial state data in response to the original POST from the client.
Increasing the sophistication of our cache would provide little value, however, if we didn’t have a better way of dealing with state information. Fortunately, REST grants our wishes in this case as well, because it enables us to separate resource state (maintained on the underlying server) from application state, which we can transfer to the client.
In the figure above, after the client has initialized the resource, it may wish to, say, update its cart. So, the user clicks a link that executes a PUT that sends the updated information, along with values from one or more hidden form fields, to the intermediary. However, instead of updating resource state, the state information remains in the messages (both requests from the client and representations returned from the intermediary) as long as the client only executes idempotent requests. There is no need to update resource state in this situation, because the scripts on the intermediary know to pass along state information, in hidden form fields, for instance. Only when the cart process is complete and the user is ready to submit an order does the client execute another POST, which the intermediary knows to pass along to the underlying server.
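Here is one way the split between application state and resource state might look, sketched in Python. The names are all assumptions made for illustration: the cart_state hidden field, the item and quantity form fields, and the OriginServer class are hypothetical, and a production version would also need to protect the hidden field against tampering.

```python
import json


class OriginServer:
    """Stand-in for the underlying server, the only place resource state lives."""

    def __init__(self):
        self.orders = []

    def post_order(self, cart):
        self.orders.append(dict(cart))
        return {"order_number": len(self.orders), "items": dict(cart)}


def intermediary(method, form, origin):
    """Stateless script on the intermediary: all state arrives in the request."""
    # Application state travels in a hidden form field, serialized as JSON.
    cart = json.loads(form.get("cart_state", "{}"))
    if method == "PUT":
        # Setting a quantity (rather than adding to it) keeps PUT idempotent.
        cart[form["item"]] = int(form["quantity"])
        # The updated state goes back to the client inside the representation.
        return {"hidden": {"cart_state": json.dumps(cart, sort_keys=True)}}
    if method == "POST":
        # Only now does anything change on the underlying server.
        return origin.post_order(cart)
    raise ValueError(f"unsupported method: {method}")


if __name__ == "__main__":
    origin = OriginServer()
    page = intermediary("PUT", {"item": "book", "quantity": "1"}, origin)
    page = intermediary("PUT", dict(page["hidden"], item="pen", quantity="2"), origin)
    print(page, origin.orders)            # origin.orders is still empty here
    print(intermediary("POST", page["hidden"], origin))
```

Sending the same PUT twice produces the same representation, and the intermediary function keeps nothing between calls. The underlying server hears about the cart only when the order is finally POSTed.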
However, there’s no strict rule that says that the intermediary can only handle idempotent requests; you could easily put a script on it that would handle POSTs, and similarly, it might make sense to send an idempotent request like a DELETE along to the underlying server for execution. But on the other hand, the rule that the intermediary handles only the idempotent requests may be appropriate in your situation, because POST would then be the only method that could ever change state on the underlying server.
As we explained in an earlier ZapFlash, one of the primary benefits of following the pattern in the figure above is to support elasticity when you put the intermediary server in the Cloud. Because it is stateless, it doesn’t matter which virtual machine (VM) instance replies to any client request, and if a VM instance crashes, we can bootstrap its replacement without losing any state information. In other words, opacity is essential to both the elasticity and fault tolerance of the Cloud, and furthermore, following a RESTful approach provides that opacity.
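The fault-tolerance claim is easy to demonstrate with a toy model. In the Python sketch below, the Instance and Pool classes and the round-robin policy are hypothetical stand-ins for VM instances behind a Cloud load balancer, and a crash is simulated simply by flipping a flag.

```python
import itertools
import json


def handle(request):
    """Pure function of the request: no instance-level state is read or written."""
    cart = json.loads(request["cart_state"])
    cart[request["item"]] = request["quantity"]
    return {"cart_state": json.dumps(cart, sort_keys=True)}


class Instance:
    """Hypothetical VM instance running the stateless handler."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def serve(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return handle(request)


class Pool:
    """Round-robin pool that bootstraps a replacement for any crashed instance."""

    def __init__(self, size):
        self._counter = itertools.count(1)
        self.instances = [self._boot() for _ in range(size)]
        self._next = 0

    def _boot(self):
        return Instance(f"vm-{next(self._counter)}")

    def serve(self, request):
        index = self._next % len(self.instances)
        self._next += 1
        if not self.instances[index].alive:
            # Nothing to restore: the replacement needs no state from the old VM.
            self.instances[index] = self._boot()
        return self.instances[index].serve(request)


if __name__ == "__main__":
    pool = Pool(2)
    state = pool.serve({"cart_state": "{}", "item": "book", "quantity": 1})
    pool.instances[1].alive = False  # simulate a crash before the next request
    state = pool.serve(dict(state, item="pen", quantity=2))
    print(state["cart_state"], [vm.name for vm in pool.instances])
```

The second request lands on a freshly booted instance, yet the cart still contains both items, because the state arrived with the request rather than living on the VM that crashed.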
The ZapThink Take
There’s one more RESTful pattern that ZapThink is particularly interested in: RESTful SOA, naturally. For this pattern we need another kind of intermediary: a RESTful SOA intermediary, in addition to the Cloud-based stateless server intermediary, or anything else we want to abstract for that matter. The figure below illustrates the RESTful SOA pattern.
The role of the RESTful SOA intermediary is to provide abstracted (in other words, opaque) RESTful Service endpoints that follow strict URI formatting rules. Furthermore, this intermediary must handle state information appropriately, that is, following a RESTful approach that transfers state information in messages. As a result, the SOA intermediary can support stateless message protocols for interactions with Service consumers while remaining stateless itself. Most ESBs maintain state, and therefore a RESTful SOA intermediary wouldn’t be a typical ESB, although it could certainly route messages to one.
So, which pattern is the best one? As we say in our Licensed ZapThink Architect (LZA) and Cloud Computing for Architects (CCA) courses, it depends. The architect is looking for the right tool for the job. You must understand the problem before recommending the appropriate solution. We cover REST-based SOA in our LZA course (coming to Johannesburg) and RESTful Clouds in the CCA course (coming to London, DC, and San Diego). See you there!
Image credit: Derek Keats