Thoughts about URLs for a REST driven website
- I am experimenting with a setup where I base a website completely on a REST API. This means creating the REST API first and then only using that for fetching data to display in the website. This leads to some troubles with URLs for the website - especially how the web URL should identify the REST resources to display.
In a traditional website there is a tight integration with the backend database. This means we can use DB identifiers and readable names in our URLs, which most people agree is good for SEO. Example: to show Peter's blog we use the URL http://www.mysite.com/blogs/peter.
But what happens if the backend is a REST API? Now we cannot just write "peter" in the URL, since this tells us nothing about how to fetch the "peter" resource. We must instead include the whole URL for the "peter" resource. This URL could be http://rest.mysite.com/feeds/peter, which would serve an ATOM feed for our website to format and display nicely. From this it follows that our website URL must include the URL-encoded "peter" reference. Now our URL becomes:
The downside of this is that our web URL becomes SEO unfriendly, unreadable and impossible to remember. The upside is that we can now display *any* ATOM feed on our website, not just our own feeds, which in turn happens to be both good and bad. It's bad because a malicious person can craft a URL that references an attacker's ATOM feed and make it look like a URL to our site. It's good because it gives us much more flexibility.
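To make the trade-off concrete, here is a minimal sketch of embedding the complete feed URL in the website URL, plus one possible guard against foreign feeds. The `?feed=` query parameter, the `WEB_BASE` route and the `TRUSTED_FEED_HOSTS` allow-list are assumptions made for illustration, not something the setup above prescribes.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical website route and trusted API hosts (assumptions, not from the post).
WEB_BASE = "http://www.mysite.com/blogs"
TRUSTED_FEED_HOSTS = {"rest.mysite.com"}


def build_web_url(feed_url: str) -> str:
    """Embed the complete feed URL, percent-encoded, in the website URL."""
    return f"{WEB_BASE}?feed={quote(feed_url, safe='')}"


def extract_feed_url(web_url: str) -> str:
    """Recover the feed URL from the website URL's query string."""
    return parse_qs(urlsplit(web_url).query)["feed"][0]


def is_trusted_feed(feed_url: str) -> bool:
    """Accept only feeds served from hosts we control."""
    return urlsplit(feed_url).hostname in TRUSTED_FEED_HOSTS


web_url = build_web_url("http://rest.mysite.com/feeds/peter")
print(web_url)
# http://www.mysite.com/blogs?feed=http%3A%2F%2Frest.mysite.com%2Ffeeds%2Fpeter
print(is_trusted_feed(extract_feed_url(web_url)))
```

Note that the allow-list gives up the "display any ATOM feed" flexibility in exchange for safety, so it is a design choice rather than a free fix.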
I could of course publish a URL template for my ATOM resources, stating that "peter" can be mapped to http://rest.mysite.com/feeds/peter.
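As a rough sketch of what such a template could look like on the website side: the `{name}` placeholder syntax and both function names are hypothetical, only the two example URLs come from the text above.

```python
from urllib.parse import quote, unquote, urlsplit

# Template shape follows the example above; the {name} placeholder is an assumption.
FEED_TEMPLATE = "http://rest.mysite.com/feeds/{name}"


def feed_url_for(name: str) -> str:
    """Expand the published template for a short blog name."""
    return FEED_TEMPLATE.format(name=quote(name, safe=""))


def feed_url_from_web_url(web_url: str) -> str:
    """Map a readable website URL to the feed URL via its last path segment."""
    name = urlsplit(web_url).path.rstrip("/").rsplit("/", 1)[-1]
    return feed_url_for(unquote(name))


print(feed_url_from_web_url("http://www.mysite.com/blogs/peter"))
# http://rest.mysite.com/feeds/peter
```

This keeps the website URL readable, at the price of the website knowing how the REST API lays out its URLs.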
But the use of URL templates makes link relations in the ATOM feed less useful. A link relation must include the complete URL to the related resource. So an ATOM related link to "older entries" could be http://rest.mysite.com/feeds/peter?page=2. Now we *have* to put this complete reference into our website's URL:
So our website's URL becomes more and more obscure if we really don't want to know anything about the REST API's URL templates.
Have I missed something? Comments?
- Jørn Wildt wrote:
> >> 1) The official "here can you find the specs" kind of REST
> >> "sitemap".
> > This is exactly the opposite of what Roy means by, "A REST API
> > should be entered with no prior knowledge beyond the initial URI."
> Then I am lost again :-(
Let's see if we can't get you back on the path.
> > If, given a URI for some resource in a system, I must consult some
> > other "sitemap"
> > resource before I can request another URI in the system, then the
> > API is being driven by out-of-band knowledge, not hypertext.
> This is not exactly what I am saying. You are _not_ "given a URI for
> some resource in a system". You are given a simple identifier, a
> customer number, an order number, or a blog name. Not the complete
> URL. That "sitemap" tells the client where it can find the search
> forms for those numbers or names. By looking at the sitemap you can
> get a URL to the search form for customers. That search form tells
> you, that by doing a GET on a certain URL (the action) and passing
> the customer number as "&numer=...", you will get a resource
> describing the requested customer.
Given 2010-03-20 as a simple identifier, how is the client instructed
to build a URL with it?
Not REST:
Client has previously loaded some other document into memory (sitemap)
instructing it to make a GET for /date?iso=2010-03-20 when it encounters
an ISO date string. Client "somehow knows" this out-of-band info.
REST:
Retrieved representation links to some other document (sitemap), which
may be cached locally, which contains a link for dereferencing. Client
follows its nose -- i.e. checks another document for <a id='2010-03-20'
REST:
Retrieved representation contains some URL-construction code (perhaps a
form). Client follows its nose -- the values '2010', '03' and '20' are
entered where appropriate.
REST:
Retrieved representation links to some document (not a sitemap) which
contains URL-construction code. Client follows its nose -- in the case
of my demo, the retrieved representations link to an XSLT stylesheet
which (as I posted before) contains the code to convert ISO date-string
instances into URLs for dereferencing and transformation.
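The form-based variant above can be sketched roughly as follows. The HTML markup, the base URL and the `FormFinder` / `url_from_form` names are hypothetical; only the /date?iso=2010-03-20 shape comes from this message. The point is that the client learns the action and field name from the representation it just retrieved, not from anything it knew beforehand.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Hypothetical representation: the form markup is illustrative only.
REPRESENTATION = """
<html><body>
  <p>Entry dated 2010-03-20</p>
  <form method="get" action="/date">
    <input type="text" name="iso">
  </form>
</body></html>
"""


class FormFinder(HTMLParser):
    """Collect the action of the first form and the name of its first input."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.field = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "form" and self.action is None:
            self.action = attributes.get("action")
        elif tag == "input" and self.action is not None and self.field is None:
            self.field = attributes.get("name")


def url_from_form(base_url, html, value):
    """Build the GET URL using only what the representation itself says."""
    finder = FormFinder()
    finder.feed(html)
    if finder.action is None or finder.field is None:
        raise ValueError("representation carries no usable form")
    return urljoin(base_url, finder.action) + "?" + urlencode({finder.field: value})


print(url_from_form("http://rest.mysite.com/entries/42", REPRESENTATION, "2010-03-20"))
# http://rest.mysite.com/date?iso=2010-03-20
```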
The key here is for the client to follow hypertext included in the
representation which returns the "simple identifier", to learn how to
dereference a URL containing the "simple identifier". While a
"sitemap" could be used, that really just adds another round-trip
between client and server.
What makes the Not REST example wrong is that the client is expected to
know how to create the mapping using some knowledge outside of (not
linked from) the dereferenced representation which contains the
"simple identifier".
For example, using a browser's client-side storage to cache a lookup
table, and using script to access name-value pairs from that client-side
storage for all subsequent requests.
While such a solution would work, the problem is that some prior URI
must be dereferenced to create this lookup table. A dereferenced
representation containing a script which references client-side storage
would fail, unless that prior URI had been dereferenced.
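A toy simulation of that failure mode, under stated assumptions: `OutOfBandClient`, `url_from_representation` and the dictionary-shaped representation are all hypothetical stand-ins, and the `lookup` dict plays the role of browser client-side storage.

```python
# Both clients and the template string are hypothetical; only the
# /date?iso=... shape is taken from the example earlier in this message.


class OutOfBandClient:
    """Keeps a lookup table that only the sitemap can populate."""

    def __init__(self):
        self.lookup = {}  # stands in for browser client-side storage

    def load_sitemap(self):
        self.lookup["iso-date"] = "/date?iso={value}"

    def url_for(self, kind, value):
        # Raises KeyError when the sitemap was never dereferenced.
        return self.lookup[kind].format(value=value)


def url_from_representation(representation, value):
    """In-band variant: the representation itself carries the template."""
    return representation["templates"]["iso-date"].format(value=value)


client = OutOfBandClient()
try:
    client.url_for("iso-date", "2010-03-20")
except KeyError:
    print("entered from an arbitrary URI: no lookup table yet")

client.load_sitemap()
print(client.url_for("iso-date", "2010-03-20"))

representation = {"date": "2010-03-20", "templates": {"iso-date": "/date?iso={value}"}}
print(url_from_representation(representation, representation["date"]))
```

The first client only works after the sitemap has been visited, while the second works from whatever representation it happens to be holding.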
When the condition is met, that the client can follow its nose (using
hypertext) to find everything needed to render a representation
dereferenced from some URI, then no prior knowledge is needed beyond
the URI being dereferenced.
OTOH, if the URI being dereferenced cannot be rendered without the
client having prior knowledge of some other URI that the retrieved
representation doesn't link to, that prior knowledge is out-of-band.
To sum up, if your API requires me to first dereference some sort of
sitemap, before dereferencing any other URIs will work, then your API
must always be entered from the sitemap URI, instead of from any URI.