
Thoughts about URLs for a REST driven website

  • Jorn Wildt
    Message 1 of 41 , Mar 3 11:55 PM
      I am experimenting with a setup where I base a website completely on a REST API. This means creating the REST API first and then only using that for fetching data to display in the website. This leads to some troubles with URLs for the website - especially how the web URL should identify the REST resources to display.

      In a traditional website there is tight integration with the backend database. This means we can use DB identifiers and readable names in our URLs, which most people agree is good for SEO. Example: to show Peter's blog we use the URL http://www.mysite.com/blogs/peter.

      But what happens if the backend is a REST API? Now we cannot just write "peter" in the URL, since this tells us nothing about how to fetch the "peter" resource. We must instead include the whole URL for the "peter" resource. This URL could be http://rest.mysite.com/feeds/peter, which would serve an ATOM feed for our website to format and display nicely. It follows that our website URL must include the URL-encoded "peter" reference. Now our URL becomes:

      http://www.mysite.com/blogs/http%3a%2f%2frest.mysite.com%2ffeeds%2fpeter
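To make the encoding step concrete, here is a minimal sketch in Python. The helper names are hypothetical and only the two example URLs come from the post. Note that Python emits uppercase hex digits (%3A) where the post uses lowercase (%3a); both decode to the same character.

```python
from urllib.parse import quote, unquote

# Website prefix taken from the example in the post.
SITE_BLOG_BASE = "http://www.mysite.com/blogs/"


def site_url_for_feed(feed_url: str) -> str:
    """Embed a complete feed URL in the website URL (hypothetical helper)."""
    # safe="" makes quote() escape ":" , "/" , "?" and "=" as well.
    return SITE_BLOG_BASE + quote(feed_url, safe="")


def feed_url_from_site_url(site_url: str) -> str:
    """Recover the feed URL the website should fetch (hypothetical helper)."""
    if not site_url.startswith(SITE_BLOG_BASE):
        raise ValueError("not a blog URL on this site")
    return unquote(site_url[len(SITE_BLOG_BASE):])


if __name__ == "__main__":
    print(site_url_for_feed("http://rest.mysite.com/feeds/peter"))
    print(site_url_for_feed("http://rest.mysite.com/feeds/peter?page=2"))
```

The second call shows that a paged link such as "?page=2" survives the round trip only because "?" and "=" are escaped too.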

      The downside of this is that our web URL becomes SEO-unfriendly, unreadable and impossible to remember. The upside is that we can now display *any* ATOM feed on our website, not just our own feeds, which in turn happens to be both good and bad. It's bad because a malicious person can craft a URL that references a hacker's ATOM feed and make it look like a URL to our site. It's good because it gives us much more flexibility.
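One way to limit the risk described above is to check the decoded feed URL against a list of trusted hosts before fetching it. This is only a sketch: the allowlist, its single entry, and the function name are assumptions, and applying it gives up the "display any feed" flexibility.

```python
from urllib.parse import urlsplit

# Assumption: only the site's own REST host is trusted.
TRUSTED_FEED_HOSTS = {"rest.mysite.com"}


def is_trusted_feed(feed_url: str) -> bool:
    """Return True only for http(s) URLs whose host is on the allowlist."""
    parts = urlsplit(feed_url)
    # .hostname ignores any "user@" prefix and the port, so
    # "http://rest.mysite.com@evil.example/" is judged by "evil.example".
    return parts.scheme in ("http", "https") and parts.hostname in TRUSTED_FEED_HOSTS


if __name__ == "__main__":
    print(is_trusted_feed("http://rest.mysite.com/feeds/peter"))
    print(is_trusted_feed("http://evil.example/feeds/peter"))
```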

      I could of course publish a URL template for my ATOM resources, stating that "peter" can be mapped to http://rest.mysite.com/feeds/peter.
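As a rough illustration of such a template, the mapping can be a plain string substitution. The template string mirrors the example URL in the post; the "{name}" placeholder syntax and the function name are assumptions, not a published template format.

```python
from urllib.parse import quote

# Assumption: the published template uses a single "{name}" placeholder.
FEED_TEMPLATE = "http://rest.mysite.com/feeds/{name}"


def feed_url_for(name: str) -> str:
    """Expand the template for a readable blog name such as "peter"."""
    # Escape the name so it cannot inject extra path segments or a query.
    return FEED_TEMPLATE.format(name=quote(name, safe=""))


if __name__ == "__main__":
    print(feed_url_for("peter"))
```

With this in place the website URL can stay at http://www.mysite.com/blogs/peter, at the cost of the website knowing the template.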

      But the use of URL templates makes link relations in the ATOM feed less useful. A link relation must include the complete URL to the related resource. So an ATOM related link to "older entries" could be http://rest.mysite.com/feeds/peter?page=2. Now we *have* to put this complete reference into our website's URL:

      http://www.mysite.com/blogs/peter/http%3a%2f%2frest.mysite.com%2ffeeds%2fpeter%3fpage%3d2

      So our website's URL becomes more and more obscure if we really don't want to know anything about the REST API's URL templates.

      Have I missed something? Comments?

      Thanks.
    • Eric J. Bowman
      Message 41 of 41 , Mar 20 4:41 PM
        Jørn Wildt wrote:
        >
        > >> 1) The official "here can you find the specs" kind of REST
        > >> "sitemap".
        > >
        > > This is exactly the opposite of what Roy means by, "A REST API
        > > should be entered with no prior knowledge beyond the initial URI."
        >
        > Then I am lost again :-(
        >

        Let's see if we can't get you back on the path.

        >
        > > If, given a URI for some resource in a system, I must consult some
        > > other "sitemap"
        > > resource before I can request another URI in the system, then the
        > > API is being driven by out-of-band knowledge, not hypertext.
        >
        > This is not exactly what I am saying. You are _not_ "given a URI for
        > some resource in a system". You are given a simple identifier, a
        > customer number, an order number, or a blog name. Not the complete
        > URL. That "sitemap" tells the client where it can find the search
        > forms for those numbers or names. By looking at the sitemap you can
        > get a URL to the search form for customers. That search form tells
        > you, that by doing a GET on a certain URL (the action) and passing
        > the customer number as "&number=...", you will get a resource
        > describing the requested customer.
        >

        Given 2010-03-20 as a simple identifier, how is the client instructed
        to build a URL with it?

        Not REST:
        Client has previously loaded some other document into memory (sitemap)
        instructing it to make a GET for /date?iso=2010-03-20 when it encounters
        an ISO date string. Client "somehow knows" this out-of-band info.

        REST:
        Retrieved representation links to some other document (sitemap), which
        may be cached locally, which contains a link for dereferencing. Client
        follows its nose -- i.e. checks another document for <a id='2010-03-20'
        href='/date?iso=2010-03-20'>.

        Also REST:
        Retrieved representation contains some URL-construction code (perhaps a
        form). Client follows its nose -- the values '2010', '03' and '20' are
        entered where appropriate.

        Also REST:
        Retrieved representation links to some document (not a sitemap) which
        contains URL-construction code. Client follows its nose -- in the case
        of my demo, the retrieved representations link to an XSLT stylesheet
        which (as I posted before) contains the code to convert ISO date-string
        instances into URLs for dereferencing and transformation.
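The first REST option above (following a link in a linked document) could look roughly like this on the client side. It is a sketch under assumptions: the document is HTML, the anchor markup is the one quoted above, and the class name, function name, and example.org base URL are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkById(HTMLParser):
    """Find the href of the first <a> whose id equals the identifier."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.href = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and self.href is None and attributes.get("id") == self.wanted_id:
            self.href = attributes.get("href")


def resolve_identifier(document_html, base_url, identifier):
    """Return the absolute URL the linked document gives for the identifier."""
    parser = LinkById(identifier)
    parser.feed(document_html)
    parser.close()
    # The client never builds the URL itself; it only resolves the link it found.
    return urljoin(base_url, parser.href) if parser.href else None


if __name__ == "__main__":
    doc = "<ul><li><a id='2010-03-20' href='/date?iso=2010-03-20'>20 March 2010</a></li></ul>"
    print(resolve_identifier(doc, "http://example.org/sitemap", "2010-03-20"))
```

The point of the sketch is that the "/date?iso=..." pattern lives in the server's document, so the server can change it without touching the client.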

        The key here is for the client to follow hypertext included in the
        representation which returns the "simple identifier", to learn how to
        dereference a URL containing the "simple identifier". While a
        "sitemap" could be used, that really just adds another round-trip
        between client and server.
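The form-based option described earlier can be sketched the same way: the client reads the form out of the representation it already has and fills in the values, with no extra round-trip. The form markup, the field names year, month and day, and all Python names here are assumptions for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin


class FirstForm(HTMLParser):
    """Collect the action and input names of the first <form> in a document."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.fields = []
        self._in_form = False
        self._done = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "form" and not self._done:
            self._in_form = True
            self.action = attributes.get("action") or ""
        elif tag == "input" and self._in_form and "name" in attributes:
            self.fields.append(attributes["name"])

    def handle_endtag(self, tag):
        if tag == "form" and self._in_form:
            self._in_form = False
            self._done = True


def build_url_from_form(representation, base_url, values):
    """Build a GET URL using only what the retrieved form declares."""
    parser = FirstForm()
    parser.feed(representation)
    parser.close()
    if parser.action is None:
        raise ValueError("representation contains no form")
    # Only fields the form names are sent, in the order the form lists them.
    query = urlencode([(name, values[name]) for name in parser.fields if name in values])
    target = urljoin(base_url, parser.action)
    return target + "?" + query if query else target


if __name__ == "__main__":
    page = ("<form method='get' action='/date'>"
            "<input name='year'><input name='month'><input name='day'></form>")
    print(build_url_from_form(page, "http://example.org/archive",
                              {"year": "2010", "month": "03", "day": "20"}))
```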

        What makes the Not REST example wrong is that the client is expected to
        know how to create the mapping using some knowledge outside of (not
        linked from) the dereferenced representation which contains the
        "simple identifier".
        For example, using a browser's client-side storage to cache a lookup
        table, and using script to access name-value pairs from that client-side
        storage for all subsequent requests.

        While such a solution would work, the problem is that some prior URI
        must be dereferenced to create this lookup table. A dereferenced
        representation containing a script which references client-side storage
        would fail, unless that prior URI had been dereferenced.

        When the condition is met, that the client can follow its nose (using
        hypertext) to find everything needed to render a representation
        dereferenced from some URI, then no prior knowledge is needed beyond
        the URI being dereferenced.

        OTOH, if the URI being dereferenced cannot be rendered without the
        client having prior knowledge of some other URI that the retrieved
        representation doesn't link to, that prior knowledge is out-of-band.

        To sum up, if your API requires me to first dereference some sort of
        sitemap, before dereferencing any other URIs will work, then your API
        must always be entered from the sitemap URI, instead of from any URI.

        -Eric