Re: [rest-discuss] Re: Meaning of stateless

  • Bill Venners
    Message 1 of 44 , Apr 18, 2006
      Hi John,

      On Apr 17, 2006, at 9:37 PM, John Elliot wrote:

      > c) A user who arrives at a web-site will initially be 'anonymous', yet
      > their very first request flags the beginning of a session. The very
      > first thing they do might be to 'add item to shopping basket', and the
      > very next thing they do 'login'. 'Shopping basket' is an application
      > layer function, and it is perfectly reasonable for a server to maintain
      > this state and associate it with a 'session'. Unfortunately, not all
      > 'anonymous' users can share the same shopping basket, and the transition
      > from 'anonymous' to 'joe' must migrate the shopping basket. The shopping
      > basket isn't associated with the 'user', it's associated with the
      > 'session'.
      >
      Another way to do the anonymous shopping cart is to give it a unique
      URI. When a guest goes to add an initial item to his or her shopping
      cart, the server stores it not in the session, but in a set of
      shopping cart snapshots, each of which has a unique ID. It then
      redirects the client to a URI that includes that shopping cart
      snapshot ID with one item in it. If they add another item, then the
      server creates another shopping cart snapshot with two items in it
      (which has a new ID), and redirects the client to a new URI with the
      second ID. The client could log in at that point, even using HTTP
      auth. Now we know this is Joe, and the next thing Joe does is add a
      third item to his shopping cart. The server creates a third shopping
      cart snapshot, and redirects the client to a URI that includes the
      unique ID of the third snapshot. In the third snapshot, the server
      associates Joe's user ID with the shopping cart, so only he and
      admins can look at it.

      Any user could look at the shopping cart with one or two items in it,
      because it was created anonymously (before Joe logged in), but if you
      use an opaque, hard-to-guess token for the ID, then it is highly
      unlikely that anyone would land there, even if they were trying. And
      even if they did, all they would learn is that someone put in these
      two items. If they added a third item to the shopping cart, they
      would essentially bifurcate the shopping cart. They would get a new
      shopping cart snapshot with a new ID that would be included in the
      URI to which they get redirected.
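
      The snapshot idea above can be sketched roughly as follows. This is
      only an illustration under stated assumptions: the SnapshotStore
      class, the /carts/ URI prefix, and the in-memory dictionary are all
      hypothetical, and a real server would persist snapshots and issue an
      actual HTTP redirect to the returned URI.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CartSnapshot:
    items: tuple          # immutable contents of this snapshot
    owner: Optional[str]  # None means the snapshot was created anonymously


class SnapshotStore:
    """Hypothetical in-memory store of immutable shopping cart snapshots."""

    def __init__(self):
        self._snapshots = {}

    def get(self, snapshot_id, user=None, is_admin=False):
        """Return the snapshot, or None if it is missing or not visible."""
        snap = self._snapshots.get(snapshot_id)
        if snap is None:
            return None
        # Anonymous snapshots are visible to anyone who knows the ID;
        # owned snapshots only to the owner and to admins.
        if snap.owner is not None and snap.owner != user and not is_admin:
            return None
        return snap

    def add_item(self, item, parent_id=None, user=None):
        """Create a NEW snapshot (parent plus item) and return its URI.

        The parent snapshot is never modified, so two clients adding to
        the same parent simply get two different child snapshots.
        """
        items = ()
        if parent_id is not None:
            parent = self.get(parent_id, user=user)
            if parent is None:
                raise KeyError(parent_id)
            items = parent.items
        snapshot_id = secrets.token_urlsafe(16)  # opaque, hard-to-guess ID
        self._snapshots[snapshot_id] = CartSnapshot(items + (item,), user)
        return f"/carts/{snapshot_id}"  # assumed URI layout; redirect here
```

      Because each snapshot is immutable, a stranger who stumbles on an
      anonymous snapshot and adds an item gets a new snapshot of their own
      and leaves the original untouched.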

      To me the biggest lesson I learned from reading about REST is that I
      don't need session state with HTTP. I can model everything as
      resources with unique URIs, some of which require authentication and
      authorization. The kind of session I do want is an authentication
      session, which simply means the user doesn't have to provide
      credentials to the client each time they make an HTTP request. The
      user can provide credentials once and then be automatically
      authenticated on each subsequent request during their authentication
      session. Cookies with a fallback on URL rewriting seem to work just
      fine for holding this authentication token/session ID. You can even
      have the same user working with two different instances of the
      application from one client this way. I.e., Joe could actually be
      adding different items to two different shopping carts at the same
      time from the same browser.

      > If you know the 'object' and the 'subject' of a request, then along with
      > a single verb you have all of the information necessary to create the
      > context in which to generate an appropriate response.
      >
      Well, there are other possibilities. If a user has landed on one of
      our sites after searching for "rest relaxation," I want to highlight
      those terms in the page. So I need to look at the Referer header. If
      they indicate via their Accept-Language header that they speak
      French, and they've requested an English version of an article for
      which I have a French translation, I want to add a prominent link
      to the French version to the English page I send back. I also want
      to try to detect requests that are coming from a device with a
      small screen, and include a prominent link to a mobile version of
      the content.
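
      To make the first two cases concrete, here is a rough sketch of
      reading those request headers. The function names, the "q" query
      parameter for the search engine, and the set of available
      translations are assumptions made for illustration; the header
      parsing is deliberately simplified and small-screen detection is
      left out.

```python
from urllib.parse import urlparse, parse_qs


def search_terms(referer, param="q"):
    """Return the search words found in the Referer URL's query string."""
    if not referer:
        return []
    query = parse_qs(urlparse(referer).query)
    return query.get(param, [""])[0].split()


def preferred_languages(accept_language):
    """Return language tags from Accept-Language, highest q-value first."""
    ranked = []
    for part in (accept_language or "").split(","):
        tag, _, params = part.strip().partition(";")
        if not tag:
            continue
        q = 1.0  # a tag without an explicit q-value counts as 1.0
        name, _, value = params.strip().partition("=")
        if name.strip() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0
        ranked.append((q, tag.strip().lower()))
    ranked.sort(key=lambda pair: -pair[0])  # stable: ties keep header order
    return [tag for q, tag in ranked if q > 0]


def translation_to_link(accept_language, served, translations):
    """Return a translation the reader prefers over the served language,
    or None if the served language is already their best match."""
    for tag in preferred_languages(accept_language):
        primary = tag.split("-")[0]
        if primary == served:
            return None
        if primary in translations:
            return primary
    return None
```

      For instance, a request for the English article that arrives with
      "fr-CA, fr;q=0.8, en;q=0.5" would get the French link, while one
      with "en-US, en;q=0.9" would not.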

      (The following snippet is from your follow-up email.)

      > The only thing I'm trying to point out here is that 'session id' should
      > not be in the URL, but it needs to be there. It should be a transparent
      > part of the universal uniform interface, and it should be able to stand
      > as the 'subject' of a request.
      >
      > If this happens, then we can move toward using the 'subject' *and* the
      > 'object' of the request to key a cache of what are otherwise
      > non-cacheable responses to HTTP GET.
      >
      I don't believe this is either necessary or desirable. It isn't
      necessary because you can use ETags to identify different
      representations, including personalized representations, of the same
      resource to caches. It isn't desirable both because many sessions can
      share the same representations, and because I may want to send
      multiple representations for the same URI and subject based on other
      information in the request (such as Referer or Accept-Language). By
      using ETags, which already exist in HTTP 1.1, I can effectively
      identify and cache each representation, and that mechanism will
      always be more flexible than keying on the session. In the extreme
      case, if you really have a different representation for each session,
      then you can include the session ID in the ETag. But most of the time
      what you probably really have in that case is a different
      representation per user, not per session, so you could include a user
      ID in the ETag. To the extent possible, though, it is better to
      minimize the number of resources that have so many representations,
      because the fewer the representations, the more scalability benefit
      you get from HTTP caching.
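
      As a rough sketch of the ETag approach, assuming a hypothetical
      make_etag helper and a resource version string that the server
      already tracks: the tag is derived only from the inputs that
      actually change the representation, so every session that shares
      those inputs shares one cacheable representation.

```python
import hashlib


def make_etag(resource_version, signed_in, language):
    """Build an ETag from only the inputs that vary the representation."""
    key = f"{resource_version}|{'in' if signed_in else 'anon'}|{language}"
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()[:16]
    return f'"{digest}"'  # entity tags are quoted strings


def respond(if_none_match, etag, render):
    """Return (status, body): 304 if the client already holds this
    representation, otherwise 200 with a freshly rendered body."""
    candidates = []
    for tag in (if_none_match or "").split(","):
        tag = tag.strip()
        if tag.startswith("W/"):  # If-None-Match compares weakly
            tag = tag[2:]
        candidates.append(tag)
    if etag in candidates or "*" in candidates:
        return 304, b""
    return 200, render()
```

      A per-user representation would simply add the user ID to the key,
      at the cost of one cache entry per user. Shared caches would also
      need to be told which request headers select the representation
      (HTTP's Vary header), which this sketch does not cover.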

      For example, I do want to say, "Welcome, John" on the top of every
      page once you've signed in. But I'm currently planning on attempting
      to have only two representations of each page, one for signed in
      users and one for anonymous. The signed-in representation will have
      some JavaScript that grabs the welcome greeting and inserts it
      dynamically on the page. If a client doesn't have JavaScript enabled,
      then they still see they are signed in, because that's one of the two
      representations sent from the server, but they won't see their name
      in a welcome message. To me that's a graceful degradation for non-
      JavaScript clients that I'm willing to accept in exchange for
      improved cache effectiveness. (The signed in/anonymous
      representations only work for clients with cookies enabled. Otherwise
      I have to fall back to URL rewriting, where the caching will be less
      effective, because I'll have a different URI per resource per session.)

      I'm going to try to take the same kind of JavaScript approach with
      search keyword highlighting and links to translations, but I may just
      send those variations as representations from the server if
      JavaScript turns out to be problematic for those use cases. One
      attitude I've heard on this list is that if it isn't cacheable it
      isn't scalable. To me, caching is one tool in the scalability
      toolbox, but not the only one. Another example is using URL-rewriting
      if cookies aren't enabled on the client. Yes, this doesn't make as
      effective use of caching as cookies or even HTTP auth could, but if
      it allows 5% more users to use the site effectively, then it is
      useful. Maybe because of the less effective caching on that 5% you
      need to add one more node to your server cluster.

      In summary, I think ETags provide a very flexible solution to caching
      multiple representations of the same resource, and that by using a
      different URI for each bit of new or changed state resulting from
      each HTTP request, you don't need session state. One thing I don't
      understand yet is why everyone says cookies are so evil. Is it really
      cookies or how they are used that is evil? I can see how having
      session state, which can be identified via a cookie, can degrade
      caching effectiveness and break the back button. I can also see
      privacy problems with persistent cookies. But from every REST-
      proponent's disdain for cookies, I feel I must be missing something.
      What is really wrong with cookies?

      Bill
      ----
      Bill Venners
      Editor-in-Chief
      Artima Developer
      http://www.artima.com
    • Jon Hanna
      Message 44 of 44 , Apr 25, 2006
        Elliotte Harold wrote:
        > John Elliot wrote:
        >
        >
        >>Does anyone have any data (based on recent bitter experience) on the
        >>real practical upper limit on URL size on the web as it exists at this
        >>point in time?
        >>
        >
        >
        > The only limit I've hit in the last few years was in PHP, a server side
        > framework. I had a huge URL holding up to a couple of thousand fields in
        > the query string, and PHP crapped out; but the browsers could all handle it.
        >
        I had a similar experience with ASP and ASP.NET, though in that case
        it was IIS being set up by the ISP to be paranoid about long URIs
        (to guard against some sort of buffer-overflow attempt, I think). I
        do think there are limits even without such a set-up, but they were
        fine for what I was doing once I called in a favour to have the
        ISP's rules bent.