Re: [rest-discuss] Determining which Media type for post/put

  • Mike Kelly
    Message 1 of 82 , Jun 1, 2010
      Hi Eric,

      Comments in line

      On Tue, Jun 1, 2010 at 5:51 PM, Eric J. Bowman <eric@...> wrote:
      Mike Kelly wrote:
      >
      > Actually, the identifying function of HTTP is URI + any control data.
      >

      Absolutely NOT.  URIs identify _resources_ the control data is used to
      select _representations_ and the two are _not_ the same thing.


      You sound like you're agreeing with me. The way Seb uses the term 'identifying function' implied we were talking about representations, not resources, which is what I was addressing:

      I don't know if you've ever had to develop a non-trivial hypermedia-driven application that needs to service (amongst other clients) browsers via HTML, but this conflation of resource and representation is *exactly* the problem that I am taking issue with. You can't make a browser negotiate any other type of representation over HTML, which means you end up having to pretend representations are resources and ignore negotiation altogether in order to make the representations accessible to browsers.

       

      >
      > If the type attribute in links wasn't designed that way.. What
      > exactly is the point of it, if it is not intended to affect client
      > behavior? There is an argument that if the type attribute wasn't
      > designed to support that case then a mistake was made and it was
      > poorly defined.
      >

      The point of it is to allow us to self-document our APIs.


      What does that even mean? What is the objective of doing that? What are you documenting if, as you're suggesting, it doesn't make any mechanical difference?

       
      It is a
      violation of both the layered-system and identification of resources
      constraints to use @type in any other way.


      AFAIK, this has nothing to do with either of those constraints.

       
      The server is not to dictate
      to the client what media types are acceptable to the client.


      Sure, sure. Unfortunately, the reality is that users of browsers care about certain representations of resources depending on the context, and the solution used in the browser+HTML world right now is to link *directly* to a media-type-specific URI. So in practice it is actually *no different at all*, and is in fact a much worse solution: the link itself is less descriptive to the client (the client has no idea the link is intended to be media-type specific, since URIs are opaque), and the interaction is less visible to intermediaries (since no negotiation is taking place).

       

      If you need to directly reference a specific variant, assign it a URI
      and send *that* to the client.  THAT is the solution.  It works.  There
      is no "problem" left to be solved by borking @type.


      I take it you haven't tried designing a RESTful system that handles browser clients, then.

      Cheers,
      Mike
    • Eric J. Bowman
      Message 82 of 82 , Jun 17, 2010
        Peter Williams wrote:
        >
        > > Using Content-Location, we can associate one application/xhtml+xml
        > > variant with multiple combinations of selection headers, i.e. a
        > > one-to-many mapping.  This can't be done without some means of
        > > distinguishing one variant from another, without sniffing content.
        >
        > Providing a `content-location` allows more efficient caching by
        > allowing mapping a variety of selection headers to a single entity in
        > caches. Agreed. On the other hand, vigorous use of `etag` would
        > provide similar improvements to the cache hit rate. It is a big step
        > from "Content-Location can improve cache hit rates" to, "conneg is
        > useless without Content-Location".
        >

        My position is that assigning URIs to variants is both a REST constraint
        and HTTP best-practice. I haven't said "conneg is useless without
        Content-Location," particularly as I've kept saying "except for
        caching"... I get your meaning, though, but "Content-Location can
        improve cache hit rates" is your strawman, not my position.

        Over the course of the thread, I may have staked out too rigid a
        position, that the only way to distinguish variants from one another is
        by assigning Content-Location URIs to them. You are correct, Etag may
        be used to distinguish variants, and this can increase cache hit rates
        even when Content-Location is absent.

        But, this does not follow REST, so it does not change my advice...

        >
        > A conforming cache will not respond with an inappropriate
        > representation if the server sends an appropriate `vary` header.
        >

        OK. I was giving one example of aberrant cache behavior, which doesn't
        apply to the specifics of using Etag in combination with Vary. My way
        of doing things is to make my system compliant with HTTP 1.0 caches to
        the fullest extent possible, because last I heard there were still
        plenty of HTTP 1.0 caches deployed out there on the real-world Web.

        So to my way of thinking, conneg should work independently of caching
        scheme, i.e. Etag or Expires both work when Vary is combined with
        Content-Location... which is probably another reason for that SHOULD.

        >
        > (Though it might miss a valid chance to serve a cached entity.)
        >

        The other drawback to relying on Etag to cover for a missing Content-
        Location, is that on the real-world, anarchically-scalable Web, myriad
        cases exist where a cache may legitimately decide to serve a stale
        representation. This loss of control is the tradeoff to caching. By
        omitting Content-Location, you're preventing the cache from identifying
        the proper variant to send, forcing it to contact the origin server,
        which presumably it had good reason to avoid doing (like if that server
        is unavailable from the cache's location). When Content-Location is
        omitted, much uncertainty is introduced which is otherwise avoided by
        following the SHOULD.
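
To make the one-to-many mapping discussed above concrete, here is a toy Python model. It is only a sketch under stated assumptions, not how Squid or any real cache is implemented: the class, its method names, and the URIs are all hypothetical, and keying on the raw Accept string is a deliberate simplification of real selection-header handling.

```python
class ToyVariantCache:
    """Toy model: many selection-header combinations map to one stored variant."""

    def __init__(self):
        self.entities = {}   # variant URI (from Content-Location) -> body
        self.selection = {}  # (request URI, Accept value) -> variant URI

    def store(self, request_uri, accept, content_location, body):
        # The variant is stored once, under its own URI.
        self.entities[content_location] = body
        # Each distinct selection-header value just points at that URI.
        self.selection[(request_uri, accept)] = content_location

    def lookup(self, request_uri, accept):
        variant_uri = self.selection.get((request_uri, accept))
        if variant_uri is None:
            return None  # unknown combination: would have to ask the origin
        return self.entities[variant_uri]


cache = ToyVariantCache()
cache.store("/report", "application/xhtml+xml", "/report.xhtml", b"<html/>")
cache.store("/report", "application/xhtml+xml, */*;q=0.1", "/report.xhtml", b"<html/>")
```

In this model the two different Accept values share a single stored entity because both responses named the same variant URI. Without that URI, the model would need some other key (an entity tag, say) to know the two bodies are the same variant.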

        >
        > Private caches at the user agent are less susceptible to selection
        > criteria explosion. Repeated requests from a single user agent are
        > likely to all be quite similar. In my experience private caches are
        > far more important than caching intermediates, anyway.
        >

        My experience disagrees with your experience. When I first started
        doing Web development in late 1993, it was by downloading Mosaic via my
        CompuServe account and creating pages on my local filesystem. My
        first experience with HTTP was in 1994, after I'd opened my own ISP. I
        was an early member of the Colorado Internet Cooperative Association,
        whose board consisted of most of the authors of "UNIX System
        Administration Handbook".

        One of whom was Evi (who had a second home in Steamboat Springs, but
        went with my non-coop competition because I only offered PPP and she
        demanded CSLIP), who, in her position as a professor at CU-Boulder, was
        instrumental in the student-led development of squid. The first anyone
        really ever heard of squid was at a coop meeting, to an ISP-dominated
        audience. So in my (heavily-ISP-weighted) experience, shared caches
        are far more important than private.

        But, this is just one preference vs. another. I do not take the view
        that REST constraints which don't apply to a particular system, are
        irrelevant. Thus, constraints intended to increase visibility to
        intermediary components are still part of the style, even when we only
        care about private caches which don't require us to follow such
        constraints.

        You are presenting an edge case of not caring about shared caches,
        showing that Content-Location isn't required. I cannot be persuaded
        that any edge case nullifies the best-practice advice I'm giving. I
        only agree that your edge case exists, not that you're better off by
        not meeting the identification of resources constraint.

        REST is the Platonic Ideal for the long-term development of a system --
        just because you're setting Cache-Control: private today, doesn't mean
        you shouldn't be able to change it tomorrow, by just changing the Cache-
        Control header. If your system wasn't designed with a long-term view
        of REST, then you can't just change Cache-Control, you must also add
        Content-Location.

        So what I'm saying is, start with Content-Location even if you don't
        see an immediate need for it. By making it your habit to follow this
        best practice, you'll never be left regretting having avoided it. Instead of
        tailoring my solutions to the specific needs of the system I'm
        developing, I follow REST and develop a Uniform Interface, because I
        know that works in the present and will continue to work in the future,
        so I won't have to re-architect any system in response to its evolving
        needs. Tweaking an existing system's headers is easier than adding new
        headers.

        >
        > `content-location` is a terribly useful header. Using it does
        > increase the cache hit rates for negotiated resources. However,
        > skipping `content-location` in a negotiated response does not violate
        > any of the REST constraints that i can see.
        >

        Variants are resources. As such, REST requires them to be identified,
        in order for one variant to be distinguishable from another. Etag does
        not meet this constraint, because Etags are transient, in that they
        change over time for any given representation. The purpose of
        assigning a URI is to declare a static mapping. This is why assigning
        URIs to variants is a best practice -- provide one URI for a set of
        Etagged entities to map to.

        In HTTP, REST's requirement of assigning URIs to variants is reflected
        in the SHOULD about Content-Location. So to apply REST in HTTP, the
        SHOULD is followed. You are pointing to an edge case, where avoiding
        Content-Location can still be made to work. But you haven't explained
        why minting those URIs is undesirable, i.e. "works without it" does not
        justify avoiding Content-Location. "Compression" justifies avoiding
        Content-Location, i.e. ignoring the SHOULD, but I still haven't seen
        any other case where that SHOULD shouldn't be taken as a MUST (if, that
        is, you're following REST and applying the identification of resources
        constraint).
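
As a rough illustration of what assigning URIs to variants might look like on the origin server, here is a minimal Python sketch. The variant table, the URIs, and the function names are all made up for the example, and the Accept parsing is simplified (it ignores specificity precedence and media-type parameters other than q), so treat it as a sketch rather than a conforming implementation.

```python
# Hypothetical variant table for one negotiated resource, e.g. /report.
# Every media type -> URI pairing here is an assumption for illustration.
VARIANTS = {
    "application/xhtml+xml": "/report.xhtml",
    "text/html": "/report.html",
    "application/json": "/report.json",
}


def parse_accept(header):
    """Return (media_range, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_range, q))
    # sorted() is stable, so ties keep the order the client listed them in.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def matches(media_range, media_type):
    if media_range == "*/*":
        return True
    if media_range.endswith("/*"):
        return media_type.startswith(media_range[:-1])
    return media_range == media_type


def negotiate(accept_header, variants=VARIANTS):
    """Pick a variant and name its URI in Content-Location."""
    for media_range, q in parse_accept(accept_header):
        if q <= 0:
            continue
        for media_type, uri in variants.items():
            if matches(media_range, media_type):
                return 200, {
                    "Content-Type": media_type,
                    "Content-Location": uri,  # static URI for this variant
                    "Vary": "Accept",
                }
    return 406, {"Vary": "Accept"}
```

Because each variant has its own URI in the table, the same URIs can also be dereferenced directly to inspect a variant without going through the selection headers at all.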

        I still wouldn't want to touch a non-compression conneg system that
        avoids Content-Location with a ten-foot pole. There is no simpler way
        to develop and maintain a conneg system, than to assign URIs to
        variants (except for compression), even if those URIs aren't exposed
        beyond the firewall. I've developed enough conneg systems to know that
        at some point, most likely more than one point, I will need to examine
        variants directly, bypassing the negotiation mechanism entirely (as
        opposed to testing the mechanism by altering selection headers).

        To me, this is a stronger argument than any edge case where Content-
        Location isn't technically needed by a caching scheme -- I don't care,
        assign URIs to your variants anyway, because REST requires it, and
        because it would be insane to develop and maintain a conneg system
        without doing so (except for compression). Spoken from experience.

        There is still no downside to assigning URIs to variants, so I still
        don't see the point in examining edge cases. Why *not* assign URIs to
        variants? What is it we're so desperately trying to avoid here, that we
        would disregard best practice by ignoring RFC 2616's SHOULD? Not
        caring about shared caching isn't a reason, particularly given that
        this is rest-discuss, where our concern is targeting the sweet-spot in
        the deployed Web which allows anarchic scalability (shared caching).

        The identification of resources constraint, applied in HTTP by using
        Content-Location to assign URIs to variants, allows for anarchic
        scalability. Edge cases where that level of scalability isn't
        required are not sufficient reason not to apply the constraint anyway,
        and don't change best practice. Best practice in REST is to apply REST
        constraints and follow HTTP. Assigning URIs to variants is required by
        REST and strongly recommended as best practice by HTTP. Even if
        avoiding this has no downside today, REST development means not assuming
        that tomorrow's needs are the same as today's; design for the future.

        So the only advice I can give about assigning URIs to variants, is to
        do just exactly that. There is no REST argument *against* doing so,
        and a key REST constraint will be met by following this best practice.
        This really is as simple as the black-and-white clarity of the advice I
        keep giving. Even if one doesn't understand it, I promise you that it's
        far easier to learn REST by implementing best practices and learning
        from them, than trying to learn REST by avoiding best practices in one's
        implementations, then trying to reconcile the results with REST ex post
        facto.

        REST should be any Web system's long-term goal. I don't fault a system
        for not implementing a constraint, if applying the constraint carries
        an immediate cost which outweighs the constraint's long-term benefits.
        This is not such a case. Identification of resources is fundamental,
        and has no costs to implement. I would even say that to avoid
        assigning URIs to variants, carries greater immediate costs (in terms
        of development hours alone) than are incurred by assigning them. So I
        still don't see any theoretical or cost-benefit reasons to avoid
        assigning URIs to variants.

        -Eric