15720 Re: [rest-discuss] Determining which Media type for post/put
Jun 17 1:16 PM

Peter Williams wrote:
> > Using Content-Location, we can associate one application/xhtml+xml
> > variant with multiple combinations of selection headers, i.e. a
> > one-to-many mapping. This can't be done without some means of
> > distinguishing one variant from another, without sniffing content.
> Providing a `content-location` allows more efficient caching by
> allowing mapping a variety of selection headers to a single entity in
> caches. Agreed. On the other hand, vigorous use of `etag` would
> provide similar improvements to the cache hit rate. It is a big step
> from "Content-Location can improve cache hit rates" to, "conneg is
> useless without Content-Location".

My position is that assigning URIs to variants is both a REST constraint
and HTTP best-practice. I haven't said "conneg is useless without
Content-Location," particularly as I've kept saying "except for
caching"... I get your meaning, though, but "Content-Location can
improve cache hit rates" is your strawman, not my position.
Over the course of the thread, I may have staked out too rigid a
position, that the only way to distinguish variants from one another is
by assigning Content-Location URIs to them. You are correct, Etag may
be used to distinguish variants, and this can increase cache hit rates
even when Content-Location is absent.
But, this does not follow REST, so it does not change my advice...
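
To make the one-to-many mapping concrete, here is a minimal sketch, not anyone's actual server. It assumes a hypothetical resource with two variants at made-up URIs (/report.xhtml and /report.html), and a deliberately simple Accept parser that ignores wildcards. Different Accept headers land on the same variant, and the response names that variant with Content-Location alongside Vary.

```python
# Hypothetical variant table: media type -> variant URI (illustrative names only).
VARIANTS = {
    "application/xhtml+xml": "/report.xhtml",
    "text/html": "/report.html",
}
DEFAULT_TYPE = "text/html"  # assumed fallback when nothing acceptable matches


def parse_accept(header):
    """Return the media types in an Accept header, highest q-value first.

    Simplified on purpose: wildcards such as */* are not expanded.
    """
    ranked = []
    for part in header.split(","):
        fields = part.strip().split(";")
        media_type = fields[0].strip()
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                q = float(value)
        ranked.append((q, media_type))
    # sorted() is stable, so equal q-values keep their header order.
    ranked = sorted(ranked, key=lambda pair: pair[0], reverse=True)
    return [media_type for q, media_type in ranked if q > 0]


def negotiate(accept):
    """Pick a variant and return the response headers that identify it."""
    chosen = DEFAULT_TYPE
    for media_type in parse_accept(accept):
        if media_type in VARIANTS:
            chosen = media_type
            break
    return {
        "Content-Type": chosen,
        "Content-Location": VARIANTS[chosen],  # the variant's own URI
        "Vary": "Accept",                      # the selection header used
    }


h1 = negotiate("application/xhtml+xml,text/html;q=0.9")
h2 = negotiate("text/html;q=0.5, application/xhtml+xml;q=0.8")
```

Both h1 and h2 carry the same Content-Location even though the request headers differ, which is the association a cache can use without sniffing content.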
> A conforming cache will not respond with an inappropriate
> representation if the server sends an appropriate `vary` header.

OK. I was giving one example of aberrant cache behavior, which doesn't
apply to the specifics of using Etag in combination with Vary. My way
of doing things is to make my system compliant with HTTP 1.0 caches to
the fullest extent possible, because last I heard there were still
plenty of HTTP 1.0 caches deployed out there on the real-world Web.
So to my way of thinking, conneg should work independently of caching
scheme, i.e. Etag or Expires both work when Vary is combined with
Content-Location... which is probably another reason for that SHOULD.
> (Though it might miss a valid chance to serve a cached entity.)

The other drawback to relying on Etag to cover for a missing
Content-Location is that on the real-world, anarchically-scalable Web, myriad
cases exist where a cache may legitimately decide to serve a stale
representation. This loss of control is the tradeoff to caching. By
omitting Content-Location, you're preventing the cache from identifying
the proper variant to send, forcing it to contact the origin server,
which presumably it had good reason to avoid doing (like if that server
is unavailable from the cache's location). When Content-Location is
omitted, much uncertainty is introduced which is otherwise avoided by
following the SHOULD.
> Private caches at the user agent are less susceptible to selection
> criteria explosion. Repeated requests from a single user agent are
> likely to all be quite similar. In my experience private caches are
> far more important than caching intermediates, anyway.

My experience disagrees with your experience. When I first started
doing Web development in late 1993, it was by downloading Mosaic via my
Compuserve account, and creating pages on my local filesystem. My
first experience with HTTP was in 1994, after I'd opened my own ISP. I
was an early member of the Colorado Internet Cooperative Association,
whose board consisted of most of the authors of "UNIX System
One of whom was Evi (who had a second home in Steamboat Springs, but
went with my non-coop competition because I only offered PPP and she
demanded CSLIP), who, in her position as a professor at CU-Boulder, was
instrumental in the student-led development of squid. The first anyone
really ever heard of squid was at a coop meeting, to an ISP-dominated
audience. So in my (heavily-ISP-weighted) experience, shared caches
are far more important than private.
But, this is just one preference vs. another. I do not take the view
that REST constraints which don't apply to a particular system, are
irrelevant. Thus, constraints intended to increase visibility to
intermediary components are still part of the style, even when we only
care about private caches which don't require us to follow such constraints.
You are presenting an edge case of not caring about shared caches,
showing that Content-Location isn't required. I cannot be persuaded
that any edge case nullifies the best-practice advice I'm giving. I
only agree that your edge case exists, not that you're better off by
not meeting the identification of resources constraint.
REST is the Platonic Ideal for the long-term development of a system --
just because you're setting Cache-Control: private today, doesn't mean
you shouldn't be able to change it tomorrow, by just changing the Cache-
Control header. If your system wasn't designed with a long-term view
of REST, then you can't just change Cache-Control, you must also add
So what I'm saying is, start with Content-Location even if you don't
see an immediate need for it. By making it your habit to follow this
best practice, you'll never have to regret having skipped it. Instead of
tailoring my solutions to the specific needs of the system I'm
developing, I follow REST and develop a Uniform Interface, because I
know that works in the present and will continue to work in the future,
so I won't have to re-architect any system in response to its evolving
needs. Tweaking an existing system's headers is easier than adding new
> `content-location` is a terribly useful header. Using it does
> increase the cache hit rates for negotiated resources. However,
> skipping `content-location` in a negotiated response does not violate
> any of the REST constraints that i can see.

Variants are resources. As such, REST requires them to be identified,
in order for one variant to be distinguishable from another. Etag does
not meet this constraint, because Etags are transient, in that they
change over time for any given representation. The purpose of
assigning a URI is to declare a static mapping. This is why assigning
URIs to variants is a best practice -- provide one URI for a set of
Etagged entities to map to.
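
A small sketch of that distinction, under stated assumptions: the variant URI is a made-up constant, and the Etag is derived from a hash of the body, which is only one of many ways a server might generate it. The Etag changes whenever the representation changes, while the URI stays put.

```python
import hashlib

# Hypothetical, fixed URI for the XHTML variant (illustrative name only).
VARIANT_URI = "/report.xhtml"


def etag_for(body: bytes) -> str:
    """Assumed Etag scheme: a quoted SHA-1 of the entity body."""
    return '"' + hashlib.sha1(body).hexdigest() + '"'


def describe(body: bytes) -> dict:
    """Headers identifying one snapshot of the variant."""
    return {"Content-Location": VARIANT_URI, "ETag": etag_for(body)}


# Two snapshots of the same variant at different times.
monday = describe(b"<p>draft</p>")
tuesday = describe(b"<p>final</p>")
```

Here monday and tuesday share a Content-Location but carry different Etags, so the URI is the one stable name that the set of Etagged entities maps to.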
In HTTP, REST's requirement of assigning URIs to variants is reflected
in the SHOULD about Content-Location. So to apply REST in HTTP, the
SHOULD is followed. You are pointing to an edge case, where avoiding
Content-Location can still be made to work. But you haven't explained
why minting those URIs is undesirable, i.e. "works without it" does not
justify avoiding Content-Location. "Compression" justifies avoiding
Content-Location, i.e. ignoring the SHOULD, but I still haven't seen
any other case where that SHOULD shouldn't be taken as a MUST (if, that
is, you're following REST and applying the identification of resources
constraint).
I still wouldn't want to touch a non-compression conneg system that
avoids Content-Location with a ten-foot pole. There is no simpler way
to develop and maintain a conneg system, than to assign URIs to
variants (except for compression), even if those URIs aren't exposed
beyond the firewall. I've developed enough conneg systems to know that
at some point, most likely more than one point, I will need to examine
variants directly, bypassing the negotiation mechanism entirely (as
opposed to testing the mechanism by altering selection headers).
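
As a rough illustration of what examining variants directly can look like, the sketch below assumes a toy in-memory store and made-up paths (/report, /report.xhtml, /report.html). The negotiated URI picks a variant from Accept with a naive exact match that ignores q-values; the variant URIs return their entity with no negotiation at all.

```python
# Hypothetical stored variants, keyed by their own URIs (illustrative only).
STORE = {
    "/report.xhtml": ("application/xhtml+xml", b"<html xmlns='http://www.w3.org/1999/xhtml'/>"),
    "/report.html": ("text/html", b"<html></html>"),
}
# Hypothetical negotiated resource: media type -> variant URI.
NEGOTIATED = {
    "/report": {
        "application/xhtml+xml": "/report.xhtml",
        "text/html": "/report.html",
    },
}


def handle(path, accept="text/html"):
    """Return (status, headers, body) for a GET on path."""
    if path in STORE:
        # Variant URI: serve it as-is, bypassing negotiation entirely.
        content_type, body = STORE[path]
        return 200, {"Content-Type": content_type}, body
    if path in NEGOTIATED:
        choices = NEGOTIATED[path]
        # Naive selection: first listed media type that matches exactly.
        for part in accept.split(","):
            media_type = part.split(";")[0].strip()
            if media_type in choices:
                uri = choices[media_type]
                content_type, body = STORE[uri]
                headers = {
                    "Content-Type": content_type,
                    "Content-Location": uri,
                    "Vary": "Accept",
                }
                return 200, headers, body
        return 406, {}, b""
    return 404, {}, b""


direct = handle("/report.xhtml")
negotiated = handle("/report", "application/xhtml+xml")
```

The direct request needs no particular Accept header to reach the XHTML variant, which is the convenience being described: the variant can be fetched and inspected without touching the selection logic.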
To me, this is a stronger argument than any edge case where Content-
Location isn't technically needed by a caching scheme -- I don't care,
assign URIs to your variants anyway, because REST requires it, and
because it would be insane to develop and maintain a conneg system
without doing so (except for compression). Spoken from experience.
There is still no downside to assigning URIs to variants, so I still
don't see the point in examining edge cases. Why *not* assign URIs to
variants? What is it we're so desperately trying to avoid here, that we
would disregard best practice by ignoring RFC 2616's SHOULD? Not
caring about shared caching isn't a reason, particularly given that
this is rest-discuss, where our concern is targeting the sweet-spot in
the deployed Web which allows anarchic scalability (shared caching).
The identification of resources constraint, applied in HTTP by using
Content-Location to assign URIs to variants, allows for anarchic
scalability. Edge cases where that level of scalability isn't
required, are not sufficient reason not to apply the constraint anyway,
and don't change best practice. Best practice in REST is to apply REST
constraints and follow HTTP. Assigning URIs to variants is required by
REST and strongly recommended as best practice by HTTP. Even if
avoiding this has no downside today, REST development means not assuming
that tomorrow's needs are the same as today's; design for the future.
So the only advice I can give about assigning URIs to variants, is to
do just exactly that. There is no REST argument *against* doing so,
and a key REST constraint will be met by following this best practice.
This really is as simple as the black-and-white clarity of the advice I
keep giving. Even if one doesn't understand it, I promise you that it's
far easier to learn REST by implementing best practices and learning
from them, than trying to learn REST by avoiding best practices in one's
implementations, then trying to rectify the results with REST ex post
facto.
REST should be any Web system's long-term goal. I don't fault a system
for not implementing a constraint, if applying the constraint carries
an immediate cost which outweighs the constraint's long-term benefits.
This is not such a case. Identification of resources is fundamental,
and has no costs to implement. I would even say that to avoid
assigning URIs to variants, carries greater immediate costs (in terms
of development hours alone) than are incurred by assigning them. So I
still don't see any theoretical or cost-benefit reasons to avoid
assigning URIs to variants.