Right way to represent hierarchical relationship between resources

  • titel
    Message 1 of 17, Oct 26, 2011
      Hello there,

      I've just joined the REST Discuss group and I have several questions that I've still not found an answer for.

      One of these is about hierarchical relationships (many belong to one) in a RESTful API.

      For instance if there are two kinds of resources, articles and comments, with the following URLs:

      GET /articles - lists the articles
      GET /articles/:ID - shows one specific article

      GET /comment/:ID - gives back one comment

      what is the right way to have an API list the comments that belong to one of the articles?

      One option I thought of would be to have something like:

      GET /articles/:ID/comments

      But this doesn't feel quite right and it doesn't seem to scale if the nesting is more than one level deep.

      What are your thoughts on this?

      Constantin Tovisi
    • Alessandro Nadalin
      Message 2 of 17, Oct 26, 2011
        Hi Constantin,
        the way you structure your URLs does not really matter in REST,
        but, as many like me point out, it's cool to provide nice URLs to your
        clients if you correctly implement REST's hypermedia tenet.
        I still don't have a firm idea about your use case, but some months
        ago I was looking at HTSQL[1] to basically do this kind of thing.


        [1] http://htsql.org/doc/introduction.html

        --
        Nadalin Alessandro
        www.odino.org
        www.twitter.com/_odino_
      • Erik Mogensen
        Message 3 of 17, Oct 26, 2011
          On Wed, Oct 26, 2011 at 11:43 AM, titel <constantin.tovisi@...> wrote:
          > One of these is about hierarchical relationships (many belong to one) in a RESTful API.

          Hierarchies in a RESTful API should be expressed as links between parents and children, not by what the URLs look like.  Choose a URL structure that you think will not change for a few years.

          e.g. Amazon's web page for "Children's Books" might be structurally "beneath" the page for "Books".  A hierarchy if I ever saw one.  However, the URLs don't reflect this.  Instead, the page for Books includes a link to the "child" pages, and the "Children's Books" page links back up to the "Books" page.

          Likewise in an API, you should expose links between your hierarchical resources indicating how to go "up" and/or "down" the hierarchy.
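
A minimal sketch of what exposing "up" and "down" links could look like, assuming a JSON-style representation. The data, the URL layout, and the choice of the `up` and `item` relation names are illustrative assumptions, not something prescribed in this thread; the point is only that the parent/child relationship lives in the links, whatever the URLs happen to be.

```python
# Hypothetical in-memory data; IDs and URL layout are illustrative only.
ARTICLES = {1: {"title": "REST and hierarchies"}}
COMMENTS = {
    2: {"article_id": 1, "body": "Nice post"},
    3: {"article_id": 1, "body": "I disagree"},
}


def article_url(article_id):
    return f"/articles/{article_id}"


def comment_url(comment_id):
    return f"/comments/{comment_id}"


def article_representation(article_id):
    """Article plus "down" links, one per child comment."""
    children = [cid for cid, c in sorted(COMMENTS.items())
                if c["article_id"] == article_id]
    links = [{"rel": "self", "href": article_url(article_id)}]
    links += [{"rel": "item", "href": comment_url(cid)} for cid in children]
    return {"title": ARTICLES[article_id]["title"], "links": links}


def comment_representation(comment_id):
    """Comment plus an "up" link to the article it belongs to."""
    comment = COMMENTS[comment_id]
    return {
        "body": comment["body"],
        "links": [
            {"rel": "self", "href": comment_url(comment_id)},
            {"rel": "up", "href": article_url(comment["article_id"])},
        ],
    }
```

A client that only follows `rel` values keeps working if `comment_url` is later changed to return a nested path instead of a flat one.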

          There's an internet-draft for expressing these relations inside atom links, although IMHO the "up" link relation as it's defined is a bit limiting, since it says that the resource is a "list of parent resources", when the normal case would be (in a hierarchy) that an item only has a single parent.  I would have fixed it by having many "up" links when a resource belongs to different hierarchies, or is part of a directed graph.  But that's me.

          Here's the internet draft:

          --
          -mogsie-
        • Jason Erickson
          Message 4 of 17, Oct 26, 2011
            I keep hearing that the way you structure your URLs doesn't really matter in REST, but doesn't it affect caching? I could definitely be wrong about this; if anyone would set me straight I would appreciate it.

            Given two options:
            * Hierarchical: /articles/:ID/comments/:ID
            * Flat: /articles/:ID and /comments/:ID

            Say all are cacheable, and we have cached an article with ID=1 and a comment with ID=2. So my understanding is, with POST or PUT:
            * Hierarchical:
            ** POST /articles/1/comments will invalidate the cache for /articles/1/comments, but not for /articles/1
            ** POST /articles/ will invalidate the cache for /articles, /articles/1, /articles/1/comments and /articles/1/comments/2
            * Flat:
            ** POST /comments/ will invalidate the cache for /comments/ and /comments/2 and nothing else.
            ** POST /articles/ will invalidate the cache for /articles/ and /articles/1 and nothing else (leaving the cache for comments intact).

            First, is this supposed to be how things work (i.e. some kind of spec)?

            Second, whether there is a spec or not, is it actually how things work in proxies? (I know local browser caches or the iPhone cache don't really work that way.)

          • Erik Mogensen
            Message 5 of 17, Oct 26, 2011
              On Wed, Oct 26, 2011 at 8:10 PM, Jason Erickson <jason@...> wrote:
              > I keep hearing that the way you structure your URLs doesn't really matter in REST, but doesn't it affect caching?  I could definitely be wrong about this, if anyone would set me straight I would appreciate it.
              > [...]
              > ** POST /articles/ will invalidate the cache for /articles, /articles/1, /articles/1/comments and /articles/1/comments/2
              > [...]
              > First, is this supposed to be how things work (i.e. some kind of spec)?

              POST invalidating a resource [1] does not imply invalidating a "sub" resource (in the hierarchical sense of the URI Generic Syntax [2]).  A POST invalidates the request URI itself, and perhaps the URIs given in a Location or Content-Location header, but only those explicitly identified resources.  Imagine POSTing to the root resource ("/") of any server automatically invalidating all caches.  Crazy :-)
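
A rough sketch of the invalidation rule described above, assuming a cache modelled as a plain dictionary keyed by absolute URI. The function names and the example.org URLs are made up for illustration; real caches apply further conditions (for example, only acting on non-error responses).

```python
from urllib.parse import urljoin, urlsplit


def uris_to_invalidate(request_uri, response_headers):
    """URIs a cache would drop after an unsafe request such as POST or PUT.

    Only the request URI itself and, if present and on the same host,
    the Location and Content-Location URIs. Nothing "beneath" them.
    """
    targets = {request_uri}
    request_host = urlsplit(request_uri).netloc
    for name in ("Location", "Content-Location"):
        value = response_headers.get(name)
        if not value:
            continue
        absolute = urljoin(request_uri, value)  # resolve relative references
        if urlsplit(absolute).netloc == request_host:
            targets.add(absolute)
    return targets


def apply_unsafe_request(cache, request_uri, response_headers):
    """Remove exactly the invalidated entries from the cache dictionary."""
    for uri in uris_to_invalidate(request_uri, response_headers):
        cache.pop(uri, None)


cache = {
    "http://example.org/articles": "list v1",
    "http://example.org/articles/1": "article v1",
    "http://example.org/articles/1/comments": "comments v1",
}
apply_unsafe_request(cache, "http://example.org/articles",
                     {"Location": "/articles/2"})
```

After the POST, only the entry for `/articles` is gone; `/articles/1` and `/articles/1/comments` are still cached, even though their paths sit "under" the request URI.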

              URI Generic Syntax specifies that URIs use a hierarchical syntax, and that URIs have a hierarchical part (path component) and a non-hierarchical part (query component), but the primary role of a URI is to identify a resource.

              If you have a resource that is a logical hierarchy, it's tempting to re-use the hierarchy in the URI path component.  I would ask you if you're pretty sure that, three years from now, or ten years from now, your resources are still in the same hierarchy.

              > Second, whether there is a spec or not, is it actually how things work in proxies? (I know local browser caches or the iPhone cache don't really work that way.)

              I'm pretty sure proxies don't do this either. 

              -- 
              -mogsie-
            • Jason Erickson
              Message 6 of 17, Oct 26, 2011
                Thanks, that clears things up quite a bit for me and also shows that URL structure indeed does not matter.  So if I wanted the behavior of invalidating sub-resources, there's no way to do it in a guaranteed way and the only way to do it at all would be to ask the client (in documentation) to explicitly revalidate sub-resources.  

                Is there any way to tell a client in a response to invalidate any resources? (For example, I PUT to /articles/1 and I'd like to say in the response that the cached version of /articles/ and /articles/1/comments are not fresh.)

              • Erik Wilde
                Message 7 of 17, Oct 26, 2011
                  hello.

                  On 2011-10-26 15:41 , Jason Erickson wrote:
                  > Is there any way to tell a client in a response to invalidate any
                  > resources? (For example, I PUT to /articles/1 and I'd like to say in the
                  > response that the cached version of /articles/ and /articles/1/comments
                  > are not fresh.)

                  nope. a cache is not something you can manipulate at will from the
                  server. however, by serving the right metadata (modification dates
                  and/or etags) you can make sure that, should the client decide to interact
                  with /articles/ again, any cached copy will be stale. the important
                  thing here is that servers are unaware of the existence of caches, which
                  are simply optimizing intermediaries. the only thing that counts is that
                  client/server communications are designed in a way such that those
                  intermediaries can do their work as effectively as possible.
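
To make the "right metadata" point concrete, here is a small sketch of ETag-based revalidation on the server side. It assumes a strong ETag derived from a hash of the body and does a simple exact match on a single If-None-Match value; both are simplifications chosen for illustration, and the function names are hypothetical.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the representation bytes."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match=None):
    """Return (status, headers, payload) for a GET on the current body.

    If the validator sent by the client or cache still matches, answer
    304 with no payload; otherwise send the full representation.
    """
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match == etag:
        return 304, headers, b""
    return 200, headers, body
```

The server never has to know which caches exist. It only answers honestly when one of them asks whether its copy is still current.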

                  cheers,

                  dret.

                  --
                  erik wilde | mailto:dret@... - tel:+1-510-2061079 |
                  | UC Berkeley - School of Information (ISchool) |
                  | http://dret.net/netdret http://twitter.com/dret |
                • Erik Mogensen
                  Message 8 of 17, Oct 27, 2011
                    On Thu, Oct 27, 2011 at 1:37 AM, Erik Wilde <dret@...> wrote:
                    On 2011-10-26 15:41 , Jason Erickson wrote:
                    > Is there any way to tell a client in a response to invalidate any
                    > resources? (For example, I PUT to /articles/1 and I'd like to say in the
                    > response that the cached version of /articles/ and /articles/1/comments
                    > are not fresh.)

                    [...] the important
                    thing here is that servers are unaware of the existence of caches, which
                    are simply optimizing intermediaries.

                    Exactly.

                    The problem isn't directly restricted to PUT and invalidation, but a general invalidation issue.  It's one of the three big things that CS can't get right.  The other is off-by-one bugs.

                    Jason, imagine a server and two caches (that don't know about each other, because they're on the big old internet, or in two separate company intranets).  Two users work with the same origin server, but go through different caches. The caches will have different sets of resources cached, and may also invalidate resources because a POST went through.  So even if a resource _only_ ever is invalidated by means of a POST or PUT, caches still have to revalidate their cached items from time to time, since the POST might not go through that particular cache.
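
The two-cache scenario can be played through with a toy model. This is only a sketch: `Origin` and `Cache` are invented classes, freshness lifetimes are left out, and "revalidate" is reduced to refetching from the origin.

```python
class Origin:
    """Toy origin server holding the current state of each resource."""

    def __init__(self):
        self.resources = {"/articles/1": "v1"}

    def get(self, uri):
        return self.resources[uri]

    def post(self, uri, value):
        self.resources[uri] = value


class Cache:
    """Toy cache that knows nothing about any other cache."""

    def __init__(self, origin):
        self.origin = origin
        self.store = {}

    def get(self, uri):
        if uri not in self.store:
            self.store[uri] = self.origin.get(uri)
        return self.store[uri]

    def post(self, uri, value):
        self.store.pop(uri, None)  # invalidate the request URI only
        self.origin.post(uri, value)

    def revalidate(self, uri):
        self.store[uri] = self.origin.get(uri)
        return self.store[uri]


origin = Origin()
cache_a, cache_b = Cache(origin), Cache(origin)
cache_a.get("/articles/1")
cache_b.get("/articles/1")
cache_a.post("/articles/1", "v2")  # this POST never passes through cache_b
```

At this point `cache_a.get("/articles/1")` refetches and sees `v2`, while `cache_b.get("/articles/1")` still returns `v1` until `cache_b` revalidates.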

                    Allowing POST to invalidate more resources than the request URI only makes the invalidation issue harder, I believe.
                    --
                    -mogsie-
                  • Constantin Tovisi
                    Message 9 of 17, Oct 27, 2011
                      First of all, I want to thank everyone who has contributed so far. Secondly, I keep hearing that URLs don't matter in a RESTful system, and this is something that I'm aware of. I am also aware of HATEOAS and that the whole API should be 'browsable' through link relations.

                      However, people seem to get caught in this thing and not be able to look into my issue as a whole. So I'm going to try my luck one more time.

                      Regardless of the URL itself, let's consider the next example (continued from the example I started with):

                      GET {SOME_URL} - lists all the articles
                      GET {SOME_URL}/:ID - shows one specific article

                      GET {OTHER_URL} - lists all the comments

                      My question now is, how do you represent another resource that is related hierarchically to the previous one? I guess that my question ultimately comes down to: how would you get a list of resources belonging to another one in a RESTful system?

                      To go on with my example:

                      Does it make more sense to have another URL altogether where all comments belonging to one article reside?

                      GET {YET_ANOTHER_URL}/:ARTICLE_ID - shows comments belonging to article ARTICLE_ID

                      Or to have a single place where all comments are, and somehow filter the ones belonging to an article through something like a query string?

                      GET {OTHER_URL}?belongs_to=:ARTICLE_ID - lists all the comments that belong to the article with the ARTICLE_ID id

                      Constantin TOVISI

                      0752 860.612
                      constantin.tovisi@...



                    • Erlend Hamnaberg
                      Message 10 of 17, Oct 27, 2011
                        In HTTP there really is no way of expressing this relationship.
                        It's up to the hypermedia type to express such linking.

                        You could, however, express the relationship using link relations [1].

                        Consider Atom, and how it uses links to express an alternate version of an entry.

                        <feed>
                          ....

                          <entry>
                            ...
                            <link rel="alternate" href="some-href"/>
                          </entry>
                        </feed>

                        We could easily extend this to allow for comments in some way, assuming the comments are a feed as well.

                        <feed>
                          ....
                          <link rel="up" href="http://example.com/article/1"/>

                          <entry>
                            ...
                            <link rel="related" href="some-article-href"/>
                          </entry>
                        </feed>
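
A sketch of how a client might read those link relations, using Python's standard XML parser. The feed below is a filled-in variant of the snippet above: the Atom namespace declaration, the titles, and the concrete `related` href are assumptions added so that it parses, and `links_by_rel` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Illustrative comments feed; element values are made up.
COMMENTS_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Comments on article 1</title>
  <link rel="up" href="http://example.com/article/1"/>
  <entry>
    <title>First comment</title>
    <link rel="related" href="http://example.com/article/1"/>
  </entry>
</feed>"""


def links_by_rel(element, rel):
    """Return the href of every direct-child atom:link with the given rel."""
    return [link.get("href")
            for link in element.findall(ATOM + "link")
            if link.get("rel") == rel]


feed = ET.fromstring(COMMENTS_FEED)
parent_articles = links_by_rel(feed, "up")
first_entry = feed.find(ATOM + "entry")
related = links_by_rel(first_entry, "related")
```

The client finds the parent article by asking for `rel="up"`, without parsing or constructing the URL itself.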



                        --
                        Erlend

                      • Erik Mogensen
                        Message 11 of 17, Oct 27, 2011
                          On Thu, Oct 27, 2011 at 10:14 AM, Constantin Tovisi <constantin.tovisi@...> wrote:
                          > Does it make more sense to have another URL altogether where all comments belonging to one article reside?
                          >
                          > GET {YET_ANOTHER_URL}/:ARTICLE_ID - shows comments belonging to article ARTICLE_ID
                          >
                          > Or have a single place where all comments are, and somehow filter the ones belonging to an article through something like a query string.
                          >
                          > GET {OTHER_URL}?belongs_to=:ARTICLE_ID - lists all the comments that belong to the article with the ARTICLE_ID id

                          As a client side developer, you shouldn't know these things, you should discover them.  In the HTML case a browser gets
                               <a href="/yet-another-url/4534">Comments</a>
                          and follows the link. This is the REST ideal, that clients don't know how URIs are structured "a priori" but discover URIs (or their structure) at run time.  A client doesn't care if the URL happened to be
                               <a href="/other-url?belongs_to=4534">Comments</a>
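
A small sketch of a client that discovers the comments URL at run time instead of building it. It assumes the article page is HTML and that the link is identified by its text, "Comments", as in the two anchors above; a real API would more likely key on a link relation. `LinkFinder` and `find_comments_url` are hypothetical names.

```python
from html.parser import HTMLParser


class LinkFinder(HTMLParser):
    """Collects href values of <a> elements, keyed by their link text."""

    def __init__(self):
        super().__init__()
        self.links = {}
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links["".join(self._text).strip()] = self._href
            self._href = None


def find_comments_url(html):
    """Return whatever URL the server chose for the "Comments" link."""
    finder = LinkFinder()
    finder.feed(html)
    finder.close()
    return finder.links.get("Comments")
```

The same client code handles both URL styles, because it never looks inside the href it is given.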

                          As a server side developer, you must of course care about these things, and choosing one over the other has more to do with style and your own sense of longevity, e.g. what URI structure fits your scenario and current technology stack.

                          If you have full control over the client and the server, you can of course do what you choose, but then I would call it an HTTP API or even an RPC API, since that would be a more correct description.
                          --
                          -mogsie-
                        • Mike Kelly
                          Message 12 of 17, Oct 27, 2011
                            Yes you can do this.

                            Mark and I published a draft of a mechanism (LCI) which solves this exact problem:


                            Here's a blog post outlining how it works:


                            Cheers,
                            Mike

                            On Wed, Oct 26, 2011 at 11:41 PM, Jason Erickson <jason@...> wrote:
                            > Is there any way to tell a client in a response to invalidate any resources? (For example, I PUT to /articles/1 and I'd like to say in the response that the cached version of /articles/ and /articles/1/comments are not fresh.)
                          • Mike Kelly
                            Message 13 of 17, Oct 27, 2011
                              Invalidation mechanisms are useful for gateway (reverse proxy) caching layers.

                              If servers were completely unaware of intermediaries what would be the
                              purpose of the s-maxage cache-control directive?
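
For readers unfamiliar with the directive: `s-maxage` applies to shared caches only and takes precedence over `max-age` there. A simplified sketch of that selection follows; the parser is deliberately naive (it would mis-split quoted values containing commas), and the function names are made up.

```python
def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') or None
    return directives


def freshness_lifetime(cache_control, shared):
    """Freshness lifetime in seconds, or None if no explicit lifetime is given.

    A shared cache (proxy or gateway) prefers s-maxage; a private cache
    such as a browser ignores it and uses max-age.
    """
    directives = parse_cache_control(cache_control)
    if shared and directives.get("s-maxage") is not None:
        return int(directives["s-maxage"])
    if directives.get("max-age") is not None:
        return int(directives["max-age"])
    return None
```

With `Cache-Control: max-age=60, s-maxage=600`, a browser would treat the response as fresh for 60 seconds and a shared proxy for 600, which is a case of the server addressing intermediaries directly.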

                              Cheers,
                              Mike

                              On Thu, Oct 27, 2011 at 12:37 AM, Erik Wilde <dret@...> wrote:
                              > hello.
                              >
                              > On 2011-10-26 15:41 , Jason Erickson wrote:
                              >> Is there any way to tell a client in a response to invalidate any
                              >> resources? (For example, I PUT to /articles/1 and I'd like to say in the
                              >> response that the cached version of /articles/ and /articles/1/comments
                              >> are not fresh.)
                              >
                              > nope. a cache is not something you can manipulate at will from the
                              > server. however, by serving the right metadata (modification dates
                              > and/or etags) you can make sure that should the client decide to interact
                              > with /articles/ again, any cached copy will be stale. the important
                              > thing here is that servers are unaware of the existence of caches, which
                              > are simply optimizing intermediaries. the only thing that counts is that
                              > client/server communications are designed in a way such that those
                              > intermediaries can do their work as effectively as possible.
                              >
                              > cheers,
                              >
                              > dret.
                              >
                              > --
                              > erik wilde | mailto:dret@...  -  tel:+1-510-2061079 |
                              >            | UC Berkeley  -  School of Information (ISchool) |
                              >            | http://dret.net/netdret http://twitter.com/dret |
                            • Erik Wilde
                              Message 14 of 17 , Oct 27, 2011
                                hello.

                                On 2011-10-27 07:34 , Mike Kelly wrote:
                                > If servers were completely unaware of intermediaries what would be the
                                > purpose of the s-maxage cache-control directive?

                                i hope i did not sound as if servers were not aware of the fact that
                                there can be caches. of course they are, and that's the reason why
                                serving things correctly is so important. but apart from the one
                                scenario you're mentioning (origin server and tightly coupled reverse
                                proxy), servers have no way to tell if there are any intermediaries in
                                the chain and where they might be. all they can do is rely on the fact
                                that if there are any, they have to play by the rules.

                                http://tools.ietf.org/html/draft-nottingham-linked-cache-inv-00 works
                                around this by assuming that origin server and cache are tightly
                                coupled. since it adds to HTTP, you cannot rely on it unless you can
                                guarantee that all intermediaries understand it. which is close to
                                impossible outside of closed environments, but a valid assumption in a
                                controlled setting.

                                cheers,

                                dret.

                                --
                                erik wilde | mailto:dret@... - tel:+1-510-2061079 |
                                | UC Berkeley - School of Information (ISchool) |
                                | http://dret.net/netdret http://twitter.com/dret |
                              • Mike Kelly
                                Message 15 of 17 , Oct 27, 2011
                                  Right, the primary use for LCI is with gateway caches - just wanted to
                                  clarify that this is actually possible with certain types of cache,
                                  which didn't seem clear in your response.

                                  Fwiw, if the mechanism was adopted by browsers in their private
                                  caches; you could also rely on invalidation, to a lesser extent, for
                                  invalidating privately cached resources on the browser too. It's not a
                                  silver bullet, but could potentially allow you some more breathing
                                  room on your expiration lengths.

                                  I also forgot to mention that the httpbis draft has a similar (but
                                  limited) invalidation mechanism via the Content-Location and Location
                                  headers:
                                  http://tools.ietf.org/html/draft-ietf-httpbis-p6-cache-16#section-2.5

                                  Cheers,
                                  Mike

                                • Jason Erickson
                                  Message 16 of 17 , Oct 27, 2011
                                    This would be very useful in some scenarios for private caches.  The painful use case in my experience has nothing to do with federated caches and more to do with a particular client (usually a human) doing something and then being confused when other things don't reflect that change immediately.

                                    In my example (I PUT to /articles/1 and I'd like to say in the response that the cached versions of /articles/ and /articles/1/comments are not fresh) there are a few ways of handling it:
                                    1) Just don't allow caching of /articles and /articles/1/comments.
                                    2) In out-of-band documentation, indicate that a PUT, POST or DELETE to /articles/{id} will make /articles and /articles/{id}/comments (semantically) stale, and that the client should force revalidation of these manually to get the correct version.
                                    3) Use the Linked Cache Invalidation mechanism to indicate to the private cache that /articles and /articles/1/comments should be invalidated.
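
Option 2 could be sketched on the client side roughly as follows, in Python. The rule table simply hard-codes what the out-of-band documentation would say; the names `stale_after` and `revalidation_headers` are hypothetical, and the only HTTP feature relied on is the request directive "Cache-Control: no-cache", which forces caches to revalidate with the origin server.

```python
import re

# Hypothetical rule taken from the out-of-band documentation:
# a write to /articles/{id} makes /articles and /articles/{id}/comments stale.
ARTICLE = re.compile(r"^/articles/(?P<id>[^/]+)$")
UNSAFE = {"PUT", "POST", "DELETE"}


def stale_after(method, path):
    """Return the paths the client should treat as stale after this request."""
    match = ARTICLE.match(path)
    if method.upper() not in UNSAFE or match is None:
        return []
    return ["/articles", f"/articles/{match.group('id')}/comments"]


def revalidation_headers(path, stale):
    """Request headers for the next GET of `path`.

    If the path was marked stale, ask every cache on the way to
    revalidate with the origin instead of answering from storage.
    """
    return {"Cache-Control": "no-cache"} if path in stale else {}
```

After a PUT to /articles/1, the client would add the returned paths to a set and pass that set to `revalidation_headers` on its next GETs.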

                                    Since the private cache is implemented by a browser or a library (ostensibly) known to the client, the client can know whether its private cache understands the Linked Cache Invalidation mechanism.  If, in the future, the Linked Cache Invalidation mechanism is accepted and adopted, you could start to supplement option 2 (client responsibility) with option 3 (cache manager responsibility), although you couldn't replace 2 with 3 entirely.  The client could tell whether it needed to take responsibility for it or not.  Even if the client had to take responsibility, the documentation could refer to the Linked Cache Invalidation documentation and the client could use that to implement option 2.

                                    Right?

                                    On Oct 27, 2011, at 8:24 AM, Mike Kelly wrote:

                                    > Fwiw, if the mechanism was adopted by browsers in their private
                                    > caches; you could also rely on invalidation, to a lesser extent, for
                                    > invalidating privately cached resources on the browser too. It's not a
                                    > silver bullet, but could potentially allow you some more breathing
                                    > room on your expiration lengths.


                                  • Philipp Meier
                                    Message 17 of 17 , Nov 17, 2011
                                      On 27.10.2011 21:01, Jason Erickson wrote:

                                      > 3) Use the Linked Cache
                                      > Invalidation mechanism to indicate to the private cache that
                                      > /articles and /articles/1/comments should be invalidated.
                                      >
                                      > Since the private cache is implemented by a browser or a library
                                      > (ostensibly) known to the client, the client can know whether its
                                      > private cache understands the Linked Cache Invalidation mechanism. If,
                                      > in the future, the Linked Cache Invalidation mechanism is accepted and
                                      > adopted, you could start to supplement option 2 (client
                                      > responsibility) with option 3 (cache manager responsibility),
                                      > although you couldn't replace 2 with 3 entirely. The client could
                                      > tell whether it needed to take responsibility for it or not. Even if
                                      > the client had to take responsibility, the documentation could
                                      > refer to the Linked Cache Invalidation documentation and the
                                      > client could use that to implement option 2.

                                      In a browser environment you can access the response headers of
                                      XMLHttpRequest responses. Thus you can implement LCI there. Depending on
                                      your use case this might be a valid approach.
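
A rough sketch of that idea, written in Python for brevity rather than as browser script: after a write, read the Link response header, pick out the targets marked for invalidation, and drop them from a client-side cache. The relation name "invalidates" is my assumption of what the LCI draft uses and should be checked against the draft; the function names and the dict-based cache are hypothetical, and the Link parsing is deliberately simplistic.

```python
import re
from urllib.parse import urljoin

# Matches "<target>" followed by its parameters, up to the next comma.
LINK_RE = re.compile(r"<([^>]*)>([^,]*)")
REL_RE = re.compile(r'rel\s*=\s*"?([^";]*)"?')


def invalidated_targets(request_uri, link_header, rel="invalidates"):
    """Return absolute URIs of Link targets carrying the given relation."""
    targets = set()
    for target, params in LINK_RE.findall(link_header):
        match = REL_RE.search(params)
        if match and rel in match.group(1).split():
            targets.add(urljoin(request_uri, target))
    return targets


def apply_invalidation(cache, request_uri, link_header):
    """Evict every invalidated URI from `cache` (a dict keyed by URI)."""
    for uri in invalidated_targets(request_uri, link_header):
        cache.pop(uri, None)
```

In a browser the same logic would sit in the response handler, with the header value read from the XMLHttpRequest object.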

                                      Philipp Meier
                                      --
                                      404 signature not found