
Denormalizing Data In To Collection Resources?

  • Jim Purbrick
    Message 1 of 5 , Nov 10, 2011
      We're building what will hopefully be a fairly RESTful API with
      hyperlinked resources and collection resources representing one to
      many relationships.

      The collection resources always contain canonical URIs for each
      element, but what else?

      Often authors of client apps would also like names, so the members of
      the collection can be presented in a list to the user and selected
      from. Then portraits of the members are requested, then arbitrary data
      from the full representation is requested in the collection so that N
      hundred HTTP requests don't have to be made for a particular feature.
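
A minimal sketch of the trade-off described above, assuming an in-memory store in place of real HTTP resources. The member URIs, the `name` and `portrait` fields, and all function names are hypothetical and only illustrate how the request count changes as fields are copied into the collection.

```python
# Hypothetical data: the URIs and fields below are invented for illustration.
MEMBERS = {
    "/members/1": {"name": "Alice", "portrait": "/members/1/portrait"},
    "/members/2": {"name": "Bob", "portrait": "/members/2/portrait"},
}


def links_only_collection():
    """Collection carrying only the canonical URI of each member."""
    return {"members": [{"href": uri} for uri in MEMBERS]}


def denormalized_collection(fields):
    """Collection that also copies the selected fields from each member."""
    return {
        "members": [
            {"href": uri, **{f: rep[f] for f in fields}}
            for uri, rep in MEMBERS.items()
        ]
    }


def requests_needed(collection, wanted_fields):
    """Count the requests a client needs to show wanted_fields for every member."""
    extra = sum(
        1
        for member in collection["members"]
        if any(f not in member for f in wanted_fields)
    )
    return 1 + extra  # one request for the collection itself
```

With N members, the links-only form costs 1 + N requests as soon as the client needs any field, while the denormalized form costs 1 request only as long as every wanted field has been copied in.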

      Is there any good research trading off the chattiness, cacheability,
      latency and redundancy of denormalized data in RESTful collection
      resources? Are there good rules of thumb to apply here? It's tempting
      to say do the N hundred HTTP requests and come back when you can show
      it's a problem, but that doesn't go down well...

      Thanks,

      Jim
    • Jan Algermissen
      Message 2 of 5 , Nov 10, 2011
        Hi Jim,


        I think your use case description is a little, umm, dense.

        Can you illustrate what you are doing with example interactions?

        Jan


      • Mike Kelly
        Message 3 of 5 , Nov 10, 2011
          A media type like HAL[1] is designed for linking to and embedding
          resources via hypertext. It doesn't force you to model everything as a
          collection but you can definitely use it for that purpose.
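
A rough sketch of how a collection might be expressed with HAL's reserved `_links` and `_embedded` properties, either linking to members or embedding partial copies of them. The `item` relation name, the member fields, and the `hal_collection` helper are assumptions made for illustration, not something taken from this thread.

```python
import json


def hal_collection(self_href, members, embed=False):
    """Build a HAL-style dict that either links to or embeds its members.

    The "item" relation and the member fields are illustrative assumptions.
    """
    doc = {"_links": {"self": {"href": self_href}}}
    if embed:
        # Embedded members carry their own self link plus denormalized data.
        doc["_embedded"] = {
            "item": [
                {"_links": {"self": {"href": m["href"]}}, "name": m["name"]}
                for m in members
            ]
        }
    else:
        # Link-only form: the client follows each link to get the details.
        doc["_links"]["item"] = [{"href": m["href"]} for m in members]
    return doc


members = [
    {"href": "/members/1", "name": "Alice"},
    {"href": "/members/2", "name": "Bob"},
]
print(json.dumps(hal_collection("/members", members, embed=True), indent=2))
```

Either form keeps the canonical URI of each member, so a client can still fetch the full representation when the embedded copy is not enough.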

          Cacheability will always take a hit when you introduce composite
          resources because their volatility is likely to be higher. You can
          mitigate these effects (at least for reverse proxy caches on the
          server side) via mechanisms like Linked Cache Invalidation[2] and Edge
          Side Includes[3].

          Cheers,
          Mike

          [1] http://stateless.co/hal_specification.html
          [2] http://tools.ietf.org/html/draft-nottingham-linked-cache-inv-00
          [3] http://en.wikipedia.org/wiki/Edge_Side_Includes


        • mike amundsen
          Message 4 of 5 , Nov 10, 2011
            Jim:

            I don't have any research results (interesting area....), but will
            pass along my own experience and personal preferences in case they
            give you some helpful ideas.

            In cases where some type of "composite" view is needed by clients, I
            prefer to do this work on the server and present a single "resource"
            to clients (long lists would support paging, filtering, etc.). By
            doing the "mashup" on the server, there are more opportunities to
            optimize the experience in the future (the server can change storage
            models, object models, re-arrange code, move operations to other
            servers, etc. all w/o adversely affecting the client).

            Also, by setting up an expectation that clients will "get what they
            need" in a single call, you can lead server implementations down the
            path of publicizing a resource model that reflects the actual
            domain-specific needs of the client-server interaction instead of
            publicizing a resource model based on the server-side data storage or
            coding object models. This does a better job of separating concerns,
            too.

            Finally, since the HTTP protocol has a rich set of caching controls,
            much of the "cost" of chunky messages (and the effort to compose them
            on the server) can be mitigated with cache directives sent along with
            the response. Even composite resources that experience heavy editing
            will do well in this cache/chunky environment if ETags are added
            alongside the cache controls.
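
As a sketch of the caching point above, the following shows a composite representation served with a `Cache-Control` directive and an `ETag`, answering a matching `If-None-Match` with 304. The `make_etag` and `conditional_get` helpers, the hashing scheme, and the 60-second `max-age` are assumptions for illustration; a real server would also need to handle lists of validators and weak ETags.

```python
import hashlib
import json


def make_etag(representation):
    """Derive a strong validator from the serialized representation."""
    body = json.dumps(representation, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest() + '"'


def conditional_get(representation, if_none_match=None, max_age=60):
    """Return (status, headers, body) for a GET on a composite resource.

    Simplified: compares a single If-None-Match value for exact equality.
    """
    etag = make_etag(representation)
    headers = {"ETag": etag, "Cache-Control": f"max-age={max_age}"}
    if if_none_match == etag:
        return 304, headers, None  # client copy is still current
    return 200, headers, representation


collection = {"members": [{"href": "/members/1", "name": "Alice"}]}
status, headers, body = conditional_get(collection)
# A revalidation with the stored ETag avoids resending the chunky body.
status_again, _, body_again = conditional_get(collection, headers["ETag"])
```

When any embedded member changes, the ETag changes with it, so the revalidation falls back to a full 200 response.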

            On a related note, Jon Moore's 2010 presentation at Oredev[1] shows an
            approach that allows servers to implement resource messages that can
            be either chunky or chatty and allows clients to sort details out on
            the fly. A rather interesting approach since it allows implementations
            to safely "experiment" w/ optimizing the interactions "in real time."

            Hope this helps.

            MCA

            [1] http://oredev.org/2010/sessions/hypermedia-apis
            mca
            http://amundsen.com/blog/
            http://twitter.com@mamund
            http://mamund.com/foaf.rdf#me





          • Erik Wilde
            Message 5 of 5 , Nov 10, 2011
              hello

              On 2011-11-10 6:09 , mike amundsen wrote:
              > In cases where some type of "composite" view is needed by clients, I
              > prefer to do this work on the server and present a single "resource"
              > to clients (long lists would support paging, filtering, etc.). By
              > doing the "mashup" on the server, there are more opportunities to
              > optimize the experience in the future (the server can change storage
              > models, object models, re-arrange code, move operations to other
              > servers, etc. all w/o adversely affecting the client).

              you're talking about feeds here, or their logical equivalent, including
              the options for embedding or referencing entries, right? you can feed
              those lists; you know the work we've been doing on feeds as query result
              serializations. or am i missing something that you are doing that does
              not fit this pattern?

              > Also, by setting up an expectation that clients will "get what they
              > need" in a single call, you can lead server implementations down the
              > path of publicizing a resource model that reflects the actual
              > domain-specific needs of the client-server interaction instead of
              > publicizing a resource model based on the server-side data storage or
              > coding object models. This does a better job of separating concerns,
              > too.

              so, another question i've been pondering is the following: it's clear to
              me that we must have collection resources that provide aggregate views
              of potentially included resources. but how much control should we give
              the client over what we return? that impacts cacheability, but for us is
              pretty much the only way we can make certain scenarios work. could we
              add query parameters so that clients can control what to include in the
              collection resource, or can we allow clients to configure "view"
              resources which define these aspects and are then referred to in
              requests? these view resources could control things like the following
              aspects:

              - collection paging, let's say 20 per page.
              - inlining or linking entry resources
              - included attributes of the entries (we have many, many attributes per
              resource and most clients only need very few of them)
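
A minimal sketch of how the three view aspects listed above could be applied to a collection, whether they arrive as query parameters or from a stored "view" resource. The `apply_view` function, its parameter names, and the entry fields are hypothetical and not part of any existing feed work mentioned here.

```python
def apply_view(entries, page=1, page_size=20, inline=False, attributes=None):
    """Render one page of a collection according to a client-chosen view.

    entries:    list of dicts, each with an "href" plus arbitrary attributes
    inline:     False returns links only, True copies attributes into each item
    attributes: attribute names to include when inlining (None means all)
    """
    start = (page - 1) * page_size
    selected = entries[start:start + page_size]
    result = []
    for entry in selected:
        item = {"href": entry["href"]}
        if inline:
            if attributes is None:
                keys = [k for k in entry if k != "href"]
            else:
                keys = attributes
            for key in keys:
                if key in entry:
                    item[key] = entry[key]
        result.append(item)
    return result


entries = [
    {"href": f"/entries/{i}", "title": f"Entry {i}", "size": i * 10}
    for i in range(1, 46)
]
page_three = apply_view(entries, page=3, inline=True, attributes=["title"])
```

Each distinct combination of view settings yields a distinct response, which is where the cacheability cost mentioned above comes from.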

              so what i am wondering about is whether our "feed queries" work could be
              augmented with some "feed views" work. i think i would lean towards the
              model where the feed view configuration would be a self-describing
              resource itself, but generally speaking, i am wondering whether this is
              the model you have in mind or have already implemented.

              cheers,

              dret.

              --
              erik wilde | mailto:dret@... - tel:+1-510-2061079 |
              | UC Berkeley - School of Information (ISchool) |
              | http://dret.net/netdret http://twitter.com/dret |