Loading ...
Sorry, an error occurred while loading the content.

Re: [json] Sorting the keys of formatted (pretty printed) JSON output

Expand Messages
  • Tatu Saloranta
    ... I understand this, and it s a valid point. But I was thinking more along the lines of deferring initialization of the Map, and switching between the two
    Message 1 of 7 , Apr 10, 2008
    • 0 Attachment
      On Thu, Apr 10, 2008 at 5:13 AM, Arthur Blake <arthur.blake@...> wrote:
      >
      > On Thu, Apr 10, 2008 at 12:29 AM, Tatu Saloranta <tsaloranta@...>
      > wrote:
      > >
      > > One question: wouldn't it be easier to just use TreeMap for storing
      > > fields, instead of HashMap, if sorting is desired?
      > >
      > > -+ Tatu +-
      >
      > That was my first thought when thinking about how to do it, as well.
      >
      > That would make sense if you wanted to always keep and use the JSON
      > sorted. In one of my main use cases (JSON-RPC debug output on the
      ...
      > There would be a penalty for keeping a TreeMap versus a HashMap.
      > HashMap obviously performs better (constant time put & get) for when

      I understand this, and it's a valid point. But I was thinking more
      along the lines of deferring initialization of the Map, and switching
      between the two if/as necessary. That is, it'd be a property of Json
      object, to minimize sorting overhead for cases where it's not needed.

      At any rate, I just wanted to mention it as an option, just to make
      sure it was considered. And that seems to have been the case.

      -+ Tatu +-
    • John Cowan
      ... Actually, it isn t. Even though the javadoc claims that get and put are constant time, a moment s reflection shows that this cannot be so. HashMap is
      Message 2 of 7 , Apr 10, 2008
      • 0 Attachment
        Tatu Saloranta scripsit:

        > > There would be a penalty for keeping a TreeMap versus a HashMap.
        > > HashMap obviously performs better (constant time put & get) for when
        >
        > I understand this, and it's a valid point.

        Actually, it isn't. Even though the javadoc claims that get and put
        are constant time, a moment's reflection shows that this cannot be so.

        HashMap is implemented (in OpenJDK, and almost certainly in earlier
        versions) using an array of specialized objects that implement Map.Entry.
        Each element of the array is one hash bucket, and the Entry objects form
        a linked list. So average get/put time is approximately N/c*2, where c
        is the capacity (initially 16 unless specified otherwise), which is O(N).
        The hope is that N/c*2 is always small enough that the performance looks
        like O(1), but as N grows, c is increased using a full rehash, which is
        again O(N) and is independent of c.

        So I'd say using TreeMap throughout, with its guaranteed O(log N)
        performance, is a perfectly reasonable thing to do; it makes debugging
        easier, it is prettier, and it guarantees getting the same output on
        the same input.

        --
        On the Semantic Web, it's too hard to prove John Cowan cowan@...
        you're not a dog. --Bill de hOra http://www.ccil.org/~cowan
      • kriszyp
        ... when ... Map.Entry. ... O(N). ... Not true. Full rehashes are indeed O(N), but they take place at a frequency proportional to 1/c because each time a
        Message 3 of 7 , Apr 10, 2008
        • 0 Attachment
          --- In json@yahoogroups.com, John Cowan <cowan@...> wrote:
          >
          > Tatu Saloranta scripsit:
          >
          > > > There would be a penalty for keeping a TreeMap versus a HashMap.
          > > > HashMap obviously performs better (constant time put & get) for
          when
          > >
          > > I understand this, and it's a valid point.
          >
          > Actually, it isn't. Even though the javadoc claims that get and put
          > are constant time, a moment's reflection shows that this cannot be so.
          >
          > HashMap is implemented (in OpenJDK, and almost certainly in earlier
          > versions) using an array of specialized objects that implement
          Map.Entry.
          > Each element of the array is one hash bucket, and the Entry objects form
          > a linked list. So average get/put time is approximately N/c*2, where c
          > is the capacity (initially 16 unless specified otherwise), which is
          O(N).
          > The hope is that N/c*2 is always small enough that the performance looks
          > like O(1), but as N grows, c is increased using a full rehash, which is
          > again O(N) and is independent of c.

          Not true. Full rehashes are indeed O(N), but they take place at a
          frequency proportional to 1/c because each time a rehash takes place
          the size is doubled, so the rehashes become farther apart the larger c
          gets. Therefore the average rehash time for a put operation is
          O(N*1/c) which averages to constant time. Both get and put only have
          to iterate through their own bucket, so as long as the hash method is
          reasonable, both get and put are indeed O(1), constant time operations.
          Kris
        • Douglas Crockford
          ... Excellent suggestion. I have modified JSON.Object.
          Message 4 of 7 , Apr 10, 2008
          • 0 Attachment
            --- In json@yahoogroups.com, "Arthur Blake" <arthur.blake@...> wrote:
            > When I have machine generated JSON data that is stored in files, I
            > like to have the JSON data be sorted alphabetically by the keys in the
            > output.
            >
            > It's much easier to find what you are looking for quickly this way
            > when viewing or editing the data (especially when the amount of data
            > is large.)
            >
            > Because there is no defined order of the keys in JSON, it doesn't
            > matter what order you put them in. So we might as well put it in an
            > order that makes it more human readable in contexts where the data is
            > meant to be viewed or edited by humans.

            Excellent suggestion. I have modified JSON.Object.
          Your message has been successfully submitted and would be delivered to recipients shortly.