Re: Universal Binary JSON Specification

  • rkalla123
    Message 1 of 76 , Sep 24, 2011
      Patrick,

      The reasons I specified UTF-8 were (1) to be consistent with the rest of the spec (e.g., you don't have to change your thinking just because you are parsing a specific construct type) and (2) UTF-8 stores ASCII characters as 1-byte-per-char, so we aren't wasting any more space than ASCII encoding would in this case (as you pointed out, the required chars are all < 127).
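
A quick way to check point (2) is the sketch below. It assumes the 15 numeric characters listed in the quoted message; the names `NUMERIC_CHARS` and `utf8_size` are made up for illustration and are not part of any spec.

```python
# The 15 characters a numeric literal needs, per the quoted message.
# NUMERIC_CHARS and utf8_size are hypothetical names for this sketch.
NUMERIC_CHARS = "0123456789+-eE."


def utf8_size(text: str) -> int:
    """Return how many bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


sample = "-1.5e3"

# Every required character is below code point 128, so UTF-8 and ASCII
# produce identical bytes: one byte per character.
all_ascii = all(ord(ch) < 128 for ch in NUMERIC_CHARS)
same_bytes = sample.encode("utf-8") == sample.encode("ascii")
```

For this input, `utf8_size(sample)` equals `len(sample)`, so UTF-8 costs nothing extra over ASCII for numbers.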

      As for the encoding you mentioned, you are absolutely right: something custom would be much more compact, but that goes against the first rule of fig... binary spec :) (simplicity).

      I am really trying to nail that middle ground right between "this binary spec is so complex, but optimized, that I question my own existence" and "this text format is so slow to convert that I want to take up crab fishing".

      -R

      --- In json@yahoogroups.com, Patrick Maupin <pmaupin@...> wrote:
      > I like the idea of "encoded generic number". But one of the purposes
      > of a binary encoding is space savings, so I don't think unicode fits
      > into that framework.
      >
      > If I understand correctly, you only need to encode the 10 digits, and
      > "+", "-", "e", "E", and "."
      >
      > That's only 15 items, and 4 bits can represent 16 distinct values.
      > Leaves you one item left over for an end marker.
      >
      > You could easily pack two characters into every byte, and could define
      > 0 to be the end marker, and define that if there are an odd number of
      > characters, then the number will be followed by 3 nybbles of 0, and if
      > there is an even number of characters, then the number will be
      > followed by 2 nybbles of 0. In that case, the end of number marker
      > will always be a byte of zero, which you can always scan for with any
      > C library.
      >
      > Regards,
      > Pat
      >
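
The packing scheme Pat describes above could look roughly like the sketch below. The message does not say which 4-bit code goes with which character, nor which nybble of a byte comes first, so the code table (`SYMBOLS`) and the high-nybble-first order are assumptions, as are the function names. Only the "0 is the end marker" rule and the 2-or-3 nybble padding come from the message.

```python
# Hypothetical code table: 0 is reserved as the end marker, so the 15
# symbols take codes 1..15. The ordering is an assumption for illustration.
SYMBOLS = "0123456789+-eE."
ENCODE = {ch: i + 1 for i, ch in enumerate(SYMBOLS)}
DECODE = {code: ch for ch, code in ENCODE.items()}


def pack_number(text: str) -> bytes:
    """Pack a numeric string two symbols per byte, ending with a zero byte."""
    nibbles = [ENCODE[ch] for ch in text]  # KeyError on unsupported chars
    # Odd length: append 3 zero nybbles; even length: append 2.
    # Either way the total is even and the last byte is 0x00.
    nibbles += [0] * (3 if len(nibbles) % 2 else 2)
    # Assumed layout: first symbol of each pair in the high nybble.
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


def unpack_number(data: bytes) -> str:
    """Decode up to the first zero byte, as a C-style scan would."""
    end = data.index(0)  # symbol codes are never 0, so this is the terminator
    chars = []
    for byte in data[:end]:
        for code in (byte >> 4, byte & 0x0F):
            if code:  # a zero low nybble is padding after an odd-length number
                chars.append(DECODE[code])
    return "".join(chars)
```

With this table, `pack_number("-1.5e3")` takes 4 bytes including the terminator, against 6 bytes for the same text in UTF-8 (before any length prefix or terminator). The saving is real, but it comes at the cost of a second encoding that parsers have to special-case, which is the simplicity trade-off discussed in the reply.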
    • Tatu Saloranta
      Message 76 of 76 , Feb 20, 2012
        On Mon, Feb 20, 2012 at 9:42 AM, rkalla123 <rkalla@...> wrote:
        > Stephan,
        >
        > No problem; your feedback is still very applicable and much appreciated.
        >
        > The additional view-point on the signed/unsigned issue was exactly what I was hoping for. My primary goal has always been simplicity and I know at least from the Java world, going with unsigned values would have made the impl distinctly *not* simple (and an annoying API).
        >
        > So I am glad to get some validation there that I am not alienating every other language at the cost of Java.

        For what it is worth, I also consider support for only signed values a
        good thing.

        -+ Tatu +-