Re: [json] Re: JSON -- rfc question -- Encoding

  • Michael Schwarz
    Message 1 of 8 , Apr 13, 2007
      I have one more question, do I need to convert unicode characters to
      something like "\u12345"?

      Michael


      On 13 Apr 2007 15:47:13 -0700, Douglas Crockford <douglas@...>
      wrote:
      >
      > --- In json@yahoogroups.com, "json_is_clever"
      > <vendor@...> wrote:
      >
      > > The implementer claims that because Unicode is not a binary encoding,
      > > but a set of codepoints, that this sentence means that Unicode
      > > codepoints should be used in some manner to represent characters, but
      > > that the bitstream that represents the JSON text could use escapes for
      > > most of the characters (reducing the size of the character
      > > repertoire), and then be encoded in, say, EBCDIC. While it is
      > > possible to conceive of doing such a thing, I question if his
      > > interpretation is valid.
      >
      > No, that is a wild misreading of the RFC. The JSON text must be
      > represented in Unicode, and the preferred encoding is UTF-8.
      >
      > > 1) Must a conformant JSON parser recognize all legal encodings? Or
      > > can a parser be conformant with documented restrictions on what
      > > encodings it can accept?
      >
      > Parties can agree on what is acceptable and meaningful. For example,
      > it is reasonable for a receiver to put limits on message length or
      > string length or nesting depth.
      >
      > > 2) Is a conformant JSON generator allowed to have options to generate
      > > text that is not quite JSON conformant? Or must it be completely
      > > limited to producing legal JSON text only?
      >
      > A JSON generator may only produce valid JSON text.
      >
      >
      >


    • Mark Miller
      Message 2 of 8 , Apr 13, 2007
        Douglas Crockford wrote:
        >> 1) Must a conformant JSON parser recognize all legal encodings? Or
        >> can a parser be conformant with documented restrictions on what
        >> encodings it can accept?
        >
        > Parties can agree on what is acceptable and meaningful. For example,
        > it is reasonable for a receiver to put limits on message length or
        > string length or nesting depth.

        For example, JSON in E-0.9 accepts only Unicode characters from the "basic
        multilingual plane", i.e., characters whose code point fits in 16 bits. As I
        read the JSON spec, this is an allowable restriction. This restriction is
        needed since E-0.9 does not support Unicode supplementary characters, i.e.,
        characters whose code points are >= 2**16.

        --
        Text by me above is hereby placed in the public domain

        Cheers,
        --MarkM
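To make the restriction above concrete, here is a minimal sketch of a receiver that accepts only basic multilingual plane characters. It is not how E-0.9 implements the check; the function names `contains_supplementary` and `loads_bmp_only` are made up for illustration, and Python's standard `json` module stands in for the parser.

```python
import json


def contains_supplementary(text: str) -> bool:
    """Return True if any character has a code point >= 2**16."""
    return any(ord(ch) >= 2**16 for ch in text)


def _check(value) -> None:
    """Walk a decoded JSON value and reject strings outside the BMP."""
    if isinstance(value, str):
        if contains_supplementary(value):
            raise ValueError("supplementary characters are not supported")
    elif isinstance(value, list):
        for item in value:
            _check(item)
    elif isinstance(value, dict):
        for key, item in value.items():
            _check(key)
            _check(item)


def loads_bmp_only(document: str):
    """Hypothetical helper: parse JSON, then enforce the documented BMP-only limit."""
    value = json.loads(document)
    _check(value)
    return value
```

Note that a supplementary character written as an escaped surrogate pair (for example `\ud83d\ude00`) is combined into a single code point by Python's decoder, so this sketch rejects it as well.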
      • json_is_clever
        Message 3 of 8 , Apr 14, 2007
          --- In json@yahoogroups.com, "Douglas Crockford" <douglas@...> wrote:

          > No, that is a wild misreading of the RFC. The JSON text must be
          > represented in Unicode, and the preferred encoding is UTF-8.

          Thanks for the clarification, Douglas.
        • Douglas Crockford
          Message 4 of 8 , Apr 14, 2007
            --- In json@yahoogroups.com, "Michael Schwarz" <michael.schwarz@...>
            wrote:
            >
            > I have one more question, do I need to convert unicode characters to
            > something like "\u12345"?

            No. The \u notation is only required for some of the control characters.
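A quick way to see this is to serialize the same string both ways. The sketch below uses Python's standard `json` module as a stand-in (the thread does not name a particular library); the sample string is just an example. Both outputs are valid JSON and decode to the same value. It also shows that `\u` takes exactly four hex digits, so `"\u12345"` is read as U+1234 followed by the character `5`.

```python
import json

greeting = "Schöne Grüße"

# Non-ASCII characters written literally; the text is then encoded as UTF-8.
literal = json.dumps(greeting, ensure_ascii=False)
utf8_bytes = literal.encode("utf-8")

# The same string with every non-ASCII character escaped (optional, not required).
escaped = json.dumps(greeting, ensure_ascii=True)

print(literal)   # "Schöne Grüße"
print(escaped)   # "Sch\u00f6ne Gr\u00fc\u00dfe"

# Both forms decode to the identical string.
round_trip_literal = json.loads(utf8_bytes.decode("utf-8"))
round_trip_escaped = json.loads(escaped)

# \u is followed by exactly four hex digits: this is U+1234 and then "5".
five_digits = json.loads('"\\u12345"')
```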
          • Michael Schwarz
            Message 5 of 8 , Apr 14, 2007
              Hi,

              currently I have only something like \r, \n or \t, are those chars
              allowed? Which control chars are you talking about? Thanks a lot.

              Michael



              On 14 Apr 2007 06:45:58 -0700, Douglas Crockford <douglas@...> wrote:
              > --- In json@yahoogroups.com, "Michael Schwarz" <michael.schwarz@...>
              > wrote:
              > >
              > > I have one more question, do I need to convert unicode characters to
              > > something like "\u12345"?
              >
              > No. The \u notation is only required for some of the control characters.
              >
              >


              --
              Best regards | Schöne Grüße
              Michael

              Microsoft MVP - Most Valuable Professional
              Microsoft MCAD - Certified Application Developer

              http://weblogs.asp.net/mschwarz/
              http://www.ajaxpro.info/

              WPF/E: http://groups.google.com/group/wpf-everywhere

              Skype: callto:schwarz-interactive
              MSN IM: passport@...
            • Douglas Crockford
              Message 6 of 8 , Apr 14, 2007
                --- In json@yahoogroups.com, "Michael Schwarz" <michael.schwarz@...>
                wrote:

                > currently I have only something like \r, \n or \t, are those chars
                > allowed? Which control chars are you talking about? Thanks a lot.

                None of the control characters can appear in JSON strings. You can use
                the \u convention to represent them. A few of them, such as linefeed
                and tab, have more compact representations such as \n and \t. Take
                your pick.
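The rule above can be sketched as a small string escaper. This is an illustrative sketch, not code from any particular JSON library: `escape_json_string` and `SHORT_ESCAPES` are hypothetical names, and "control character" is taken to mean U+0000 through U+001F, as in the RFC. Short forms are used where they exist and `\u00XX` otherwise; everything else, including non-ASCII text, passes through unchanged.

```python
import json

# Characters with a compact two-character escape.
SHORT_ESCAPES = {
    '"': '\\"',
    "\\": "\\\\",
    "\b": "\\b",
    "\f": "\\f",
    "\n": "\\n",
    "\r": "\\r",
    "\t": "\\t",
}


def escape_json_string(text: str) -> str:
    """Hypothetical helper: return text as a quoted JSON string literal."""
    parts = []
    for ch in text:
        if ch in SHORT_ESCAPES:
            parts.append(SHORT_ESCAPES[ch])
        elif ord(ch) < 0x20:
            # Remaining control characters have no short form.
            parts.append(f"\\u{ord(ch):04x}")
        else:
            parts.append(ch)
    return '"' + "".join(parts) + '"'


def raw_tab_is_rejected() -> bool:
    """Check that a raw (unescaped) tab inside a string fails to parse."""
    try:
        json.loads('"a\tb"')  # the Python literal contains an actual tab
    except json.JSONDecodeError:
        return True
    return False
```

Either spelling of a tab, `\t` or `\u0009`, decodes to the same character, so the choice between them only affects the size of the text.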