Re: [json] Stoppable SAX-like interface for streaming input of JSON text

  • Tatu Saloranta
    Message 1 of 12, Feb 6, 2009
      On Thu, Feb 5, 2009 at 4:52 PM, Fang Yidong <fangyidong@...> wrote:
      >
      > On Wed, Feb 4, 2009 at 11:28 PM, Fang Yidong <fangyidong@yahoo.com.cn> wrote:
      >
      >> > Well, if I am right, the parser in your example is essentially a lexer, with slightly higher abstraction.

      >> Not really. You may want to read a bit on Stax API for Java: I assume
      >
      > I did go through JSR 173 before comparing. I mean, the StAX parser in your example is similar to a lexer such as org.json.simple.parser.Yylex:

      Ah ok. Maybe I misread your comment: if you meant it's lower level of
      abstraction than a tree model, yes.
      I meant to say that SAX(-like) API is at similar level of abstraction
      as Stax(-like).
      ...
      > I mean users may choose among DOM-like, SAX-like and StAX-like parsers.

      Ok yes, that makes sense.

      (although DOM is technically not a parser but a tree model built on
      top of that, but many users still call it a parser)

      >> > stoppable SAX-like interface provides a new option to the user.
      >
      >> As to stoppability of push parsing, the usual method so far has been
      >> to throw an exception from the event handler. Would that not have
      >> worked?
      >
      > Not really. Here stoppable also means it's resumable. That is, the user can pause at a point, do other work, and then resume parsing or stop. Please refer to the example for details:

      Ok. That makes more sense then, thank you for pointing this out.
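
To illustrate the stop-versus-resume distinction discussed above, here is a minimal sketch. It is not JSON.simple's actual API: `PausableHandler`, `PausablePushParser` and `FindIdHandler` are hypothetical names, and a pre-tokenized list of event strings stands in for a real JSON lexer. The handler returns false to pause; because the parse position lives in the parser object rather than in an unwound call stack, calling `parse` again continues from the next event, which a thrown exception alone would not give you.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical callback: returning false asks the parser to pause.
interface PausableHandler {
    boolean onEvent(String event);
}

// Hypothetical push parser over pre-tokenized events (a real one would lex JSON text).
class PausablePushParser {
    private final Iterator<String> events; // parse position lives here, on the heap

    PausablePushParser(List<String> events) {
        this.events = events.iterator();
    }

    // Pushes events until the handler returns false or the input ends.
    // Returns true when the input is exhausted, false when paused.
    boolean parse(PausableHandler handler) {
        while (events.hasNext()) {
            if (!handler.onEvent(events.next())) {
                return false;
            }
        }
        return true;
    }
}

// Pauses right after the value of the "id" field has been seen.
class FindIdHandler implements PausableHandler {
    String idValue;
    private boolean expectId;

    @Override
    public boolean onEvent(String event) {
        if (event.equals("key:id")) {
            expectId = true;
            return true;
        }
        if (expectId && event.startsWith("value:")) {
            idValue = event.substring("value:".length());
            expectId = false;
            return false; // pause here; the caller may resume later
        }
        return true;
    }
}

class PausableDemo {
    public static void main(String[] args) {
        PausablePushParser parser = new PausablePushParser(Arrays.asList(
                "startObject", "key:id", "value:123", "key:name", "value:x", "endObject"));
        FindIdHandler handler = new FindIdHandler();
        boolean finished = parser.parse(handler); // pauses after "value:123"
        System.out.println("id=" + handler.idValue + ", finished=" + finished);
        finished = parser.parse(handler);         // resumes with "key:name"
        System.out.println("finished=" + finished);
    }
}
```

Between the two `parse` calls the caller is free to do unrelated work, or simply never call `parse` again, which amounts to stopping.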

      ...
      > Actually, JSR 173 (jsr173_07.pdf) argues it has advantages over SAX because:
      >
      > One drawback to the SAX API is that the programmer must keep track of the
      ...
      > iteratively process it. Another drawback to SAX is that the entire document needs to be
      > parsed at one time.
      >
      > The first drawback is partly true but in complex scenarios, a user with StAX parser
      > may also need to keep track of states such as nesting levels, parent-child
      > relationships and so on.

      Maybe, but not necessarily, because this information is implicit
      within the call stack (except for having to track end markers).
      That is, it's a recursive-descent kind of approach where you know
      where you came from, usually without additional tracking of location.
      Code branches based on the constructs encountered.
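
As an illustration of location being implicit in the call stack, here is a minimal recursive-descent sketch over a pull-style token stream. The `Token` and `PullWalker` types are hypothetical, and a pre-built token list stands in for a real pull parser. The current path is just a method argument, so each recursive call knows where it came from without a separate state object; the only bookkeeping is watching for end markers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical token type; a real pull parser would produce these from JSON text.
class Token {
    enum Type { START_OBJECT, END_OBJECT, START_ARRAY, END_ARRAY, FIELD_NAME, VALUE }

    final Type type;
    final String text;

    Token(Type type, String text) {
        this.type = type;
        this.text = text;
    }
}

class PullWalker {
    // Reads one complete value beginning at 'first'. The 'path' argument lives
    // in the call stack, so no explicit location tracking is needed.
    static void readValue(Token first, Iterator<Token> in, String path, List<String> out) {
        switch (first.type) {
            case START_OBJECT: {
                // Each iteration sees a FIELD_NAME token; its value follows.
                for (Token t = in.next(); t.type != Token.Type.END_OBJECT; t = in.next()) {
                    readValue(in.next(), in, path + "/" + t.text, out);
                }
                break;
            }
            case START_ARRAY: {
                int index = 0; // zero-based, unlike XPath positions
                for (Token t = in.next(); t.type != Token.Type.END_ARRAY; t = in.next()) {
                    readValue(t, in, path + "[" + index + "]", out);
                    index++;
                }
                break;
            }
            case VALUE:
                out.add(path + "=" + first.text);
                break;
            default:
                throw new IllegalStateException("unexpected token " + first.type);
        }
    }

    // Tokens for {"store":{"book":[{"title":"A"},{"title":"B"}]}}
    static List<String> demo() {
        Token so = new Token(Token.Type.START_OBJECT, null);
        Token eo = new Token(Token.Type.END_OBJECT, null);
        Token sa = new Token(Token.Type.START_ARRAY, null);
        Token ea = new Token(Token.Type.END_ARRAY, null);
        List<Token> tokens = Arrays.asList(
                so, new Token(Token.Type.FIELD_NAME, "store"),
                so, new Token(Token.Type.FIELD_NAME, "book"), sa,
                so, new Token(Token.Type.FIELD_NAME, "title"), new Token(Token.Type.VALUE, "A"), eo,
                so, new Token(Token.Type.FIELD_NAME, "title"), new Token(Token.Type.VALUE, "B"), eo,
                ea, eo, eo);
        Iterator<Token> in = tokens.iterator();
        List<String> out = new ArrayList<>();
        readValue(in.next(), in, "", out);
        return out;
    }
}
```

The trade-off raised later in the thread applies here: recursion depth equals document nesting depth, so very deeply nested input would call for an explicit stack instead.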

      > The purpose of JSON.simple's stoppable SAX-like interface is to help relieve such
      > issues.

      Ok.

      -+ Tatu +-
    • Fang Yidong
      Message 2 of 12, Feb 6, 2009
        > > The first drawback is partly true but in complex scenarios, a user with StAX parser
        > > may also need to keep track of states such as nesting levels, parent-child
        > > relationships and so on.
        >
        > Maybe, but not necessarily, because this information is implicit
        > within call stack (except for having to track end markers).
        > That is, it's a recursive-descent kind of approach where you know
        > where you came from, usually without additional tracking of location.
        > Code branches based on constructs encountered.

        Yes, it's convenient. But I think it may result in a call-stack-based processor instead of a heap-based one, right? The former will cause stack overflow issues at deep nesting levels. Here's a heap-based processor for building an object graph with the SAX-like interface:

        http://code.google.com/p/json-simple/wiki/DecodingExamples#Example_6_-_Build_whole_object_graph_on_top_of_SAX-like_content
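
For readers without access to the linked example, the general idea of a heap-based builder can be sketched as follows. This is not the code from the wiki page: `GraphBuilder` and its callback names are hypothetical, loosely modelled on SAX-style events. Open containers are kept on an explicit `Deque` on the heap, so the Java call stack stays flat however deeply the input is nested.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical SAX-like event sink that builds a Map/List/primitive graph.
class GraphBuilder {
    private final Deque<Object> containers = new ArrayDeque<>(); // open Maps and Lists
    private final Deque<String> keys = new ArrayDeque<>();       // keys awaiting a value
    private Object result;

    void startObject() { containers.push(new LinkedHashMap<String, Object>()); }

    void startArray() { containers.push(new ArrayList<Object>()); }

    void key(String name) { keys.push(name); }

    void primitive(Object value) { attach(value); }

    // A finished container becomes a value of its parent (or the final result).
    void endObject() { attach(containers.pop()); }

    void endArray() { attach(containers.pop()); }

    @SuppressWarnings("unchecked")
    private void attach(Object value) {
        Object parent = containers.peek();
        if (parent == null) {
            result = value; // top-level value
        } else if (parent instanceof Map) {
            ((Map<String, Object>) parent).put(keys.pop(), value);
        } else {
            ((List<Object>) parent).add(value);
        }
    }

    Object result() { return result; }
}

class GraphBuilderDemo {
    public static void main(String[] args) {
        // Events for {"a":[1,2],"b":"x"}
        GraphBuilder b = new GraphBuilder();
        b.startObject();
        b.key("a");
        b.startArray();
        b.primitive(1L);
        b.primitive(2L);
        b.endArray();
        b.key("b");
        b.primitive("x");
        b.endObject();
        System.out.println(b.result());
    }
}
```

Each callback does a constant amount of stack work, so nesting depth is bounded by heap size rather than thread stack size. Note that recursive operations on the result, such as `toString()`, can still overflow on extremely deep graphs.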



      • Mark Joseph
        Message 3 of 12, Feb 7, 2009
          I was reading over the StAX specification and BEA provides
          licenses to the API, but that license prevents
          sublicenses. This means I as a vendor cannot provide my
          own implementation and license that to customers. So if
          I am reading that right, what is the point of that
          standard?
          We at P6R provide JSON and XML tools (among others), but
          if the standard has restrictions on it then it's not a real
          standard that we can use.

          Mark
          P6R, Inc


          On Thu, 5 Feb 2009 15:28:41 +0800 (CST)
          Fang Yidong <fangyidong@...> wrote:
          > Well, if I am right, the parser in your example is
          > essentially a lexer, with slightly higher abstraction.
          >
          > It's true that it's convenient to control in simple
          > cases. But in a slightly more complex scenario, such as
          > retrieving data in some desired location (for example,
          > '/store/book[1]/title' in XPath expression), I don't
          > think the code using a SAX(-like) parser is much more
          > complex than using a StAX(-like) parser.
          >
          > Besides being easier to pipeline, a SAX(-like) parser requires
          > a smaller memory footprint and is faster, and the stoppable
          > SAX-like interface introduced by JSON.simple avoids the
          > drawback that a traditional SAX parser requires the
          > entire document to be parsed to get a simple piece of data.
          >
          > I think different applications require different
          > abstraction levels. JSON.simple's stoppable SAX-like
          > interface provides a new option to the user. It's your
          > choice of adopting it or not.
          > From: Tatu Saloranta <tsaloranta@...>
          > Subject: Re: [json] Stoppable SAX-like interface for streaming input of JSON text
          > To: json@yahoogroups.com
          > Date: Thursday, Feb 5, 2009, 1:22 AM
          >
          > On Tue, Feb 3, 2009 at 8:09 PM, Fang Yidong <fangyidong@yahoo.com.cn> wrote:
          >
          >> JSON.simple introduces a simplified and stoppable
          >> SAX-like content handler to process JSON text stream.
          >> Please take a look if you are interested in it:
          >
          > If you are interested in application code controlling parsing, why not
          > just use Stax(-like) pull interface? Code example given would be quite
          > a bit simpler with "pull" approach; essentially little more than
          > recursive descent, or with some interfaces, linear iteration like:
          >
          > ---
          > JsonParser jp = factory.createJsonParser(input);
          > JsonToken t;
          >
          > // Advance until the "id" field name is reached (or input ends).
          > while ((t = jp.nextToken()) != null) {
          >     if (t == JsonToken.FIELD_NAME && "id".equals(jp.getCurrentName())) {
          >         break;
          >     }
          > }
          > if (t != null) { // get value for the field
          >     t = jp.nextToken();
          >     System.out.println("found id, value: " + jp.getText());
          > }
          > ---
          >
          > And you could obviously build simpler abstractions for
          > matching on top of this.
          >
          > The main benefit of push-interface like SAX is that it is easier to
          > pipeline multiple processing stages. Otherwise it is rather cumbersome
          > and inconvenient way to process data that naturally comes in
          > well-defined and structured order.
          >
          > I am asking because oftentimes xml/json/whatever parser writers use
          > SAX-like approaches without knowing that it's only one way to slice and
          > dice data, and often not the best.
          >
          > -+ Tatu +-

          -------------------------
          Mark Joseph, Ph.D.
          President and Secretary
          P6R, Inc.
          http://www.p6r.com
          408-205-0361
          Fax: 831-476-7490
          Skype: markjoseph_sc
          IM: (Yahoo) mjoseph8888
          (AIM) mjoseph8888
        • Tatu Saloranta
          Message 4 of 12, Feb 7, 2009
            On Sat, Feb 7, 2009 at 4:25 PM, Mark Joseph <mark@...> wrote:
            > I was reading over the StAX specification and BEA provides
            > licenses to the API, but that license prevents
            > sublicenses. This means I as a vendor cannot provide my
            > own implementation and license that to customers. So if

            I don't see why you would need a license to implement an API.
            Generally, licensing governs usage of the API itself: distributing it, modifying it, etc.
            None of those are usually needed, because Stax is part of JDK 1.6.
            Or you can point users to download the API jar itself from whoever can provide it.

            Also: whatever the Stax specs download bundle claims is probably incorrect.

            But yes, clearly BEA screwed up the licensing mentions and other parts.

            > I am reading that right, what is the point of that
            > standard?
            > We at P6R provide JSON and XML tools (among others), but
            > if the standard has restrictions on it then it's not a real
            > standard that we can use.

            Just to be clear: Stax API itself has little to do with Json. It is a
            Java xml processing API, and would be of little help for Json. There's
            no point in trying to implement it, due to fundamental differences
            between xml and json data formats.

            But similar style ("pull parsing") is useful.

            -+ Tatu +-
          • Tatu Saloranta
            Message 5 of 12, Feb 7, 2009
              On Fri, Feb 6, 2009 at 5:33 PM, Fang Yidong <fangyidong@...> wrote:
              >
              >> Maybe, but not necessarily, because this information is implicit
              >> within call stack (except for having to track end markers).
              >
              >> That is, it's a recursive-descent kind of approach where you know
              >> where you came from, usually without additional tracking of location.
              >> Code branches based on constructs encountered.
              >
              > Yes, it's convenient. But I think it may result in a call stack based processor instead of a heap based
              > one, right? The former will cause stack overflow issues in a deep nesting level. Here's a heap

              Yes, if your document has a nesting level of about a million or so. :-D
              So I don't think that is a practical concern.

              If it happens to be, then one can construct an explicit stack, similar to
              how one has to do it with SAX-like interfaces.

              > based processor for building object graph with SAX-like interface:
              > http://code.google.com/p/json-simple/wiki/DecodingExamples#Example_6_-_Build_whole_object_graph_on_top_of_SAX-like_content

              Right: that builds "poor man's object binding", a List/Map/primitive
              structure from Json.
              Most Json parsers (json.org and others) offer that functionality via API,
              so it need not be built from low-level components.
              Code with a pull API would be quite similar, although one could choose
              between recursion and iteration with an explicit stack.

              -+ Tatu +-
            • Mark Joseph
              Message 6 of 12, Feb 7, 2009
                Ah, I am sorry I was not clear: we provide the JSON and XML
                tools to C++ users, not Java.
                http://www.p6r.com/articles/2008/05/22/a-sax-like-parser-for-json/





              • Tatu Saloranta
                Message 7 of 12, Feb 7, 2009
                  On Sat, Feb 7, 2009 at 8:35 PM, Mark Joseph <mark@...> wrote:
                  > Ah I am sorry I was not clear we provide the JSON and XML
                  > tools to C++ users not Java.
                  > http://www.p6r.com/articles/2008/05/22/a-sax-like-parser-for-json/

                  Ok, that explains it. I shouldn't have assumed it was for Java.
                  And it is true that for products that cover both xml and json, it is
                  advantageous to use the same or similar interfaces too. There are some
                  Java libraries that do something similar, such as Jettison, which
                  exposes json through Java xml interfaces (Stax in this case).

                  -+ Tatu +-