Tool for testing lexers/parsers/tree parsers

  • Bogdan Mitu
    Message 1 of 6 , Jan 7, 2002
      Hi all,

      I worked on a project using ANTLR default ASTs (speed was not a concern),
      and I wrote a small utility to help me test the tree parsers. Later I
      extended it to allow testing of parsers and lexers as well. I would like to
      contribute this utility to ANTLR; I put the files in the ANTLR group
      http://groups.yahoo.com/group/antlr-interest/files/tester.zip

      A test file consists of a header and a series of tests. The header contains
      at least the class name, the type (lexer, parser or tree parser) and the
      rule to be tested (for parsers/tree parsers). Each test is a pair of input
      and expected output. The tester feeds the input to the tool under test and
      compares the output with the expected one. Tests that fail are displayed in
      pop-up frames.
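
The input/expected-output comparison described above can be sketched roughly as follows. This is only an illustration of the idea, not the tester's actual code or file format: the class `PairTestSketch`, the method `run`, and the stand-in "lexer" are all hypothetical names, and failures are collected in a list instead of being shown in pop-up frames.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: not the actual tester's API or test-file format.
class PairTestSketch {

    /** Runs each input through the tool under test and collects the mismatches. */
    static List<String> run(Function<String, String> toolUnderTest,
                            Map<String, String> inputToExpected) {
        List<String> failures = new ArrayList<>();
        for (Map.Entry<String, String> test : inputToExpected.entrySet()) {
            String actual = toolUnderTest.apply(test.getKey());
            if (!actual.equals(test.getValue())) {
                failures.add("input=" + test.getKey()
                        + " expected=" + test.getValue()
                        + " actual=" + actual);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Stand-in for a lexer: splits on whitespace and joins the tokens with commas.
        Function<String, String> fakeLexer =
                s -> String.join(",", s.trim().split("\\s+"));

        Map<String, String> tests = new LinkedHashMap<>();
        tests.put("a and b", "a,and,b");
        tests.put("not a", "not;a"); // deliberately wrong expectation

        run(fakeLexer, tests).forEach(System.out::println);
    }
}
```

Passing the tool under test as a function keeps the comparison loop the same for lexers, parsers and tree parsers; only the way the output is turned into a string would differ.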

      To run the tool you have to type java test.TestRunner <filename>. In the
      root of the distribution you can find some .bat files for invoking a lexer,
      a parser and a tree parser. In the "example" folder there is a sample
      grammar bool.g and some sample tests (lexer.test, parser.test and
      treeParser.test). Note that in each file the last test is deliberately
      wrong, just to force the frames to pop up.

      If you decide to give it a try, please let me know your comments.


      Regards,




    • mzukowski@bco.com
      Message 2 of 6 , Jan 10, 2002
        There is a potential problem in testing specific rules in a parser in that
        the EOF token may not be part of the token sets used for lookahead in
        decisions. This may also be a problem for "protected" lexer rules. If you
        find such a problem I would recommend simply writing a new rule with the EOF
        included. For example:

        exprTestRule: expr EOF;

        In fact I would recommend doing this for any rules you are going to test to
        make sure that all of your test input is used. Otherwise the rule might
        parse only as much as it wants to see, while you assume that your entire
        input has been parsed.

        I am not actively developing any big grammars right now, but I suspect that
        the addition of rules such as the one above may have an effect on the
        warning messages or may even bring in a new ambiguity.

        I would be very curious to hear how this works in practice for people. One
        approach to not cluttering up your grammar would be to "subclass" in order
        to add your own test rules. Another approach would be the one I took with
        the GCC C grammar, which is to have your tests be complete programs to
        parse.

        Thanks Bogdan, for a useful tool!

        Monty

      • bob mcwhirter
        Message 3 of 6 , Jan 10, 2002
          On Thu, 10 Jan 2002 mzukowski@... wrote:

          > There is a potential problem in testing specific rules in a parser in that
          > the EOF token may not be part of the token sets used for lookahead in
          > decisions. This may also be a problem for "protected" lexer rules. If you
          > find such a problem I would recommend simply writing a new rule with the EOF
          > included. For example:
          >
          > exprTestRule: expr EOF;
          >
          > In fact I would recommend doing this for any rules you are going to test to
          > make sure that all of your test input is used. Otherwise you might only
          > parse as much as the rule wants to see at the same time you would be
          > assuming that your entire input has been parsed.

          Would it be possible, though, to test normal rules, without the EOF, and
          then check that nextToken() produces EOF?

          i.e.:

          parser.exprRule();
          assertEquals( TokenTypes.EOF, lexer.nextToken() );

          This is at least what I've done to test hand-written recursive descent
          parsers and lexers (for SAXPath).
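
A minimal sketch of this "call the rule, then check for EOF" style of test, using a tiny hand-written lexer and recursive-descent parser. Everything here (`RuleThenEofSketch`, `Lexer`, `Parser`, `consumesAllInput`, the toy grammar) is a hypothetical stand-in and not ANTLR or SAXPath code. One assumption worth noting: this parser holds one token of lookahead, so the check inspects the parser's lookahead token rather than calling the lexer again.

```java
// Hypothetical hand-written lexer/parser; none of these names come from ANTLR or SAXPath.
class RuleThenEofSketch {
    static final String EOF = "<EOF>";

    /** Minimal lexer: whitespace-separated tokens, then EOF forever. */
    static class Lexer {
        private final String[] words;
        private int pos = 0;

        Lexer(String input) {
            String trimmed = input.trim();
            words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }

        String nextToken() {
            return pos < words.length ? words[pos++] : EOF;
        }
    }

    /** Minimal parser with one token of lookahead. Grammar: expr : ID ("and" ID)* */
    static class Parser {
        private final Lexer lexer;
        private String lookahead;

        Parser(Lexer lexer) {
            this.lexer = lexer;
            this.lookahead = lexer.nextToken();
        }

        void expr() {
            matchId();
            while (lookahead.equals("and")) {
                lookahead = lexer.nextToken(); // consume "and"
                matchId();
            }
        }

        private void matchId() {
            if (lookahead.equals(EOF) || lookahead.equals("and")) {
                throw new IllegalStateException("expected identifier, got " + lookahead);
            }
            lookahead = lexer.nextToken(); // consume the identifier
        }

        String peek() {
            return lookahead;
        }
    }

    /** Parses with the rule under test, then checks that only EOF is left. */
    static boolean consumesAllInput(String input) {
        Parser parser = new Parser(new Lexer(input));
        parser.expr();
        return parser.peek().equals(EOF);
    }

    public static void main(String[] args) {
        System.out.println(consumesAllInput("a and b"));   // whole input is one expr
        System.out.println(consumesAllInput("a and b c")); // "c" is left over
    }
}
```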

          -bob
        • mzukowski@bco.com
          Message 4 of 6 , Jan 10, 2002
            Jonathan Bachrach posted a question here on 8/31/01 entitled "single
            identifier as java expression". The problem was that calling the java
            grammar's expr rule would not work if the input was just one identifier. I
            believe that I remember him saying the solution was to make a separate rule
            to parse an expr followed by EOF. The Java grammar generates this code in
            unaryExpressionNotPlusMinus:
            ...
            if (synpred handled here)
            {...
            }
            else if ((_tokenSet_19.member(LA(1))) && (_tokenSet_20.member(LA(2)))) {
                postfixExpression();
                if (inputState.guessing==0) {
                    astFactory.addASTChild(currentAST, returnAST);
                }
            }
            else {
                throw new NoViableAltException(LT(1), getFilename());
            }

            _tokenSet_20 doesn't have EOF in it, so a NoViableAltException is thrown.
            I found this out by regenerating with options
            {codeGenBitsetTestThreshold = 999999;}, which makes ANTLR generate explicit
            token comparisons instead of token sets unless a set has more than 999999
            elements.

            So the answer is no, you may not always be able to test individual rules if
            they are not expecting an EOF to follow.
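
The effect can be illustrated with a small stand-alone sketch. The token types and set contents below are made up for the example and are not the real contents of _tokenSet_19 or _tokenSet_20; `LookaheadSetSketch`, `setOf` and `alternativeViable` are hypothetical names. The sketch only mirrors the shape of the two-token test in the generated code above.

```java
import java.util.BitSet;

// Illustrative token types and sets; they are not taken from the generated Java parser.
class LookaheadSetSketch {
    static final int EOF = 1, IDENT = 2, DOT = 3, LPAREN = 4;

    static BitSet setOf(int... types) {
        BitSet set = new BitSet();
        for (int t : types) {
            set.set(t);
        }
        return set;
    }

    /** Mirrors the shape of: setA.member(LA(1)) && setB.member(LA(2)) */
    static boolean alternativeViable(BitSet first, BitSet second, int la1, int la2) {
        return first.get(la1) && second.get(la2);
    }

    public static void main(String[] args) {
        BitSet first = setOf(IDENT);
        BitSet secondWithoutEof = setOf(DOT, LPAREN);
        BitSet secondWithEof = setOf(DOT, LPAREN, EOF);

        // Input is a single identifier followed by end of file: LA(1)=IDENT, LA(2)=EOF.
        System.out.println(alternativeViable(first, secondWithoutEof, IDENT, EOF));
        System.out.println(alternativeViable(first, secondWithEof, IDENT, EOF));
    }
}
```

With the set that lacks EOF the alternative is rejected even though the first token matches, which corresponds to the NoViableAltException case; once EOF can follow the rule, as with a rule like exprTestRule, the same input is accepted.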

            Monty

          • Bogdan Mitu
            Message 5 of 6 , Jan 16, 2002
              Hi Monty,

              Thanks for your feedback. See my comments in text.

              --- mzukowski@... wrote:
              > There is a potential problem in testing specific rules in a parser in that
              > the EOF token may not be part of the token sets used for lookahead in
              > decisions. This may also be a problem for "protected" lexer rules. If
              > you
              > find such a problem I would recommend simply writing a new rule with the
              > EOF
              > included. For example:
              >
              > exprTestRule: expr EOF;
              >
              > In fact I would recommend doing this for any rules you are going to test
              > to
              > make sure that all of your test input is used. Otherwise you might only
              > parse as much as the rule wants to see at the same time you would be
              > assuming that your entire input has been parsed.

              I will try to avoid rewriting tested grammars.
              To make sure that all input has been parsed, at the end of a test I should:
              1. Check that all tokens have been requested.
              2. Check that the token queue of the parser is empty (to be sure that
              tokens were not read only in guessing mode).

              There still remains a problem, of course, because some rules NEED trailing
              tokens for LA. I will add a keyword to distinguish tokens that need to be
              consumed from tokens only needed in LA.
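
A rough sketch of how the two checks and the lookahead-only distinction might fit together. This is an assumption about one possible design, not the tester's implementation: `ConsumptionCheckSketch`, `TrackingBuffer`, `allInputUsed` and the stand-in rule `idList` are hypothetical, and ANTLR's own token buffer classes are not used.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: ANTLR's own token buffer and parser classes are not used here.
class ConsumptionCheckSketch {

    /** Token buffer that records how many tokens were consumed and how far lookahead reached. */
    static class TrackingBuffer {
        private final List<String> tokens;
        private int consumed = 0;  // tokens actually matched by the rule
        private int requested = 0; // tokens the rule has looked at or consumed

        TrackingBuffer(List<String> tokens) {
            this.tokens = tokens;
        }

        /** Returns the k-th token ahead (1-based) without consuming it. */
        String lookAhead(int k) {
            int index = consumed + k - 1;
            if (index >= tokens.size()) {
                return "<EOF>";
            }
            requested = Math.max(requested, index + 1);
            return tokens.get(index);
        }

        void consume() {
            if (consumed < tokens.size()) {
                consumed++;
                requested = Math.max(requested, consumed);
            }
        }
    }

    /**
     * Check 1: every token was requested.
     * Check 2: nothing is left unconsumed except the declared lookahead-only tokens.
     */
    static boolean allInputUsed(TrackingBuffer buf, int lookaheadOnly) {
        int total = buf.tokens.size();
        boolean allRequested = buf.requested == total;
        boolean queueEmpty = buf.consumed == total - lookaheadOnly;
        return allRequested && queueEmpty;
    }

    /** Stand-in rule: consumes tokens until it sees ";" or end of input; ";" is only looked at. */
    static void idList(TrackingBuffer buf) {
        while (!buf.lookAhead(1).equals(";") && !buf.lookAhead(1).equals("<EOF>")) {
            buf.consume();
        }
    }

    public static void main(String[] args) {
        TrackingBuffer buf = new TrackingBuffer(Arrays.asList("a", "b", ";"));
        idList(buf);
        // ";" is declared as a token needed only for lookahead.
        System.out.println(allInputUsed(buf, 1));
    }
}
```

Tracking "requested" separately from "consumed" is what catches tokens that were only inspected, for example during guessing, without ever being matched.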

              Something similar needs to be done for lexers too.


              Regards,
              Bogdan


            • mzukowski@bco.com
              Message 6 of 6 , Jan 16, 2002
                > I will try to avoid rewriting tested grammars.
                > To make sure that all input has been parsed, at the end of a
                > test I should:
                > 1. Check that all tokens have been requested
                > 2. The token queue of the parser is empty (to be sure that
                > tokens were not
                > read only in guessing mode.
                >
                > There still remains a problem, of course, because some rules
                > NEED trailing
                > tokens for LA. I will add a keyword to distinguish tokens
                > that need to be
                > consumed from tokens only needed in LA.

                Very clever. That avoids the need to rewrite the grammar for testing.

                > Something similar needs to be done for lexers too.

                Luckily, only for protected rules. If you keep to just testing your
                non-protected rules then you are fine.

                Monty