bug in Lexer 2.7.1

  • Stan Pinte
    Message 1 of 5 , Apr 2 3:10 AM
      hello,

      I have come across what I would qualify as a bug:

      I have an ANTLR grammar containing the well-known IDENT rule, which
      looks like this:

      // an identifier. Note that testLiterals is set to true! This means
      // that after we match the rule, we look in the literals table to see
      // if it's a literal or really an identifier
      IDENT
      options {testLiterals=true;}
      : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9')*
      ;


      In one of the text files to be parsed, I have put an identifier containing
      a '_' character.

      Instead of giving me a nice

      "Error: line(x), expecting ID, found '_'"

      ANTLR crashes...

      has anyone seen that behaviour? Is there a patch available?

      thanks a lot,

      PS: I can reproduce the behaviour very consistently.

      Stan.
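
To make the reported input concrete, here is a minimal Python sketch of what the IDENT rule quoted above accepts. It is a hand-written illustration, not ANTLR-generated code, and `match_ident` is a hypothetical helper name. It only shows that the rule stops at '_' and that no IDENT can start there; it does not reproduce the crash itself.

```python
import string

# Character sets mirroring the IDENT rule:
# ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9')*
IDENT_START = set(string.ascii_letters)
IDENT_PART = IDENT_START | set(string.digits)


def match_ident(text, pos=0):
    """Return the longest IDENT starting at pos, or None if none starts there."""
    if pos >= len(text) or text[pos] not in IDENT_START:
        return None
    end = pos + 1
    while end < len(text) and text[end] in IDENT_PART:
        end += 1
    return text[pos:end]
```

With an input such as `foo_bar`, the sketch matches `foo`, and a second call at the position of '_' finds nothing to match. That is the point at which a lexer has to report an error.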
    • Sinan
      Message 2 of 5 , Apr 4 3:28 PM
        Stan Pinte wrote:
        > Instead of giving me a nice
        >
        > "Error: line(x), expecting ID, found '_'"
        >
        > ANTLR crashes...
        >


        First, it definitely sounds like a bug. However, is there a rule
        anywhere in your lexer that can match zero characters?
        (Just a wild guess...)

        Something like

        DIGIT: ('0'..'9')* ;

        or more insidiously:

        NUMBER: ( '+' | '-' )? ('0'..'9')* ;

        Sometimes you can work around a bug like the one you found by not
        triggering it until someone fixes it...

        Sinan
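
Why a rule that can match zero characters is suspicious can be sketched in a few lines of Python. This is a toy loop under stated assumptions, not a description of how ANTLR 2.7.1 behaves internally: `match_digit_star` and `tokenize` are hypothetical names, and the guard is there only to make the problem visible.

```python
DIGITS = "0123456789"


def match_digit_star(text, pos):
    """Toy version of DIGIT: ('0'..'9')* ; returns the end index of the match."""
    end = pos
    while end < len(text) and text[end] in DIGITS:
        end += 1
    return end  # may equal pos: an empty match still counts as a success


def tokenize(text):
    """Apply the rule repeatedly; refuse to continue on a zero-length match."""
    tokens, pos = [], 0
    while pos < len(text):
        end = match_digit_star(text, pos)
        if end == pos:
            # Without this guard the loop would never advance past text[pos].
            raise ValueError(f"empty match at index {pos}: {text[pos]!r}")
        tokens.append(text[pos:end])
        pos = end
    return tokens
```

On `12_3` the rule "succeeds" at the '_' without consuming anything, so a driver that trusts the rule makes no progress. Making every rule consume at least one character, for example `('0'..'9')+`, avoids that situation.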
      • Stan Pinte
        Message 3 of 5 , Apr 5 1:05 AM
          At 15:28 04/04/2001 -0700, you wrote:
          >First, it definitely sounds like a bug. However, is there a rule
          >anywhere in your lexer that can match zero characters?
          >(Just a wild guess...)


          OK, thanks. I will check my grammar; I am posting it enclosed.

          At first sight, I couldn't see anything which would result in zero
          characters...

          thanks a lot for helping.

          Stan.


        • Ric Klaren
          Message 4 of 5 , Apr 5 2:02 AM
            Hi,

            On Thu, Apr 05, 2001 at 10:05:31AM +0200, Stan Pinte wrote:
            > > > // an identifier. Note that testLiterals is set to true! This means
            > > > // that after we match the rule, we look in the literals table to see
            > > > // if it's a literal or really an identifier
            > > > IDENT
            > > > options {testLiterals=true;}
            > > > : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9')*
            > > > ;
            > > >
            > > > In one of my text files to be parsed, I have put an identifier with a '_'
            > > > symbol.
            > > >
            > > > Instead of giving me a nice
            > > >
            > > > "Error: line(x), expecting ID, found '_'"
            > > >
            > > > ANTLR crashes...

            I had a quick look at your grammar file. I had to make the following
            changes to get it to work:
            - Removed importvocab.
            - As mentioned, removed '_' from the IDENT rule.
            - Added a main to the lexer.

            The lexer itself seems to behave ok. It throws an exception
            (NoViableAltForCharException) on the offending '_'. Later in the afternoon
            I'll glue the parser to it and see if it goes wrong there.

            BTW What compiler are you using?

            Cheers,

            Ric
            --
            -----+++++*****************************************************+++++++++-------
            ---- Ric Klaren ----- klaren@... ----- +31 53 4893722 ----
            -----+++++*****************************************************+++++++++-------
            Why don't we just invite them to dinner and massacre them all when they're
            drunk? You heard the man. There's seven hundred thousand of them.
            Ah? ... So it'd have to be something simple with pasta, then.
            --- From: Interesting Times by Terry Pratchett
            -----+++++*****************************************************+++++++++-------
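
The behaviour Ric describes, a lexer that throws on the offending '_', can be sketched as follows. This is a Python illustration only: `NoViableAltForChar` is a stand-in class named after the exception mentioned above, not ANTLR's own class, and `next_token` and `lex_all` are hypothetical names. Whether the reported crash comes from such an exception going uncaught is an assumption that the thread does not confirm.

```python
class NoViableAltForChar(Exception):
    """Stand-in for the lexer exception; not ANTLR's own class."""

    def __init__(self, char, line):
        super().__init__(f"line {line}: unexpected char {char!r}")
        self.char = char
        self.line = line


def next_token(text, pos, line=1):
    """Match one IDENT-like token at pos, or raise if no rule can start there."""
    if not (text[pos].isascii() and text[pos].isalpha()):
        raise NoViableAltForChar(text[pos], line)
    end = pos + 1
    while end < len(text) and text[end].isascii() and text[end].isalnum():
        end += 1
    return text[pos:end], end


def lex_all(text):
    """Driver that reports a lexing failure instead of letting it propagate."""
    tokens, pos = [], 0
    try:
        while pos < len(text):
            token, pos = next_token(text, pos)
            tokens.append(token)
    except NoViableAltForChar as err:
        return tokens, str(err)
    return tokens, None
```

The design point is that the exception is caught at the outermost driver, so a bad character produces a readable message and the tokens matched so far, rather than an abnormal termination.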
          • Stan Pinte
            Message 5 of 5 , Apr 5 2:14 AM
              At 11:02 05/04/2001 +0200, you wrote:
              >BTW What compiler are you using?


              MVS 6 (Microsoft Visual Studio 6) with SP5, static version of ANTLR.

