6365 Literal testing in ANTLR

  • Jamieson M. Cobleigh
    Oct 2, 2002
      I was working on an ANTLR lexer with the following rules:

      options {
      testLiterals = false;
      charVocabulary = '\3'..'\377';
      }

      tokens {
      DIGRAPH = "digraph";
      }

      STRING
      : QUOTED_STRING | UNQUOTED_STRING;

      protected
      QUOTED_STRING
      : '"'! (ESC | ~('"' | '\\'))* '"'!;

      protected
      UNQUOTED_STRING options { testLiterals=true; }
      : LETTER ( NUMBER | LETTER | '_')*;

      The ESC rule matches the usual escape sequences (\n, \r, etc.).


      When 'digraph' was encountered during lexing, it was getting lexed by the
      UNQUOTED_STRING rule and then tested against the literal table. However,
      the STRING rule was setting the token type to be STRING, overwriting the
      result of the literal test.

      I moved the testLiterals=true option to the STRING rule, but then both
      'digraph' and '"digraph"' were getting matched as the literal DIGRAPH
      because the QUOTED_STRING rule removed the double quotes.

      My solution was to remove the STRING rule and make a rule in my parser:
      aString : QUOTED_STRING | UNQUOTED_STRING;
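
      For reference, the workaround looks roughly like the sketch below.
      It assumes the "protected" modifiers are dropped from the two string
      rules (a protected rule is never returned to the parser as a token)
      and that ESC, LETTER, and NUMBER are protected helper rules defined
      as before:

      // Lexer: no STRING rule; both rules are now ordinary token rules.
      // Quoted text is never tested against the literal table.
      QUOTED_STRING
      : '"'! (ESC | ~('"' | '\\'))* '"'!;

      // Only unquoted text is tested, so digraph becomes DIGRAPH
      // while "digraph" stays a QUOTED_STRING.
      UNQUOTED_STRING options { testLiterals=true; }
      : LETTER ( NUMBER | LETTER | '_')*;

      // Parser: accept either kind wherever a string is expected.
      aString : QUOTED_STRING | UNQUOTED_STRING;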

      Is this the best way to do this or is there a better solution that I'm not
      seeing?

      Jamie