Literal testing in ANTLR

  • Jamieson M. Cobleigh
    Message 1 of 1 , Oct 2, 2002
      I was working on an ANTLR lexer with the following rules:

      options {
          testLiterals = false;
          charVocabulary = '\3'..'\377';
      }

      tokens {
          DIGRAPH = "digraph";
      }

      STRING
          : QUOTED_STRING | UNQUOTED_STRING;

      protected
      QUOTED_STRING
          : '"'! (ESC | ~('"' | '\\'))* '"'!;

      protected
      UNQUOTED_STRING options { testLiterals=true; }
          : LETTER ( NUMBER | LETTER | '_')*;

      ESC matches the usual collection of escape sequences (\n, \r, etc.).


      When 'digraph' was encountered during lexing, it was getting lexed by the
      UNQUOTED_STRING rule and then tested against the literal table. However,
      the STRING rule was setting the token type to be STRING, overwriting the
      result of the literal test.
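
The overwrite described above can be pictured with a small Python model. This is only a sketch of the token-type flow as the post describes it, not ANTLR's generated code; the names `LITERALS`, `unquoted_string`, and `string_rule` are hypothetical stand-ins for the literal table and the two lexer rules.

```python
# Toy model of the token-type flow described above; not ANTLR's real code.
LITERALS = {"digraph": "DIGRAPH"}  # stands in for the lexer's literal table


def unquoted_string(text):
    """Protected sub-rule with testLiterals=true: consults the literal table."""
    return LITERALS.get(text, "UNQUOTED_STRING")


def string_rule(text):
    """Outer STRING rule: invokes the sub-rule, then stamps its own type."""
    inner_type = unquoted_string(text)  # "DIGRAPH" for the keyword
    token_type = "STRING"               # the outer rule's type wins
    return inner_type, token_type


inner, final = string_rule("digraph")
print(inner, final)  # prints "DIGRAPH STRING"
```

The inner result is computed correctly but never reaches the emitted token, which matches the symptom reported here.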

      I moved the testLiterals=true option to the STRING rule, but then both
      'digraph' and '"digraph"' were getting matched as the literal DIGRAPH
      because the QUOTED_STRING rule removed the double quotes.
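
A second Python sketch, under the same caveats, shows why moving the option to STRING makes the quoted form collide with the keyword. It assumes the literal test runs on the token text after the `!` suffix has dropped the quotes; `string_rule_with_literal_test` is a hypothetical name, and escape handling is left out.

```python
# Toy model: literal test applied at the STRING level, after quotes are stripped.
LITERALS = {"digraph": "DIGRAPH"}  # stands in for the lexer's literal table


def string_rule_with_literal_test(source):
    """Return (token type, token text) for one already-delimited lexeme."""
    if len(source) >= 2 and source.startswith('"') and source.endswith('"'):
        text = source[1:-1]  # '"'! discards the surrounding quotes
    else:
        text = source
    # The test only sees the stripped text, so both spellings look alike.
    return LITERALS.get(text, "STRING"), text


for src in ('digraph', '"digraph"', '"other"'):
    print(src, string_rule_with_literal_test(src))
```

Both `digraph` and `"digraph"` come back as `DIGRAPH` in this model, while `"other"` stays a `STRING`.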

      My solution was to remove the STRING rule and make a rule in my parser:
      aString : QUOTED_STRING | UNQUOTED_STRING;

      Is this the best way to do this or is there a better solution that I'm not
      seeing?

      Jamie