
694 RE: The problem with literals.

  • mzukowski@bco.com
    Sep 1, 1999
      Where do you need the parser? You can't have the parser tell the lexer when
      to look for literals and when not to, because the parser can be looking ahead
      by any number of tokens, and those lookahead tokens have already been typed
      by the lexer. In the simple case you show, I would put your invite and
      response rules directly in the lexer:


      options {testLiterals = true;}

      protected STRING_NO_LITERALS:
      ('a'..'z' | 'A'..'Z' | '1'..'9')+
      ;
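
      A minimal sketch of what moving those rules into the lexer could look
      like, assuming ANTLR 2 syntax. The class name MessageLexer and the
      bodies of the INVITE and RESPONSE rules are illustrative assumptions,
      not the rules from the original grammar:

      ```antlr
      class MessageLexer extends Lexer;   // hypothetical class name
      options {
          // Assumption: no other rule in this lexer needs the literals table.
          testLiterals = false;
      }

      // Each message is matched as a single token, so the keyword is only
      // recognized at the start of a message, never in the content position.
      INVITE
          :   "INVITE" ':' STRING_NO_LITERALS
          ;

      RESPONSE
          :   "RESPONSE" ':' STRING_NO_LITERALS
          ;

      // Protected: only callable from other lexer rules, and not returned
      // to the parser as a token of its own.
      protected
      STRING_NO_LITERALS
          :   ('a'..'z' | 'A'..'Z' | '1'..'9')+
          ;
      ```

      With rules shaped like this, an input such as "INVITE:INVITE" should
      come back as one INVITE token, because the second word is consumed by
      STRING_NO_LITERALS before any token type is decided.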

      For more complex cases that you must have in the parser, you will have to
      make a rule which has all of your literals in it. The problem you will run
      into with that is lots of possible ambiguity warnings. I wrote a BASIC
      grammar where there were no reserved keywords, and that was the solution I
      had to use.
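
      A rough sketch of such a rule, assuming ANTLR 2 syntax and only the two
      literals from the example. The names MessageParser and anyString are
      hypothetical, and the lexer is assumed to produce STRING with literal
      testing enabled:

      ```antlr
      class MessageParser extends Parser;   // hypothetical class name

      invite
          :   "INVITE" COLON anyString
          ;

      response
          :   "RESPONSE" COLON anyString
          ;

      // Catch-all rule: a plain STRING, or any keyword appearing where
      // ordinary content is expected. A real grammar needs one alternative
      // per literal, which is where the ambiguity warnings tend to come from.
      anyString
          :   STRING
          |   "INVITE"
          |   "RESPONSE"
          ;
      ```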

      Simply put, your ambiguity is in the fact that the same token can have two
      types depending on where it occurs. If you can isolate those occurrences to
      simple rules such as the above, then those rules can easily live in the
      lexer, where you haven't finalized the type of a token yet.

      Another solution is to make everything a STRING and make the parser
      responsible for testing for the type using semantic predicates. This would
      be slower, and it would mean manually factoring the rules that share
      common prefixes of type STRING that you are testing. That sounds ugly;
      I would do it in the lexer.
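
      For comparison, a sketch of the predicate approach, assuming ANTLR 2
      with the default Java code generation and a lexer that returns every
      word as STRING (literal testing switched off). The rule name message
      is hypothetical:

      ```antlr
      class MessageParser extends Parser;   // hypothetical class name

      // Both alternatives start with STRING, so token lookahead alone cannot
      // choose between them. The predicates at the left edge inspect the
      // text of the next token to decide which kind of message this is.
      message
          :   { LT(1).getText().equals("INVITE") }?   STRING COLON STRING
          |   { LT(1).getText().equals("RESPONSE") }? STRING COLON STRING
          ;
      ```

      Every rule that starts with a keyword would need a predicate like
      this, and ANTLR may still warn about the shared STRING prefix, which
      is the factoring cost mentioned above.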

      Please post a more difficult example if you have one.


      > -----Original Message-----
      > From: Jayanthi Rao [mailto:jrao88@...]
      > Sent: Wednesday, September 01, 1999 10:42 AM
      > To: antlr-interest@onelist.com
      > Subject: [antlr-interest] The problem with literals.
      > Hi all,
      > Been having trouble with literals lately and I am sure that a tool as
      > cool as ANTLR should definitely have a way out. Here goes -
      > Say we have a grammar in which the contents and the identifiers are both
      > alphanumeric strings. Say we have two types of messages, "INVITE:"s and
      > "RESPONSE:"s. For simplicity let the message contents be just single
      > alphanumeric tokens. In this case, the grammar described above comes out as:
      > invite:
      > response:
      > where,
      > COLON:
      > ':'
      > ;
      > STRING:
      > ('a'..'z' | 'A'..'Z' | '1'..'9')+
      > ;
      > If the input is "INVITE:foo", the first token is recognized as a STRING,
      > then qualifies through the literals table as an INVITE, next COLON is
      > recognized and finally STRING.
      > However, if there is an input string of the form "INVITE:xyz" where xyz is
      > ANY of the literals defined in the grammar, then even if we are expecting a
      > STRING, we have an error, because the lexer sees this as a literal, which
      > in this context it is not; it is just another string.
      > The only way out that I could think of is something like
      > invite:
      > "INVITE" COLON (STRING | "INVITE" ...)
      > The only problem here is that in my actual grammar there are more complex
      > lexical tokens than just STRINGs and I have about 30 literals. Is there any
      > way to switch off the literals table testing in parts of the parser where
      > we know that literals have no role? Do notice that use of the testLiterals
      > option does not help in this case.
      > Regards,
      > Jagdeep Rao