Re: [antlr-interest] Digest Number 314

  • Per Einarsson
    Message 1 of 3, Mar 13, 2000
      On Mon, 13 Mar 2000, Michael Schmitt wrote:
      > Hmm. As far as I understand ANTLR, there is no elegant way to describe
      > this.

      Ok. Thanks. Too bad, since I could really use this functionality.
      Limiting the outcome of a lexer lookahead to one token seems a bit odd,
      but I'm no expert on this.

      BTW, according to the FAQ you can build a list of nodes with no root
      when building antlr ASTs. I just looked at it very briefly, and I
      think it's reminiscent of what I want to do, but I'm not using ASTs at the
      moment.

      > However, I think that it is not a good idea to distinguish between
      > IDs and assignments in a lexer. Normally, this is a task for a parser.
      > If you move the decision whether an expression is an identifier or an
      > assignment to the parser, your problems should not occur.

      Yes, I've been thinking of that as well. The problem with this is that I
      need to check the syntax of the different assignments. With the parser
      approach I can't have a nice lexer rule for each possible assignment and
      use this to check if the input is valid. Instead I would have to check
      each ID delivered by the lexer "by hand" in a semantic action or something
      in the parser. I would have to write my own methods like
      isValidFilename(), isValidDirectory() etc, and this seems silly since this
      is the type of work that should be done by the lexer.

      Unfortunately, the lexer can't be told by the parser what to expect, so if
      I want the lexer to recognise invalid values in an assignment, I have to
      use lookahead to let it know what type of value to check for (e.g.
      check for FILENAME after recognising "File =").
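
The lexer-side idea described above can be sketched roughly as follows. This is not ANTLR code and not the poster's actual grammar: it is a minimal Python illustration in which each assignment kind gets its own pattern, so the prefix ("File =") decides which value syntax is checked. The names `ASSIGNMENT_RULES` and `lex_assignment`, the `Dir` key, and the filename and directory patterns are all assumptions made for the example.

```python
import re

# Hypothetical assignment kinds and value patterns (assumptions for this sketch).
# Each rule covers the key, the "=", and the value syntax expected for that key.
ASSIGNMENT_RULES = [
    ("FILE_ASSIGN",
     re.compile(r"File\s*=\s*(?P<value>[A-Za-z0-9_.-]+\.[A-Za-z0-9]+)\s*$")),
    ("DIR_ASSIGN",
     re.compile(r"Dir\s*=\s*(?P<value>(?:/[A-Za-z0-9_.-]+)+/?)\s*$")),
]


def lex_assignment(line):
    """Return (token_type, value) for one assignment line.

    Raises ValueError when no rule matches, i.e. when the key is unknown
    or the value does not have the syntax required for that key.
    """
    stripped = line.strip()
    for token_type, pattern in ASSIGNMENT_RULES:
        match = pattern.match(stripped)
        if match:
            return token_type, match.group("value")
    raise ValueError(f"invalid assignment: {line!r}")
```

Note that the token type here depends on the text in front of the value, which is exactly the kind of context dependence that makes this hard to express as ordinary lexer rules.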

      Moreover, getting the text parsed by a parser rule is very awkward
      compared to getting it from a token, and the handling of whitespace and
      literals and keywords is much easier when using my current solution. But I
      am considering going for your suggestion anyway, since the current
      approach is giving me trouble.

      Thank you for your help.

      // Pelle Einarsson
    • Michael Schmitt
      Message 2 of 3, Mar 13, 2000
        Per Einarsson wrote:

        > On Mon, 13 Mar 2000, Michael Schmitt wrote:
        > > Hmm. As far as I understand ANTLR, there is no elegant way to describe
        > > this.
        >
        > Ok. Thanks. Too bad, since I could really use this functionality.
        > Limiting the outcome of a lexer lookahead to one token seems a bit odd,
        > but I'm no expert on this.

        Well, another tool (or even ANTLR) may be able to provide this
        functionality, but I have never heard of one. I think your approach
        is not in line with classic compiler construction methodology.

        > BTW, according to the FAQ you can build a list of nodes with no root
        > when building antlr ASTs. I just looked at it very briefly, and I
        > think it's reminiscent of what I want to do, but I'm not using ASTs at the
        > moment.

        At the moment, ASTs cannot be constructed for lexers.

        > Yes, I've been thinking of that as well. The problem with this is that I
        > need to check the syntax of the different assignments. With the parser
        > approach I can't have a nice lexer rule for each possible assignment and
        > use this to check if the input is valid. Instead I would have to check
        > each ID delivered by the lexer "by hand" in a semantic action or something
        > in the parser. I would have to write my own methods like
        > isValidFilename(), isValidDirectory() etc, and this seems silly since this
        > is the type of work that should be done by the lexer.

        Normally, it should be possible to check the token type in the lexer.
        But I suggest not doing this with context information (like: Is there
        a ":=" after that token?). Instead, you should refer to a symbol table
        to determine the type.
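
A minimal sketch of the symbol-table idea, written in Python rather than as ANTLR actions: the lexer scans an identifier without looking at what follows it, then looks the text up in a table to pick the token type. The table contents, the token type names, and the function `classify_identifier` are hypothetical and only serve the illustration.

```python
import re

# Hypothetical symbol table mapping known names to token types.
# In a real system this would be filled from declarations or configuration.
SYMBOL_TABLE = {"File": "FILE_KEY", "Dir": "DIR_KEY"}

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def classify_identifier(text, symbols=SYMBOL_TABLE):
    """Return the token type for an identifier, using only the table.

    The decision does not depend on neighbouring input: unknown names
    fall back to the generic "ID" type.
    """
    if not IDENT.fullmatch(text):
        raise ValueError(f"not an identifier: {text!r}")
    return symbols.get(text, "ID")
```

For example, `classify_identifier("File")` yields the type stored in the table, while a name that is not in the table is returned as a plain `ID`.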

        > Moreover, getting the text parsed by a parser rule is very awkward
        > compared to getting it from a token, and the handling of whitespace and
        > literals and keywords is much easier when using my current solution.

        There was a misunderstanding. Of course, the lexer should return tokens
        and the parser should parse the token stream. But the type of a token
        should not depend on its context (what's behind or in front).
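
As a rough illustration of this division of labour (a Python sketch under stated assumptions, not ANTLR code): the lexer below produces the same token types regardless of position, and the parser decides which value syntax applies once it has seen the key. The token names, the `VALIDATORS` table, and the filename and directory patterns are assumptions chosen for the example.

```python
import re

# Context-free lexer: a token is either "=" or a run of non-space, non-"=" text.
TOKEN_RE = re.compile(r"\s*(?:(?P<EQUALS>=)|(?P<WORD>[^\s=]+))")

# Hypothetical per-key value checks, applied by the parser, not the lexer.
VALIDATORS = {
    "File": lambda v: re.fullmatch(r"[A-Za-z0-9_.-]+\.[A-Za-z0-9]+", v) is not None,
    "Dir": lambda v: re.fullmatch(r"(?:/[A-Za-z0-9_.-]+)+/?", v) is not None,
}


def tokenize(line):
    """Split a line into (type, text) pairs without using any context."""
    line = line.rstrip()
    tokens, pos = [], 0
    while pos < len(line):
        match = TOKEN_RE.match(line, pos)
        if not match:
            raise ValueError(f"cannot tokenize at position {pos}: {line!r}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def parse_assignment(line):
    """Parse `key = value` and validate the value according to the key."""
    tokens = tokenize(line)
    if [kind for kind, _ in tokens] != ["WORD", "EQUALS", "WORD"]:
        raise ValueError(f"expected 'key = value': {line!r}")
    key, value = tokens[0][1], tokens[2][1]
    validator = VALIDATORS.get(key)
    if validator is None:
        raise ValueError(f"unknown key: {key!r}")
    if not validator(value):
        raise ValueError(f"invalid value for {key}: {value!r}")
    return key, value
```

The trade-off raised earlier in the thread still applies: the value checks now live in the parser as hand-written validators instead of lexer rules.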

        Hope that helps,

        Michael

        --
        ======================================================================
        Michael Schmitt                   phone:     +49 451 500 3725
        Institute for Telematics          secretary: +49 451 500 3721
        Medical University of Luebeck     fax:       +49 451 500 3722
        Ratzeburger Allee 160             eMail:     schmitt@...-luebeck.de
        D-23538 Luebeck, Germany          WWW:       http://www.itm.mu-luebeck.de
        ======================================================================