Re: Tokens and context

  • lgcraymer
    Message 1 of 4 , Jul 15, 2002
      One minor addendum to Monty's comments: also look at the
      "qualifiedID" rule in antlr.g.

      The one problem with the "qualifiedID" type of solution is that it
      ignores whitespace, so it treats a.b.c.d and a . b.c.d alike even
      though they are not the same. For your
      problem, you may prefer to have a "STAR_IDENTIFIER" in addition to
      STAR and IDENTIFIER: "*" followed by whitespace is a "STAR", while
      "*"(<identifier>)? or <identifier>("*"(<identifier2>)?)+ is a
      STAR_IDENTIFIER. ANTLR lexers can make the distinction, and probably
      only two lexer rules are affected.
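
      A minimal sketch of that STAR / STAR_IDENTIFIER split, written as a
      plain Python regex tokenizer rather than as ANTLR lexer rules. The
      names TOKEN_SPEC, tokenize and PUNCT, and the exact character
      classes used for identifiers, are assumptions made for illustration:

```python
import re

ID = r"[A-Za-z_]\w*"  # assumed identifier shape

# Order matters: STAR is tried before STAR_IDENTIFIER, which is tried
# before plain IDENTIFIER.
TOKEN_SPEC = [
    ("WS", r"\s+"),
    # "*" followed by whitespace (or end of input) is a STAR.
    ("STAR", r"\*(?=\s|$)"),
    # "*"(<identifier>)?  or  <identifier>("*"(<identifier2>)?)+
    ("STAR_IDENTIFIER", rf"\*(?:{ID})?|{ID}(?:\*(?:{ID})?)+"),
    ("IDENTIFIER", ID),
    ("PUNCT", r"[.();]"),
]

MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(text):
    """Return (type, text) pairs, dropping whitespace tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for query in ("* void System.Foo();", "public void S*.*();"):
        print(query, "->", tokenize(query))
```

      With these rules the leading "*" in "* void System.Foo();" comes
      out as a STAR, while "S*" and the "*" before "(" in
      "public void S*.*();" both come out as STAR_IDENTIFIER.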

      --Loring
    • John Lam
      Message 2 of 4 , Jul 16, 2002
        Thanks so much for sending this link along. That chapter rocks (I loved the maze metaphor). For the time being, I'm going to stick with using "any" as a token (I'm quite flexible in the design of my language).

        If I have cycles later to revisit this problem I'll see if I can figure out how to deal with "*" in a cleaner fashion. Using a TokenStreamFilter just might be the ticket as well; I can replace earlier "*"s with the correct tokens.

        Cheers,
        -John
        http://www.iunknown.com


        -----Original Message-----
        From: mzukowski@... [mailto:mzukowski@...]
        Sent: Mon 7/15/2002 5:16 PM
        To: antlr-interest@yahoogroups.com
        Cc:
        Subject: RE: [antlr-interest] Tokens and context



        As with everything else in antlr, think about how you would do this by hand,
        especially after reading Ter's chapter on "Building Translators by Hand"
        http://www.antlr.org/book/index.html.

        Antlr doesn't have a good way of wildcarding things like that. It can deal
        with ambiguous keywords but it takes some work, mostly with syntactic
        predicates. See http://www.jguru.com/faq/view.jsp?EID=140.

        It will mostly come down to dealing with the ambiguities. '*' can't morph
        into the appropriate token by context; antlr just can't deal with it like
        that. But you can put in syntactic predicates to eliminate the ambiguities.
        You'll have to hoist them by hand though.

        If you have enough context you could write a filter that changes '*' into
        the right thing via a TokenStreamFilter. See
        www.codetransform.com/filterexample.html.
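
        The filtering idea can be sketched roughly as follows. This is plain
        Python and not ANTLR's own filter API; the Token tuple, the
        raw_tokens and star_filter helpers, and the ANY and WILDCARD_ID
        token types are all hypothetical names. The context rule is also an
        assumption: a lone "*" followed by a gap becomes ANY, and a "*"
        touching an identifier or the next token becomes part of a
        WILDCARD_ID.

```python
import re
from collections import namedtuple

Token = namedtuple("Token", "type text start end")

# Naive first pass: every "*" is a STAR. Whitespace (and anything else
# that matches no alternative) is silently skipped by finditer.
RAW = re.compile(r"(?P<IDENTIFIER>[A-Za-z_]\w*)|(?P<STAR>\*)|(?P<PUNCT>[.();])")


def raw_tokens(text):
    return [Token(m.lastgroup, m.group(), m.start(), m.end())
            for m in RAW.finditer(text)]


def star_filter(tokens):
    """Retype STAR tokens from context, merging touching STAR/IDENTIFIER runs."""
    tokens = list(tokens)
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok.type not in ("STAR", "IDENTIFIER"):
            out.append(tok)
            i += 1
            continue
        # Collect the run of STAR/IDENTIFIER tokens with no gap between them.
        j = i + 1
        while (j < len(tokens)
               and tokens[j].type in ("STAR", "IDENTIFIER")
               and tokens[j].start == tokens[j - 1].end):
            j += 1
        run = tokens[i:j]
        text = "".join(t.text for t in run)
        followed_by_gap = j == len(tokens) or tokens[j].start > run[-1].end
        if all(t.type == "IDENTIFIER" for t in run):
            new_type = "IDENTIFIER"
        elif text == "*" and followed_by_gap:
            new_type = "ANY"          # e.g. the "*" in "* void ..."
        else:
            new_type = "WILDCARD_ID"  # e.g. "S*" or the "*" in "*()"
        out.append(Token(new_type, text, run[0].start, run[-1].end))
        i = j
    return out


if __name__ == "__main__":
    for query in ("* void System.Foo();", "public void S*.*();"):
        print([(t.type, t.text) for t in star_filter(raw_tokens(query))])
```

        The parser behind such a filter never sees a bare STAR, so the
        grammar itself needs no predicates for this case.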

        How complete is your language and what kind of performance do you need? For
        small examples that will be typed into a search panel as given, you could
        simply parse repeatedly, treating '*' in turn as an identifier, a modifier,
        an expression, etc.
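
        A rough sketch of that brute-force approach, trying every reading of
        each standalone "*" and keeping the ones that parse. The tiny
        grammar (optional modifier, type, dotted name, "();"), the
        MODIFIERS set, the role names, and all function names are
        assumptions made up for the example:

```python
import itertools
import re

MODIFIERS = {"public", "private", "protected", "internal"}  # assumed set
ROLES = ("modifier", "type", "name")  # readings to try for a lone "*"
PUNCT = {".", "(", ")", ";"}


def lex(text):
    # Words may contain "*" (e.g. "S*"); punctuation is one token each.
    return re.findall(r"[A-Za-z_*][\w*]*|[.();]", text)


def classify(word, role_for_star):
    if word == "*":
        return role_for_star
    if word in MODIFIERS:
        return "modifier"
    if word in PUNCT:
        return word
    return "word"  # an ordinary word can act as a type or a name part


def parses(kinds):
    """Accept: modifier? type name ("." name)* "(" ")" ";"."""
    i = 0
    if i < len(kinds) and kinds[i] == "modifier":
        i += 1
    if i >= len(kinds) or kinds[i] not in ("type", "word"):
        return False
    i += 1
    if i >= len(kinds) or kinds[i] not in ("name", "word"):
        return False
    i += 1
    while i + 1 < len(kinds) and kinds[i] == "." and kinds[i + 1] in ("name", "word"):
        i += 2
    return kinds[i:] == ["(", ")", ";"]


def interpretations(query):
    """Return every assignment of roles to lone "*" tokens that parses."""
    words = lex(query)
    stars = [i for i, w in enumerate(words) if w == "*"]
    found = []
    for roles in itertools.product(ROLES, repeat=len(stars)):
        role_at = dict(zip(stars, roles))
        kinds = [classify(w, role_at.get(i)) for i, w in enumerate(words)]
        if parses(kinds):
            found.append(roles)
    return found


if __name__ == "__main__":
    for query in ("* void System.Foo();", "public void *();"):
        print(query, "->", interpretations(query))
```

        The number of combinations grows exponentially with the number of
        lone "*" tokens, so this only makes sense for short queries.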

        Monty
        www.codetransform.com

        > -----Original Message-----
        > From: John Lam [mailto:jlam@...]
        > Sent: Monday, July 15, 2002 12:16 PM
        > To: Antlr-interest@yahoogroups.com
        > Subject: [antlr-interest] Tokens and context
        >
        >
        > I've been scratching my head over this one. Here's the
        > general problem:
        > I want to be able to use "*" as a wildcarding character. The
        > problem is
        > that it can be treated as either a token or as part of an identifier.
        > Consider:
        >
        > public void System.Foo();
        >
        > I would also like to find this method using the following expressions:
        >
        > [1] * void System.Foo();
        > [2] public void *();
        > [3] public void System.*();
        > [4] public void S*.*();
        >
        > The problem is that * is treated as a token in [1] and as part of the
        > identifier in cases 2-4.
        >
        > Is there a general solution to this problem? Currently I'm using a
        > really crufty hack that involves creating a new token ("any") to
        > represent * in the accessibility modifiers.
        >
        > Thanks!
        >
        > -John
        > http://www.iunknown.com
        >
        >
        >
        > Your use of Yahoo! Groups is subject to
        > http://docs.yahoo.com/info/terms/
        >
        >


