
3951 Re: RFC: defaulterrorhandler [WAS: Re: how do i skip unmatched characters?]

  • Ric Klaren
    Jul 6 3:42 AM
      Hi,

      Finally found the time to think this one through and answer ...

      On Tue, Jul 03, 2001 at 11:08:50AM -0700, Terence Parr wrote:
      > > (or am I jumping the gun =) )

      Yup!

      > Unless it's a bug, we should discuss changes to behavior I think.

      Ack, that's why I did this RFC thing =)

      > Anyway...I can't remember my reasons and I'm foggy at the moment, but
      > lexers are different in the sense that you don't want the errors to be
      > trapped in the rules I think--all output of the lexer goes thru the
      > nextToken method.

      Yup.

      > If an error is trapped in a rule, it will return with bogus information and
      > most importantly w/o knowledge that an error occurred. nextToken will
      > return bogus tokens to the parser. Unless the lexer is very complicated,
      > it's usually ok to just say "this text 'xxx' is bogus on line n."

      Aha.

      > So, when I want to detect errors WITHIN a token and keep going to return
      > some valid token to the parser (fault tolerance) I use the default handlers
      > or specify one for a protected rule.

      The only problem is that you can't specify an error handler for the nextToken
      rule. So if you want unexpected chars reported inside your lexer without
      going back to the parser (which is not practical in some cases), you either
      a) have to specify defaultErrorHandler = true; and maybe in lots of other
      places defaultErrorHandler = false; (AFAIK the only way to get the
      default error handler in just the nextToken rule), or
      b) use the filter rule 'hack', which is IMHO not the most intuitive way to deal
      with these things (the FAQs on this topic are shortish).
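
To make the goal concrete, here is a minimal sketch in Java of the behaviour being asked for: unexpected chars are reported and skipped inside nextToken itself, so the caller only ever sees valid tokens. This is a hand-written toy, not code generated by ANTLR. The class SkippingLexer, its tokenize helper, and the "ID:"/"INT:" token strings are all hypothetical names invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written lexer; it is NOT what ANTLR generates.
class SkippingLexer {
    private final String input;
    private int pos = 0;
    private int line = 1;
    final List<String> errors = new ArrayList<>();

    SkippingLexer(String input) {
        this.input = input;
    }

    // Single exit point for tokens, mirroring the role of nextToken.
    // Returns null at end of input.
    String nextToken() {
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (c == '\n') {
                line++;
                pos++;
            } else if (c == ' ' || c == '\t' || c == '\r') {
                pos++;
            } else if (Character.isLetter(c)) {
                return "ID:" + scanWhile(true);
            } else if (Character.isDigit(c)) {
                return "INT:" + scanWhile(false);
            } else {
                // Report the unexpected char here and keep scanning,
                // instead of handing a bogus token to the caller.
                errors.add("line " + line + ": unexpected char '" + c + "'");
                pos++;
            }
        }
        return null;
    }

    // Consumes a run of letters (letters == true) or digits (letters == false).
    private String scanWhile(boolean letters) {
        int start = pos;
        while (pos < input.length()
                && (letters ? Character.isLetter(input.charAt(pos))
                            : Character.isDigit(input.charAt(pos)))) {
            pos++;
        }
        return input.substring(start, pos);
    }

    // Convenience helper: lexes all of text, appending any reports to errorsOut.
    static List<String> tokenize(String text, List<String> errorsOut) {
        SkippingLexer lexer = new SkippingLexer(text);
        List<String> tokens = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) {
            tokens.add(t);
        }
        errorsOut.addAll(lexer.errors);
        return tokens;
    }
}
```

The point of the sketch is only that the error is handled at the one place all tokens pass through, which is the spot where a generated lexer currently offers no way to attach a handler.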

      > Ok, i've convinced myself that the current behavior is appropriate.

      Me as well =) but with the above notes.

      I guess we should do a few documentation fixes with respect to this. Maybe
      add a section on skipping/reporting on unrecognized chars in the lexer.

      I've been thinking of extending the grammar to allow a:

      class MyParser extends Parser;
      options {
      ...
      }
      exception catch [ ... ] { .. }

      Syntax for at least (tree)parsers so you can specify a different
      defaultErrorhandler for all rules (this should work nicely together with
      Ernest's $lookaheadSet patch).

      For a lexer we could then modify the behaviour to change the errorhandler
      for nextToken?

      Any thoughts?

      Ric
      --
      -----+++++*****************************************************+++++++++-------
      ---- Ric Klaren ----- klaren@... ----- +31 53 4893722 ----
      -----+++++*****************************************************+++++++++-------
      Why don't we just invite them to dinner and massacre them all when they're
      drunk? You heard the man. There's seven hundred thousand of them.
      Ah? ... So it'd have to be something simple with pasta, then.
      --- From: Interesting Times by Terry Pratchett
      -----+++++*****************************************************+++++++++-------