3937 Re: RFC: defaulterrorhandler [WAS: Re: how do i skip unmatched characters?]
Jul 3, 2001
Tuesday, July 03, 2001, Ric Klaren hath spoken:
> Hi,
Unless it's a bug, we should discuss changes to behavior, I think.
> People, should the default errorhandler be on in lexers? (like it is with
> all the other parsers?)
> If I'm not receiving counterarguments I'm gonna change it. So it's
> consistent with the rest of the behaviour of the tool.
> (or am I jumping the gun =) )
Anyway... I can't remember my reasons and I'm foggy at the moment,
but lexers are different in the sense that you don't want the errors
to be trapped in the rules, I think: all output of the lexer goes through
the nextToken method. If an error is trapped in a rule, the rule will return
with bogus information and, most importantly, without knowledge that an
error occurred. nextToken will then return bogus tokens to the parser.
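A rough sketch of that point, written as a hand-rolled toy lexer in Python rather than ANTLR-generated code. The names ToyLexer, LexError, rule_int and next_token are made up for illustration. The token rule has no handler of its own, so a mismatch propagates up to next_token, which reports the bad text and retries instead of handing a bogus token to the parser.

```python
class LexError(Exception):
    """Raised by a token rule that cannot match the input."""


class ToyLexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.line = 1
        self.errors = []

    def rule_int(self):
        # Token rule with no local handler: on a mismatch it raises
        # instead of returning a made-up token.
        start = self.pos
        while self.pos < len(self.text) and self.text[self.pos].isdigit():
            self.pos += 1
        if self.pos == start:
            raise LexError(self.text[self.pos])
        return ("INT", self.text[start:self.pos])

    def next_token(self):
        # Single exit point: every token the parser sees comes from here,
        # so this is the one place that knows an error happened.
        while self.pos < len(self.text):
            ch = self.text[self.pos]
            if ch == "\n":
                self.line += 1
                self.pos += 1
            elif ch.isspace():
                self.pos += 1
            else:
                try:
                    return self.rule_int()
                except LexError as e:
                    self.errors.append(
                        f"this text '{e}' is bogus on line {self.line}"
                    )
                    self.pos += 1  # skip the offending character and retry
        return ("EOF", "")


def all_tokens(text):
    """Drain a ToyLexer and return (tokens, errors)."""
    lexer = ToyLexer(text)
    tokens = []
    while True:
        tok = lexer.next_token()
        tokens.append(tok)
        if tok[0] == "EOF":
            return tokens, lexer.errors
```

Had rule_int swallowed the error itself, next_token would have no way to tell a real INT from a placeholder, which is the problem described above.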
Unless the lexer is very complicated, it's usually ok to just say
"this text 'xxx' is bogus on line n."
Note that I specifically turn ON default handling often in protected
rules (note these are not invoked directly by the nextToken method,
hence, avoiding the abovementioned problem). In these rules, such as
the args for an HTML tag, I often want to say "bogus image tag
argument on line n" and keep going.
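The same idea as a small Python sketch, again a hand-written analogy and not ANTLR output. The names attr_rule, img_tag_rule and KNOWN_ATTRS are hypothetical. The helper rule plays the part of a protected rule: it is only ever called from inside a token rule, so it can trap its own error, report it, and let the enclosing rule still return a valid token.

```python
class LexError(Exception):
    """Raised when a rule cannot match its input."""


# Hypothetical set of attributes this toy lexer accepts on an image tag.
KNOWN_ATTRS = {"src", "alt"}


def attr_rule(word, line, errors):
    # Helper rule in the style of a protected rule: never called by the
    # token-fetching loop directly. It has a local handler, so a bad
    # argument is reported and skipped instead of aborting the token.
    try:
        name, sep, value = word.partition("=")
        if not sep or name not in KNOWN_ATTRS:
            raise LexError(word)
        return (name, value)
    except LexError:
        errors.append(f"bogus image tag argument on line {line}")
        return None


def img_tag_rule(text, line, errors):
    # Token rule: no local handler, so a malformed tag still propagates
    # to whoever called it.
    if not (text.startswith("<img") and text.endswith(">")):
        raise LexError(text)
    attrs = {}
    for word in text[4:-1].split():
        parsed = attr_rule(word, line, errors)
        if parsed is not None:
            attrs[parsed[0]] = parsed[1]
    return ("IMG", attrs)
```

For example, img_tag_rule("<img src=a.png ??? alt=logo>", 3, errors) records one complaint about the middle argument and still yields an IMG token carrying the two good attributes.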
So, when I want to detect errors WITHIN a token and keep going to
return some valid token to the parser (fault tolerance) I use the
default handlers or specify one for a protected rule. OK, I've
convinced myself that the current behavior is appropriate. Somebody
could convince me though that they should be on for protected rules by
default, but these rules are already confusing enough for people ;)
Chief Scientist & Co-founder, http://www.jguru.com
Co-founder, http://www.NoWebPatents.org -- Stop Patent Stupidity