3922 Re: [antlr-interest] how do i skip unmatched characters?

  • Stdiobe
    Jul 2, 2001
      Matthew,

      thanks for your response!

      If I understand your solution correctly, you catch the exception when
      the parser exits and restart the parser until you get correct input.

      That wouldn't work in my case, because my parser expects a "complete"
      program to parse, so the lexer should NOT return an unexpected character
      exception to the parser, but instead should capture it itself.
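
      To make clear what I mean by "capture it itself", here is a minimal
      sketch in plain Java. It is NOT ANTLR-generated code and uses no ANTLR
      API; the names SkippingLexer, scan and Result, and the tiny token set
      (identifiers plus ; ( ) ,), are made up for illustration. The point is
      only the last branch: an unmatched character is reported with its line
      and column, skipped, and scanning continues, so the parser never sees
      an exception.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written lexer, for illustration only (not ANTLR output).
final class SkippingLexer {

    /** Tokens and error messages collected during one scan. */
    static final class Result {
        final List<String> tokens = new ArrayList<>();
        final List<String> errors = new ArrayList<>();
    }

    /** Scans the whole input; unmatched characters are reported and skipped. */
    static Result scan(String input) {
        Result result = new Result();
        int line = 1;
        int col = 1;
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '\n') {                          // track line numbers
                line++;
                col = 1;
                i++;
            } else if (Character.isWhitespace(c)) {   // ignore other whitespace
                col++;
                i++;
            } else if (Character.isLetter(c)) {       // identifier
                int start = i;
                while (i < input.length()
                        && Character.isLetterOrDigit(input.charAt(i))) {
                    i++;
                }
                result.tokens.add("ID:" + input.substring(start, i));
                col += i - start;
            } else if (c == ';' || c == '(' || c == ')' || c == ',') {
                result.tokens.add(String.valueOf(c)); // punctuation token
                i++;
                col++;
            } else {
                // Unmatched character: report it, drop it, keep scanning.
                result.errors.add("line " + line + ":" + col
                        + ": unexpected char '" + c + "'");
                i++;
                col++;
            }
        }
        return result;
    }
}
```

      With the input "foo $ bar;" the caller still gets the tokens for foo,
      bar and ; while the '$' only shows up in the error list.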

      ----- Original Message -----
      From: Matthew Ford <Matthew.Ford@...>
      To: <antlr-interest@yahoogroups.com>
      Sent: Saturday, June 30, 2001 1:05 AM
      Subject: Re: [antlr-interest] how do i skip unmatched characters?


      > I had this problem when handling commands coming in via telnet.
      > I think this should be a common requirement. There should be a FAQ about
      > it.
      > matthew
      >
      > This is what I did to skip to the next ; and then continue parsing
      > (ErrorLog and Debug are my own reporting logs)
      >
      >
      > In the Parser
      >
      > options {
      > k = 1; // one token lookahead
      > defaultErrorHandler = false; // Don't generate parser error handlers
      > buildAST = true;
      > importVocab = TIMEFRAMECOMMONLEXER;
      > exportVocab = TIMEFRAMESERVERPARSER;
      >
      > }
      >
      >
      > {
      >
      > ICommandGeneratorServer commandGenerator;
      > TimeFrameCommonLexer TFlexer;
      > static String nl = System.getProperty("line.separator","\n");
      >
      > final static int MAJOR = 401;
      > final static int TRANSLATION_ERROR = 1;
      > final static int FIELD_QUALIFIER_ERROR = 2;
      > final static int QUERY_NOT_DEFINED = 3;
      > final static int NO_AGE_INDEX = 4;
      >
      > public TimeFrameCommandsParser(TimeFrameCommonLexer lexer,
      > ICommandGeneratorServer commandGenerator, IScope iScope) {
      > this(lexer);
      > TFlexer = lexer;
      > this.commandGenerator = commandGenerator;
      > this.iScope = iScope;
      > }
      >
      > public void reportError(ANTLRException ex) {
      > Debug.out("in reportError", MAJOR, 0);
      >
      > commandGenerator.translationError(TFlexer.getLineBuffer(),
      > 0,0,null,null,TimeFrameException.makeTimeFrameException(MAJOR,TRANSLATION_ERROR,ex.getMessage()));
      > Debug.out("******* end reportError *******", MAJOR, 0);
      > }
      >
      >
      >
      > public void processError(ANTLRException ex) throws TokenStreamException,
      > CharStreamException {
      > // actually only throws TokenStreamIOException others caught here
      > int tokenType=0;
      > LexerSharedInputState inputState = TFlexer.getInputState();
      > inputState.guessing = 0; // clear guessing mode
      > Debug.out("in processError", MAJOR, 0);
      > if (!errorFlag) { // first error
      > reportError(ex);
      > errorFlag=true; // block new errors until after syncing.
      > }
      >
      > do {
      > try {
      > if (ex instanceof TokenStreamRecognitionException) {
      > TokenStreamRecognitionException rex =
      > (TokenStreamRecognitionException)ex;
      > // get underlying exception
      > ex = null; // have handled this one now
      > if ((rex.recog instanceof MismatchedCharException) ||
      > (rex.recog instanceof NoViableAltForCharException)) {
      > try {
      > TFlexer.consume(); // remove current error char;
      > } catch (CharStreamException cse) {
      > if ( cse instanceof CharStreamIOException ) {
      > throw new TokenStreamIOException(((CharStreamIOException)cse).io);
      > } else {
      > throw new TokenStreamIOException(new IOException(cse.getMessage()));
      > }
      > }
      > }
      > }
      >
      > tokenType = LA(1);
      > if ((tokenType != EOF) && (tokenType != SEMI)) {
      > consume(); // discard tokens until we reach ; or EOF
      > Debug.out("Input buffer:'"+TFlexer.getLineBuffer()+"'", MAJOR, 0);
      > }
      >
      > } catch (TokenStreamRecognitionException ex1) {
      > ex = ex1; // and loop
      > // TFlexer.consume(); // remove current error char;
      > Debug.out("** found :"+ ex1, MAJOR, 0);
      > } catch (TokenStreamRetryException ex1) {
      > Debug.out("** found :"+ ex1, MAJOR, 0);
      > throw new TokenStreamIOException(new IOException(ex1.getMessage()));
      > }
      > } while ( tokenType != SEMI && tokenType != EOF && !isEOF());
      > Debug.out("** end processError *******", MAJOR, 0);
      > // if telnet print prompt again (How??)
      >
      > }
      >
      > private boolean errorFlag = false;
      >
      > private boolean eofFlag = false;
      >
      > public boolean isEOF() {
      > return eofFlag;
      > }
      >
      > private void clearErrorFlag() {
      > errorFlag = false;
      > }
      >
      >
      >
      > After the SEMI is seen I expect to find a new statement.
      > If I do, then once a valid statement has been parsed I call clearErrorFlag()
      >
      > // SetAttribute("attribute1","attributeValue");
      > setattribute!
      > : SETATTRIBUTE
      > LPAREN attr:STRING_LITERAL COMMA value:STRING_LITERAL RPAREN SEMI
      > { clearErrorFlagAndScope();
      > commandGenerator.setAttribute(TFlexer.getLineBuffer(),attr.getText(),value.getText());
      > }
      > ;
      >
      >
      >
      > Finally to tie it all together
      >
      > In the main program I start the parser like this (in its own thread).
      > The commands the parser finds are put on a command stack to be handled by
      > another thread. This lets you issue cancel commands at any time.
      >
      > /**
      > * The method reads in commands one by one.
      > */
      > public void run() {
      > Debug.out(""+GlobalData.nl+" ---------- InputThread " + connectionNo + " starts.",MAJOR,0);
      >
      > try {
      > do {
      > try {
      > Debug.out("InputThread Call Parser",MAJOR,0);
      > parser.program();
      > } catch (RecognitionException ex) {
      > ErrorLog.log.println("RecognitionException: "+ ex.getMessage());
      > Debug.out("InputThread RecognitionException: "+
      > ex.getMessage(),MAJOR,0);
      > parser.processError(ex);
      > } catch (TokenStreamRecognitionException ex) {
      > ErrorLog.log.println("TokenStreamRecognitionException: " +
      > ex.getMessage());
      > Debug.out("InputThread TokenStreamRecognitionException: " +
      > ex.getMessage(),MAJOR,0);
      > parser.processError(ex);
      > } catch (TokenStreamRetryException ex) {
      > ErrorLog.log.println("TokenStreamRetryException: " +
      > ex.getMessage());
      > Debug.out("InputThread TokenStreamRetryException: " +
      > ex.getMessage(),MAJOR,0);
      > parser.processError(ex);
      > } catch(TokenStreamIOException ex) {
      > Debug.out("InputThread TokenStreamIOException: " +
      > ex.getMessage(),MAJOR,0);
      > if (getStopped()) {
      > break;
      > }
      > }
      > if (parser.isEOF()) {
      > Debug.out("parser found EOF *****************",MAJOR,0);
      > break; // do not call the program again
      > }
      > } while (!getStopped()); // was while true
      >
      > }
      > catch(Exception e) { // TokenStream IO exceptions or CharStreamExceptions
      > Debug.out(ExceptionBuffer.getStackTrace(e), Debug.STACKTRACE, 0);
      > // Close stream on IO errors
      > if (e instanceof antlr.TokenStreamIOException) {
      > Debug.out("TokenStreamIOException: one connection is lost",MAJOR,0);
      > listener.cancelConnection();
      > }
      > else {
      > Debug.out("other exception:",MAJOR,0);
      > TimeFrameException tfe =
      > TimeFrameException.makeTimeFrameException(MAJOR,EXCEPTION,e.toString() +
      > e.getMessage());
      > Errors error = new Errors(tfe);
      > beaconit.serverutils.CommandLog.log.printCommand(sessionNo,
      > connectionNo, error.toLogText());
      > ErrorLog.log.printError(sessionNo, connectionNo, error, "");
      > stack.add(error);
      > }
      > }
      >
      >
      > This is how I stop the input thread
      > /**
      > * This method is used to set the stop flag.
      > */
      > synchronized public void stopThread() {
      > Debug.out("stopThread is called in InputThread " + connectionNo,MAJOR,0);
      > stopped = true;
      > }
      >
      > /**
      > * Synchronized method returns the status of stop flag.
      > * @return the status of stop flag.
      > */
      > synchronized public boolean getStopped() {
      > return stopped;
      > }
      >
      > ----- Original Message -----
      > From: "Stdiobe" <stdiobe@...>
      > To: <antlr-interest@yahoogroups.com>
      > Sent: Saturday, June 30, 2001 4:47 AM
      > Subject: [antlr-interest] how do i skip unmatched characters?
      >
      >
      > >
      > > Hi,
      > >
      > > when the lexer generated by ANTLR encounters an unmatched character,
      > > it throws a TokenStreamRecognitionException which causes my lexer
      > > to exit (and also my parser).
      > >
      > > Does anyone know how I can skip unmatched characters in the lexer
      > > by reporting the error to the user (with linenumber, etc.), and have
      > > the lexer continue scanning for valid tokens.
      > >
      > > Stdiobe.
      > >
      > >
      > >
      > >
      > >
      > >
      > > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
      > >
      > >