Re: [antlr-interest] Why No Error?

  • Bogdan Mitu
    Message 1 of 10 , Aug 15, 2002
      Hi,

      If you want to be sure that all the input has been parsed, you should finish
      the main rule with EOF:

      file : (line)+ EOF ;

      As a side note, the way you defined the grammar, Comma between records is
      optional. If you want Comma to be mandatory between records, try:

      line : rec (COMMA rec)* NEWLINE ;
      rec : r:RECORD { action ... } ;
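A rough way to see the difference between the two comma rules is to check token sequences by hand. The Java sketch below is hypothetical, hand-written code, not what ANTLR generates; the class CommaRuleSketch, the Tok enum, and both method names are invented for illustration, and each method only checks a single line of tokens.

```java
import java.util.Arrays;
import java.util.List;

public class CommaRuleSketch {
    public enum Tok { RECORD, COMMA, NEWLINE }

    // Mirrors: line : (rec)+ NEWLINE ;   rec : RECORD (COMMA)? ;
    public static boolean optionalComma(List<Tok> t) {
        int i = 0;
        int recs = 0;
        while (i < t.size() && t.get(i) == Tok.RECORD) {
            i++;
            recs++;
            // The comma after a record may or may not be present.
            if (i < t.size() && t.get(i) == Tok.COMMA) {
                i++;
            }
        }
        return recs > 0 && i == t.size() - 1 && t.get(i) == Tok.NEWLINE;
    }

    // Mirrors: line : rec (COMMA rec)* NEWLINE ;   rec : RECORD ;
    public static boolean mandatoryComma(List<Tok> t) {
        if (t.isEmpty() || t.get(0) != Tok.RECORD) {
            return false;
        }
        int i = 1;
        // Every further record must be introduced by a comma.
        while (i + 1 < t.size() && t.get(i) == Tok.COMMA && t.get(i + 1) == Tok.RECORD) {
            i += 2;
        }
        return i == t.size() - 1 && t.get(i) == Tok.NEWLINE;
    }

    public static void main(String[] args) {
        List<Tok> noCommas = Arrays.asList(Tok.RECORD, Tok.RECORD, Tok.NEWLINE);
        List<Tok> withCommas = Arrays.asList(Tok.RECORD, Tok.COMMA, Tok.RECORD, Tok.NEWLINE);
        System.out.println(optionalComma(noCommas) + " " + mandatoryComma(noCommas));
        System.out.println(optionalComma(withCommas) + " " + mandatoryComma(withCommas));
    }
}
```

In this sketch, two records with no comma between them are accepted by the optional-comma rule and rejected by the mandatory-comma rule, while a properly comma-separated line is accepted by both.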

      Cheers,
      Bogdan

      --- genericised <trigonometric@...> wrote:
      > I created the following parser as an example of how to
      > parse comma-separated values (CSV) files:
      >
      > class CSVParser extends Parser;
      > file : (line)+ ;
      > line : (rec)+ NEWLINE ;
      > rec : (r:RECORD) (COMMA)?
      > {System.out.println(r.getText());}
      > ;
      >
      > The corresponding Lexer is:
      >
      > class CSVLexer extends Lexer;
      > options { charVocabulary='\3'..'\377'; }
      > RECORD : (~(','|'\r'|'\n'|' '|'\t'))+ ;
      > COMMA : ',' ;
      > NEWLINE : ( ('\r' '\n')=> '\r' '\n' //DOS
      > | '\r' //MAC
      > | '\n' //UNIX
      > )
      > { newline(); } // the parentheses make the action run for every alternative
      > ;
      > WS : (' '|'\t') { $setType(Token.SKIP); } ;
      >
      > Pretty straightforward, but when I run this on a
      > CSV file it produces no error.
      >
      > The last line of the CSV file is:
      >
      > blah, blah, blah
      >
      > so the line does not consist of
      >
      > rec+ NEWLINE
      >
      > but
      >
      > rec+
      >
      > When
      >
      > match(NEWLINE)
      >
      > is called from the parser, why does it not throw
      > a MismatchedTokenException?
      >
      > Or does it throw some kind of exception that is
      > caught and causes the parsing of the input stream
      > to terminate gracefully?
      >
      > The parser is invoked from some main file like this:
      >
      > csvParser.file();
      >
      > I have spent a couple of hours investigating this,
      > looking through the ANTLR source, but I have not yet
      > found where this is dealt with.
      >
      > I might investigate further over the weekend, because
      > I will learn something in the process. At the moment,
      > though, I am supposed to be writing this ANTLR tutorial,
      > and I got sidetracked trying to explain why it is OK
      > that the parser does not match the final NEWLINE.
      >
      > Well, actually, is it OK, or should the rule for file
      > be defined something like:
      >
      > file : (line)+ EOFCHAR;
      >
      > Regards
      >
      > A Person
      >
      >
      >
      >
      > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
      >
      >
      >


      __________________________________________________
      Do You Yahoo!?
      HotJobs - Search Thousands of New Jobs
      http://www.hotjobs.com
    • genericised
      Message 2 of 10 , Aug 15, 2002
        Oh, I didn't realise it was so easy. I did want the
        comma to be optional, though. Check out my latest post,
        however; it is a bit more tricky, hehe ;)

      • genericised
        Message 3 of 10 , Aug 15, 2002
          Actually, your solution is incorrect:

          file : (line)+ EOF ;

          would be wrong, because a line would still expect a
          NEWLINE token at the end. The correct solution is:

          file : (line)+ ;
          line : (record)+ (NEWLINE|EOF) ;
          record : (r:RECORD) (COMMA)? ;

          Well, at least I think this is the correct solution. It
          looks right, and it is hard to see how something
          so simple could be wrong. I am still interested
          in knowing why no error was generated in the original
          post, however.

        • Bogdan Mitu
          Message 4 of 10 , Aug 15, 2002
            >...I am still interested
            > in knowing why no error was generated in the original
            > post however.

            The parser does not necessarily consume all the input. You start a parser by
            calling one of its rules; in your case, file(). The parser calls nextToken()
            on the lexer until the rule is finished, then stops. In your case, file()
            matches as many lines as it can, then the parser stops, even though another
            (incorrect) line still follows. This is correct behavior.

            If you want to avoid this, put EOF at the end of the main rule. If you look at
            the examples in the ANTLR distribution (java.g, tinyC.g, etc.) you will see
            this.

            So try:

            file : line (NEWLINE line)* (NEWLINE)? EOF ;
            line : (record)+ ;
            record : (r:RECORD) (COMMA)? ;

            And note that, to cope with consecutive newlines, it should really be
            (NEWLINE)+ instead of NEWLINE, and (NEWLINE)* instead of (NEWLINE)? .
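To illustrate why a start rule without EOF can return quietly while input remains, here is a minimal hand-written Java sketch of a recursive-descent parser for the original grammar. It is an assumption-laden illustration, not ANTLR's generated code: the class TrailingInputSketch, the Tok enum, the requireEof flag, and the use of IllegalStateException are all invented for this example, and it ignores ANTLR's own error reporting and recovery.

```java
import java.util.Arrays;
import java.util.List;

public class TrailingInputSketch {
    public enum Tok { RECORD, COMMA, NEWLINE, EOF }

    private final List<Tok> tokens;
    private int pos = 0;

    TrailingInputSketch(List<Tok> tokens) {
        this.tokens = tokens;
    }

    // One token of lookahead; past the end of the list we keep reporting EOF.
    private Tok la() {
        return pos < tokens.size() ? tokens.get(pos) : Tok.EOF;
    }

    private void match(Tok expected) {
        if (la() != expected) {
            throw new IllegalStateException("expecting " + expected + ", found " + la());
        }
        pos++;
    }

    // file : (line)+ ;  or, with requireEof,  file : (line)+ EOF ;
    int file(boolean requireEof) {
        int lines = 0;
        do {
            line();
            lines++;
        } while (la() == Tok.RECORD); // leave the loop once the lookahead cannot start a line
        if (requireEof) {
            match(Tok.EOF); // only this check notices leftover tokens
        }
        return lines;
    }

    // line : (rec)+ NEWLINE ;
    private void line() {
        do {
            rec();
        } while (la() == Tok.RECORD);
        match(Tok.NEWLINE);
    }

    // rec : RECORD (COMMA)? ;
    private void rec() {
        match(Tok.RECORD);
        if (la() == Tok.COMMA) {
            pos++;
        }
    }

    public static int parse(List<Tok> tokens, boolean requireEof) {
        return new TrailingInputSketch(tokens).file(requireEof);
    }

    public static void main(String[] args) {
        // First line is fine; the second line starts with a comma.
        List<Tok> input = Arrays.asList(
            Tok.RECORD, Tok.COMMA, Tok.RECORD, Tok.NEWLINE,
            Tok.COMMA, Tok.RECORD, Tok.NEWLINE, Tok.EOF);
        System.out.println("lines parsed without EOF check: " + parse(input, false));
        try {
            parse(input, true);
        } catch (IllegalStateException e) {
            System.out.println("with EOF check: " + e.getMessage());
        }
    }
}
```

Without the EOF check, the loop in file() simply ends when the next token cannot begin a line, so the bad second line is never looked at. With the check, the leftover COMMA causes a mismatch.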

            --bogdan


          • genericised
            Message 5 of 10 , Aug 15, 2002
              > file : line (NEWLINE line)* (NEWLINE)? EOF
              > line : (record)+ ;
              > record : (r:RECORD) (COMMA)? ;

              This is broken in some way. It generates the output:

              line 1: unexpected token: null

              It also has nondeterminisms, which I want to avoid.
              Is there anything actually WRONG with me using:

              class CSVParser extends Parser;
              file : (line)+ ;
              line : (record)+ (NEWLINE|EOF);
              record : (r:RECORD) (COMMA)? ;

              I would have thought that if EOF is actually matched,
              then this is a perfectly viable way of matching the
              whole file. In fact, IF the EOF IS matched, then I see
              no reason NOT to use this approach.
            • Bogdan Mitu
              Message 6 of 10 , Aug 15, 2002
                > ...
                > is there anything actually WRONG with me using:
                >
                > class CSVParser extends Parser;
                > file : (line)+ ;
                > line : (record)+ (NEWLINE|EOF);
                > record : (r:RECORD) (COMMA)? ;
                >
                > I would have thought that if EOF is actually matched
                > then this is a perfectly viable way of matching the
                > whole file. In fact, IF the EOF IS matched then I see
                > no reason NOT to use this way.

                It should parse correct input OK. But I'm afraid it will also parse incorrect
                input without producing any error.

                For instance, try an input like:

                a, b, c
                a, , ,

                which I think is incorrect. I didn't test it, but I expect that the parser
                will stop after the first line, without any warning or error.

                Let me know how it works.

                Cheers,
                Bogdan

              • genericised
                Message 7 of 10 , Aug 15, 2002
                  Well, the lexer is now defined like this:

                  class CSVLexer extends Lexer;
                  options { charVocabulary='\3'..'\377'; }
                  RECORD : '"'! (~(','|'\r'|'\n'))+ ;
                  COMMA : ',' ;
                  NEWLINE : ( ('\r' '\n')=> '\r' '\n' //DOS
                  | '\r' //MAC
                  | '\n' //UNIX
                  )
                  { newline(); } // the parentheses make the action run for every alternative
                  ;
                  WS : (' '|'\t') { $setType(Token.SKIP); } ;

                  So the data is expected to look like:

                  "a, "b, "blah
                  "hei, "fhei, "fhih,

                  so

                  a, b, c
                  a, , ,

                  would produce an error, because it is
                  not in the correct format anyway. If converted
                  to the correct format:

                  "a, "b, "c
                  "a, ", ",

                  this would also produce an error, because a
                  record must contain at least one character.

                  "a, "b, "c
                  "a, " , " ,

                  would produce no error. Note that this is
                  behaving exactly as it should.
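The three cases above can be checked against a small hand-written Java sketch of what the RECORD rule accepts. This is only an approximation made for illustration, not ANTLR's generated lexer: the class RecordRuleSketch and the method matchRecord are hypothetical names, and the sketch matches a single record at a given offset rather than tokenizing a whole file.

```java
public class RecordRuleSketch {

    // Hypothetical stand-in for: RECORD : '"'! (~(','|'\r'|'\n'))+ ;
    // Returns the record text without the leading quote (the '!' drops it).
    public static String matchRecord(String input, int start) {
        if (start >= input.length() || input.charAt(start) != '"') {
            throw new IllegalArgumentException("expecting '\"' at offset " + start);
        }
        int i = start + 1;
        while (i < input.length() && !isDelimiter(input.charAt(i))) {
            i++;
        }
        if (i == start + 1) {
            // The ( ... )+ loop needs at least one character after the quote.
            throw new IllegalArgumentException("record must contain at least one character");
        }
        return input.substring(start + 1, i);
    }

    private static boolean isDelimiter(char c) {
        return c == ',' || c == '\r' || c == '\n';
    }

    public static void main(String[] args) {
        System.out.println("[" + matchRecord("\"a, \"b, \"c", 0) + "]");
        System.out.println("[" + matchRecord("\" , \" ,", 0) + "]");
        try {
            matchRecord("\", \",", 0);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A space after the quote counts as record content here because the rule only excludes the comma and the two line-ending characters, which is why the last of the three inputs goes through.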

                  Davy Cricket




                • Bogdan Mitu
                  Message 8 of 10 , Aug 15, 2002
                    --- genericised <trigonometric@...> wrote:
                    > well the lexer is defined like this now: ...

                    Of course, we can continue like this forever, but there will always be some
                    incorrect input that is not caught. If you want to be sure that all input
                    has been parsed, you have to finish the main rule with EOF. If you don't
                    care, you can leave it like this.

                    Cheers,
                    Bogdan



                  • genericised
                    Message 9 of 10 , Aug 15, 2002
                      Oh, I see. Well, I'll try to get your method working, then. Thanks for
                      your help.
