
C# and Unicode problems

  • tdjastrzebski
    Message 1 of 1 , Jul 31, 2003
      Hi everybody,
      Even with the charVocabulary option set to '\u0000'..'\uFFFE',
      non-ASCII characters either disappear from the token text or are not
      recognized when parsing strings like 'pożółćgęśląjaźń' (beginning
      and ending with single quotes). Am I missing something? Do I have to
      create the antlr.Lexer in a particular way, or pass it an input stream?

      Regards,
      Tom Jastrzebski

      sample grammar:

      options {
          language = "CSharp";
      }

      class TestParser extends Parser;

      options {
          k = 2;
      }

      statement
          : StringLiteral EOF
          ;

      class TestLexer extends Lexer;

      options {
          k = 2;
          charVocabulary = '\u0000'..'\uFFFE';
      }

      StringLiteral
          : '\'' (~'\'')* '\''
          ;