Parsing files in different charsets

  • charon_hades
    Message 1 of 1, Sep 30, 2002
      Hi,

      How can I make ANTLR parse files whose encoding differs from the
      system codepage? To be specific, my system codepage is Cp1250, while
      the files contain strings in Cp852; the identifiers are plain ASCII.
      My problem is that the strings returned by the getText method
      contain unrecognized characters.

      Is it enough to give the ANTLR lexer a java.io.Reader that decodes
      Cp852? How will string tokens be encoded when I call getText on them?
      If I am correct, with this setup all characters and tokens
      listed in the grammar files would have to be written in Cp852?

      Or is it better to translate the whole input stream into UTF-8 and
      write the grammar file in that encoding as well?

      Thanks.
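
To make the java.io.Reader idea concrete, here is a minimal sketch that uses only the JDK. It wraps a byte stream in an InputStreamReader with an explicit charset name, so the bytes are decoded as Cp852 rather than with the platform default (Cp1250 in this case). The class name Cp852ReaderSketch, the helper readAll and the sample string are made up for illustration. Handing such a Reader to a lexer is only the approach the question proposes and is not shown here. Once decoded, the characters live in a Java String as Unicode, so the text is no longer "in Cp852" on the Java side.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Hypothetical helper class, not part of ANTLR.
public class Cp852ReaderSketch {

    // Decodes every byte of the stream with the named charset and returns the text.
    // The same Reader could instead be handed to any consumer that accepts a
    // java.io.Reader (the lexer in the question is assumed to be one).
    static String readAll(InputStream in, String charsetName) throws IOException {
        Reader reader = new BufferedReader(new InputStreamReader(in, charsetName));
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[1024];
        int count;
        while ((count = reader.read(buffer)) != -1) {
            text.append(buffer, 0, count);
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Illustrative input: ASCII identifier, string literal with two accented letters.
        String original = "name = \"\u010D\u00E9\"";
        byte[] cp852Bytes = original.getBytes(Charset.forName("Cp852"));

        // Decoding with the matching charset recovers the original characters.
        String decoded = readAll(new ByteArrayInputStream(cp852Bytes), "Cp852");
        System.out.println(decoded.equals(original));

        // Decoding the same bytes as Cp1250 yields different characters,
        // which is the kind of mismatch described in the question.
        String misread = readAll(new ByteArrayInputStream(cp852Bytes), "Cp1250");
        System.out.println(misread.equals(original));
    }
}
```

The charset is named explicitly because the one-argument InputStreamReader constructor falls back to the platform default encoding. The name "Cp852" is the historical java.io alias; the canonical java.nio name is "IBM852". Both depend on the extended charsets being present in the Java runtime.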