Re: hi (filecharcode UTF-8 question)

  • Bram Moolenaar
    Message 1 of 2, Nov 28, 2000
      Margin Norbäck wrote:

      > I am new to this list. I just discovered the fabulous filecharcode
      > feature of vim 6.
      >
      > I would love to bugtest, and patch vim, especially the UTF-8 features.

      That is very welcome. I have implemented most of this, but I don't use it
      myself (other than a few simple tests). I'm currently working on recognizing
      the BOM (mostly for MS-Windows, which uses it for little-endian UCS-2 files).

      > I have a question as well :)
      >
      > I am into translating .po-files, and some of them are in iso-8859-1
      > format and others are in UTF-8 format.
      >
      > To know which format they are in, you have to look through them, but
      > setting the filecharcode option only affects reading and writing, and by
      > the time I know the charcode, the file has already been read.
      >
      > Is there a smooth way to fix this (apart from rereading the file?)
      >
      > Setting filecharcodes=utf-8,latin-1 works partly, but I want to use
      > utf-8 even if the .po-file is not correct UTF-8, if the charset header
      > of the file says it is UTF-8.

      Hmm, the automatic detection is based on the idea that if it's not correct
      UTF-8 then it's Latin-1.

      You need some way to tell Vim which 'filecharcode' to use for this file.
      Since you want to get this information from the file itself you have a
      chicken-egg problem: You need to read the file to see how to read the file.

      I think you can only solve this by forcing the file to be read with
      'filecharcodes' empty, thus reading it as UTF-8, and then use a BufReadPost
      autocommand to read the file again in latin-1 if it's needed (e.g., when you
      match a pattern in the first ten lines). Something like this (untested!):

      :set filecharcodes=
      :au BufReadPost *.po if getline(1) =~ "ISO_8859-1" |
      \ edit ++cc=latin-1 | endif
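
The same idea can be sketched for a current Vim. This is a rough, untested sketch and not part of the original message. It assumes that 'fileencodings' and the ++enc argument are the present-day counterparts of 'filecharcodes' and ++cc, and that the .po header declares its encoding as "charset=..." somewhere in the first ten lines. The names s:PoCharset, s:ReloadPo and po_charset are hypothetical.

```vim
" Assumed modern option name; the message above uses 'filecharcodes'.
" Try UTF-8 first and fall back to Latin-1 when the bytes are not valid UTF-8.
set fileencodings=utf-8,latin1

" Hypothetical helper: return the lowercased charset declared in the
" first ten lines of the buffer, or '' when no declaration is found.
function! s:PoCharset() abort
  for l:text in getline(1, 10)
    let l:m = matchlist(l:text, '\ccharset=\([A-Za-z0-9_.-]\+\)')
    if !empty(l:m)
      return tolower(l:m[1])
    endif
  endfor
  return ''
endfunction

" Hypothetical helper: re-read the file when the declared charset
" disagrees with the encoding that was detected automatically.
function! s:ReloadPo() abort
  let l:declared = s:PoCharset()
  if l:declared ==# 'iso-8859-1' && &fileencoding !=# 'latin1'
    edit ++enc=latin1
  elseif l:declared ==# 'utf-8' && &fileencoding !=# 'utf-8'
    " Covers the case from the question: the header says UTF-8 but the
    " automatic detection fell back to Latin-1.
    edit ++enc=utf-8
  endif
endfunction

augroup po_charset
  autocmd!
  autocmd BufReadPost *.po call s:ReloadPo()
augroup END
```

Autocommands do not nest by default, so the second read does not trigger the handler again. The file is still read twice whenever the header and the automatic detection disagree, which is the chicken-egg cost described above.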

      --
      LAUNCELOT: At last! A call! A cry of distress ...
      (he draws his sword, and turns to CONCORDE)
      Concorde! Brave, Concorde ... you shall not have died in vain!
      CONCORDE: I'm not quite dead, sir ...
      "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

      /// Bram Moolenaar Bram@... http://www.moolenaar.net \\\
      \\\ Vim: http://www.vim.org ICCF Holland: http://iccf-holland.org ///