
RE: Forcing vim 6.0 to stay in UTF-8 mode in a UTF-8 locale

  • Maiorana, Jason
    Message 1 of 3 , Aug 19, 2002
      Markus Kuhn wrote:
      >I live now on a planet where any other encoding than UTF-8 does not
      >exist

      me too :)


      >when I am in LC_CTYPE=en_GB.UTF-8. How do I tell vim 6.0 (and also
      >emacs) to pick the encoding *strictly* based on the locale and look at
      >absolutely nothing else? Falling back to ISO 8859-1 is not an option,
      >because ISO 8859-1 is completely unknown on my planet.

      in my ~/.vimrc i use:
      "
      set encoding=utf-8
      "

      I had no problem opening any of those files: vim (6.1) stayed in utf-8,
      so that at least should work around your vim problem. (I don't have a
      utf-8 locale handy, so I didn't try it the way you are getting it to
      fail.)


      Imo, using locales to specify encodings is thoroughly
      outdated/deprecated. All text strings should be in utf-8 all
      the time, and when they are not, they should be converted
      into utf-8 asap. (for date, currency, and collation,
      locales are fine).

      Auto detection of encodings is likely equally useless, BOMs
      should be ditched/ignored, and UTF-16 should be illegal :)
    • Bram Moolenaar
      Message 2 of 3 , Aug 20, 2002
        Glenn Maynard wrote:

        > Well, is this exact?
        >
        > My default fenc is "cp1252" (as I'm using the test setting I mentioned).
        >
        > If I load a UTF-8 file, fenc becomes UTF-8.
        >
        > But, if I then :new, the new window is created with fenc=cp1252, despite
        > fenc being UTF-8.
        >
        > Doing a :set fenc in each window then shows that it's different for
        > each, but :new always creates fenc=cp1252.
        >
        > This makes me conclude that there's a "global" fenc, which determines
        > the default encoding of new files, and a "local" fenc to each window,
        > marking the encoding of that file.
        >
        > That's fine, except it seems undocumented, and it's not clear how to
        > explicitly set the "global" fenc versus the current "local" one.

        It's all documented, but it's a bit complicated. This is required to
        make it work in all situations. You did read the docs for
        'fileencodings'? It explains what happens when opening a file. For a
        new file the explanation is at ":help local-options". Obviously 'fenc'
        is a buffer-local option. The info about local options isn't repeated
        for every option, you need to read the introduction.

        > > You probably want to set 'fileencodings' to "utf-8" or make it empty.
        > > Then Vim won't check for a BOM or fall back to using latin1. You still
        > > get CONVERSION ERRORs when editing a file with an illegal byte sequence,
        > > and that's a good hint for the user.
        >
        > It'll also set the file readonly, though, which probably isn't wanted
        > here.

        For a real conversion error this is appropriate. An error while reading
        a file always makes it marked as read-only to prevent you from
        accidentally overwriting the original file with an erroneous version.
        But in this specific case the file can be written unmodified, thus it
        doesn't need to be marked read-only. I'll see if this situation can be
        detected reliably.

        --
        Emacs is a nice OS - but it lacks a good text editor.
        That's why I am using Vim. --Anonymous

        /// Bram Moolenaar -- Bram@... -- http://www.moolenaar.net \\\
        /// Creator of Vim -- http://vim.sf.net -- ftp://ftp.vim.org/pub/vim \\\
        \\\ Project leader for A-A-P -- http://www.a-a-p.org ///
        \\\ Lord Of The Rings helps Uganda - http://iccf-holland.org/lotr.html ///
      • Tony Mechelynck
        Message 3 of 3 , Aug 20, 2002
          ----- Original Message -----
          From: "Bram Moolenaar" <Bram@...>
          To: "Glenn Maynard" <g_lutf8@...>
          Cc: <linux-utf8@...>; <vim-multibyte@...>
          Sent: Tuesday, August 20, 2002 11:16 AM
          Subject: Re: Forcing vim 6.0 to stay in UTF-8 mode in a UTF-8 locale


          >
          > Glenn Maynard wrote:
          >
          > > Well, is this exact?
          > >
          > > My default fenc is "cp1252" (as I'm using the test setting I mentioned).
          > >
          > > If I load a UTF-8 file, fenc becomes UTF-8.
          > >
          > > But, if I then :new, the new window is created with fenc=cp1252, despite
          > > fenc being UTF-8.
          > >
          > > Doing a :set fenc in each window then shows that it's different for
          > > each, but :new always creates fenc=cp1252.
          > >
          > > This makes me conclude that there's a "global" fenc, which determines
          > > the default encoding of new files, and a "local" fenc to each window,
          > > marking the encoding of that file.
          > >
          > > That's fine, except it seems undocumented, and it's not clear how to
          > > explicitly set the "global" fenc versus the current "local" one.
          >
          > It's all documented, but it's a bit complicated. This is required to
          > make it work in all situations. You did read the docs for
          > 'fileencodings'? It explains what happens when opening a file. For a
          > new file the explanation is at ":help local-options". Obviously 'fenc'
          > is a buffer-local option. The info about local options isn't repeated
          > for every option, you need to read the introduction.

          Hint: You might want to :setglobal fenc=utf-8
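
          A minimal sketch of that hint, assuming a Vim built with
          multi-byte support. The option names are Vim's own; the two
          query lines are only there to illustrate the global/local split
          and are not part of the original suggestion:

          ```vim
          " Set the global value of 'fileencoding'. New buffers (for
          " example after :new) take their 'fileencoding' from this value.
          setglobal fileencoding=utf-8

          " Show the global value and the current buffer's local value.
          " They can differ once a file has been read into the buffer.
          setglobal fileencoding?
          setlocal fileencoding?
          ```

          A plain ":set fenc=..." changes both the local and the global
          value, which is why ":setglobal" is the safer way to change
          only the default for new files.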
          >
          > > > You probably want to set 'fileencodings' to "utf-8" or make it empty.
          > > > Then Vim won't check for a BOM or fall back to using latin1. You still
          > > > get CONVERSION ERRORs when editing a file with an illegal byte sequence,
          > > > and that's a good hint for the user.

          Checking for a BOM is probably not harmful. The BOM value has been chosen so
          that the BOM for any Unicode encoding including UTF-8 is illegal at the
          start of a file in any other Unicode encoding. (IIRC it's U+FEFF, "zero-width
          no-break space". And the Unicode standard allows it on utf-8 files -- if
          only to tell that they are utf-8 and not utf-16 or utf-32.)
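
          If you want to keep the BOM check but still avoid the latin1
          fallback, something like the following might do. This is a
          sketch, not a tested recipe for every setup; it assumes the
          files are either BOM-marked Unicode or plain UTF-8:

          ```vim
          " Check for a BOM first, then try UTF-8. No latin1 entry is
          " listed, so there is no fallback to it.
          set fileencodings=ucs-bom,utf-8

          " After reading a file, show whether a BOM was detected.
          " 'bomb' is a buffer-local option.
          setlocal bomb?
          ```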
          > >
          > > It'll also set the file readonly, though, which probably isn't wanted
          > > here.
          >
          > For a real conversion error this is appropriate. An error while reading
          > a file always makes it marked as read-only to prevent you from
          > accidentally overwriting the original file with an erroneous version.
          > But in this specific case the file can be written unmodified, thus it
          > doesn't need to be marked read-only. I'll see if this situation can be
          > detected reliably.
          >
          > --
          > Emacs is a nice OS - but it lacks a good text editor.
          > That's why I am using Vim. --Anonymous
          >
          > /// Bram Moolenaar -- Bram@... -- http://www.moolenaar.net \\\
          > /// Creator of Vim -- http://vim.sf.net -- ftp://ftp.vim.org/pub/vim \\\
          > \\\ Project leader for A-A-P -- http://www.a-a-p.org ///
          > \\\ Lord Of The Rings helps Uganda - http://iccf-holland.org/lotr.html ///
          >

          Regards,
          Tony.