
1018 Re: Filename encodings under Win32

  • Bram Moolenaar
    Oct 15, 2003
      Glenn Maynard wrote:

      > On Tue, Oct 14, 2003 at 02:20:27PM +0200, Bram Moolenaar wrote:
      > > This is still complicated, but probably requires less changes than using
      > > Unicode functions for all file access. I only foresee trouble when
      > > 'encoding' is set to a non-Unicode codepage different from the active
      > > codepage and using a filename that contains non-ASCII characters.
      > > Perhaps this situation is too weird to take into account?
      > If "encoding" is not the ACP codepage, then the main problem is that the
      > user can enter characters that Vim simply can't put into a filename
      > (and in 9x, that the system can't, either).
      > I'd just do a conversion, and if the conversion fails, warn appropriately.

      It's more complicated than that. You can have filenames in the ACP,
      'encoding' and Unicode. Filenames are stored in various places inside
      Vim; which encoding is used for each of them? Obviously, a filename
      stored in buffer text and registers has to use 'encoding'.

      It's less obvious what to use for internal structures, such as
      curbuf->b_ffname. When 'encoding' is a Unicode encoding we can use
      UTF-8, which can be converted to anything else. That also works when
      the active codepage is not Unicode: we can use the wide functions then.

      When 'encoding' is the active codepage (this is the default, so it
      should be the common case), we can use the active codepage. That avoids
      conversions (which may fail). No need to use wide functions then.

      The real problem is when 'encoding' is not the active codepage and it's
      also not a Unicode encoding. We could simply skip the conversion then.
      That doesn't work properly for non-ASCII characters, but it's how it
      already works right now. The right way would be to convert the file
      name to Unicode and use the wide functions.

      I guess this means all filenames inside Vim are in 'encoding'. Where
      needed, conversion needs to be done from/to Unicode and the wide
      functions are to be used then.
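
The three cases above can be sketched as a small decision function. This is only an illustration under stated assumptions: `fname_strategy`, `enc_is_unicode()` and `choose_fname_strategy()` are hypothetical names, not part of Vim, and encoding names are assumed to be lowercase strings such as "utf-8" or "cp1252".

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical names for illustration only; none of this is Vim's API. */
typedef enum {
    FNAME_ACP_DIRECT,      /* 'encoding' is the active codepage: no conversion */
    FNAME_WIDE_FROM_UTF8,  /* 'encoding' is Unicode: keep UTF-8, use wide functions */
    FNAME_WIDE_VIA_CONV    /* anything else: convert to Unicode, use wide functions */
} fname_strategy;

/* Assumption: Unicode encodings are named with a "utf-" or "ucs-" prefix. */
static bool enc_is_unicode(const char *enc)
{
    return strncmp(enc, "utf-", 4) == 0 || strncmp(enc, "ucs-", 4) == 0;
}

/* Pick how a filename stored in 'encoding' should reach the file system. */
fname_strategy choose_fname_strategy(const char *encoding, const char *acp)
{
    if (enc_is_unicode(encoding))
        return FNAME_WIDE_FROM_UTF8;
    if (strcmp(encoding, acp) == 0)
        return FNAME_ACP_DIRECT;
    /* The conversion in this case may fail, so a caller would need to
     * report an error (or fall back to today's unconverted behaviour). */
    return FNAME_WIDE_VIA_CONV;
}
```

Only the last case needs a conversion that can fail, which is why it is kept separate from the other two.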

      The main thing to implement now is using the wide functions when
      'encoding' is UTF-8. This only requires a simple conversion between
      UTF-8 and UTF-16. I'll be waiting for a patch...
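
For reference, the conversion mentioned here is small enough to sketch in portable C. This is a minimal sketch, not the patch being asked for: `utf8_to_utf16()` is a hypothetical helper name, and on Win32 the same job could instead be done with `MultiByteToWideChar()` using `CP_UTF8`. The result would be what gets passed to a wide function such as `CreateFileW()`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a NUL-terminated UTF-8 string to UTF-16.
 * Returns the number of UTF-16 code units written (excluding the
 * terminating 0), or -1 on malformed input or if dst is too small.
 */
long utf8_to_utf16(const unsigned char *src, uint16_t *dst, size_t dst_len)
{
    /* Smallest code point allowed for each sequence length (rejects overlongs). */
    static const uint32_t min_cp[4] = { 0, 0x80, 0x800, 0x10000 };
    size_t n = 0;

    while (*src != 0) {
        unsigned char b = *src++;
        uint32_t cp;
        int extra;

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return -1;                       /* stray continuation or invalid lead byte */

        for (int i = 0; i < extra; i++) {
            if ((*src & 0xC0) != 0x80)        /* also catches a premature NUL */
                return -1;
            cp = (cp << 6) | (uint32_t)(*src++ & 0x3F);
        }

        if (cp < min_cp[extra] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;

        if (cp < 0x10000) {
            if (n + 1 >= dst_len) return -1;  /* keep room for the terminator */
            dst[n++] = (uint16_t)cp;
        } else {
            if (n + 2 >= dst_len) return -1;
            cp -= 0x10000;                    /* encode as a surrogate pair */
            dst[n++] = (uint16_t)(0xD800 | (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }

    if (dst_len == 0) return -1;
    dst[n] = 0;
    return (long)n;
}
```

Characters outside the Basic Multilingual Plane need two UTF-16 code units, so the output buffer cannot simply be sized by character count; one code unit per input byte plus one for the terminator is always enough.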

      hundred-and-one symptoms of being an internet addict:
      231. You sprinkle Carpet Fresh on the rugs and put your vacuum cleaner
      in the front doorway permanently so it always looks like you are
      actually attempting to do something about that mess that has amassed
      since you discovered the Internet.

      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      /// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\
      \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
      \\\ Help AIDS victims, buy here: http://ICCF-Holland.org/click1.html ///