Re: Filename encodings under Win32
Oct 15, 2003

Glenn Maynard wrote:
> On Tue, Oct 14, 2003 at 02:20:27PM +0200, Bram Moolenaar wrote:
> > This is still complicated, but probably requires less changes than using
> > Unicode functions for all file access. I only foresee trouble when
> > 'encoding' is set to a non-Unicode codepage different from the active
> > codepage and using a filename that contains non-ASCII characters.
> > Perhaps this situation is too weird to take into account?
> If "encoding" is not the ACP codepage, then the main problem is that the
> user can enter characters that Vim simply can't put into a filename
> (and in 9x, that the system can't, either).
> I'd just do a conversion, and if the conversion fails, warn appropriately.

It's more complicated than that. You can have filenames in the ACP,
'encoding' and Unicode. Filenames are stored in various places inside
Vim; which encoding is used for each of them? Obviously, a filename
stored in buffer text and registers has to use 'encoding'.

It's less obvious what to use for internal structures, such as
curbuf->b_ffname. When 'encoding' is a Unicode encoding we can use
UTF-8, which can be converted to anything else. That also works when the
active codepage is not Unicode: we can use the wide functions then.

When 'encoding' is the active codepage (this is the default, so it should
happen a lot), we can use the active codepage. That avoids conversions
(which may fail). No need to use wide functions then.

The real problem is when 'encoding' is not the active codepage and it's
also not a Unicode encoding. We could simply skip the conversion then.
That doesn't work properly for non-ASCII characters, but it's how it
already works right now. The right way would be to convert the file
name to Unicode and use the wide functions.

I guess this means all filenames inside Vim are in 'encoding'. Where
needed, the name is converted from/to Unicode and the wide functions
are used.
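
To make the three cases above concrete, here is a rough sketch in C of
the decision they describe. The enum, its values and the function name
are made up for illustration; nothing here is taken from the Vim source,
and the two flags are assumed to be computed elsewhere.

```c
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
typedef enum {
    FNAME_ANSI_DIRECT,         /* name is already in the active codepage:
                                  pass it to the ANSI functions as-is */
    FNAME_WIDE_FROM_UTF8,      /* 'encoding' is Unicode: convert UTF-8 to
                                  UTF-16 and call the wide functions */
    FNAME_WIDE_VIA_CONVERSION  /* neither: convert from 'encoding' to
                                  Unicode (may fail), then go wide */
} fname_api_T;

/*
 * Pick how a filename stored in 'encoding' reaches the file system.
 * enc_is_unicode: 'encoding' is a Unicode encoding such as UTF-8.
 * enc_is_acp:     'encoding' is the same as the active codepage.
 */
fname_api_T choose_fname_api(bool enc_is_unicode, bool enc_is_acp)
{
    if (enc_is_unicode)
        return FNAME_WIDE_FROM_UTF8;
    if (enc_is_acp)
        return FNAME_ANSI_DIRECT;       /* the default, no conversion */
    return FNAME_WIDE_VIA_CONVERSION;   /* the "right way" for the odd case */
}
```

The last branch is the one that currently skips the conversion; the
sketch shows the proposed behaviour, not what Vim does today.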

The main thing to implement now is using the wide functions when
'encoding' is UTF-8. This only requires a simple conversion between
UTF-8 and UTF-16. I'll be waiting for a patch...
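
For reference, a minimal sketch of that conversion in portable C, going
from UTF-8 to UTF-16 (the form the Win32 wide functions take). The
function name and the error convention are assumptions for this example,
not an existing Vim routine. On Win32 itself, MultiByteToWideChar() with
CP_UTF8 may be an alternative where the system supports it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert a NUL-terminated UTF-8 string to UTF-16 code units.
 * Returns the number of code units written (not counting the
 * terminating 0), or -1 on malformed input or when "out" is too small.
 * Hypothetical helper, for illustration only.
 */
long utf8_to_utf16(const unsigned char *in, uint16_t *out, size_t outlen)
{
    size_t n = 0;

    while (*in != 0) {
        uint32_t c;
        uint32_t min;   /* smallest code point valid for this length */
        int extra;      /* number of continuation bytes expected */

        if (*in < 0x80)                { c = *in;        extra = 0; min = 0; }
        else if ((*in & 0xE0) == 0xC0) { c = *in & 0x1F; extra = 1; min = 0x80; }
        else if ((*in & 0xF0) == 0xE0) { c = *in & 0x0F; extra = 2; min = 0x800; }
        else if ((*in & 0xF8) == 0xF0) { c = *in & 0x07; extra = 3; min = 0x10000; }
        else return -1;             /* stray continuation or invalid lead byte */
        ++in;

        while (extra-- > 0) {
            if ((*in & 0xC0) != 0x80)
                return -1;          /* truncated or malformed sequence */
            c = (c << 6) | (uint32_t)(*in++ & 0x3F);
        }
        if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
            return -1;              /* overlong, out of range, or surrogate */

        if (c < 0x10000) {
            if (n + 1 >= outlen)
                return -1;          /* no room for unit plus terminator */
            out[n++] = (uint16_t)c;
        } else {
            if (n + 2 >= outlen)
                return -1;          /* no room for pair plus terminator */
            c -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (c >> 10));    /* high surrogate */
            out[n++] = (uint16_t)(0xDC00 | (c & 0x3FF));  /* low surrogate */
        }
    }
    if (n >= outlen)
        return -1;
    out[n] = 0;
    return (long)n;
}
```

A caller would convert the name stored in 'encoding' (UTF-8 here), and on
a -1 result fall back or warn, as suggested in the quoted message,
instead of passing a damaged name to the wide function.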

-- 
hundred-and-one symptoms of being an internet addict:
231. You sprinkle Carpet Fresh on the rugs and put your vacuum cleaner
in the front doorway permanently so it always looks like you are
actually attempting to do something about that mess that has amassed
since you discovered the Internet.
/// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
/// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\
\\\ Project leader for A-A-P -- http://www.A-A-P.org ///
\\\ Help AIDS victims, buy here: http://ICCF-Holland.org/click1.html ///