1010 Re: Filename encodings under Win32
Oct 13, 2003
Camillo wrote:
> > Main problem is that sometimes we don't know what the encoding is.
> On Windows? I would disagree here. Any filesystem mounted by Windows
> should be mounted in a way that adheres to Windows naming conventions.
> We're not discussing file contents here.

A file name may appear in a file (e.g., a list of files in a README
file). And I don't know what happens with file names on removable media
(e.g., a CD). Probably depends on the file system it contains. And
networked file systems are another problem.
> > In that situation you can treat the filename as a sequence of bytes in most
> > places, but conversion is impossible. This happens more often than you
> > would expect. Put a floppy disk or CD into your computer...
> So why convert it? :) The current display/saving problems stem from the
> fact that the file name is interpreted as UTF-8, a coding which Windows
> does not recognize for file names or strings.

We need to locate places where the encoding is different from what a
system function expects. There are still a few things that need to be
> > There is also the situation that Vim uses the active codepage, but the
> > file is actually in another encoding that could not be detected. Then
> > doing "gf" on a filename will work if you don't do conversion, but it
> > will fail if you try converting with the wrong encoding in mind.
> AFAIK, Windows will internally convert the path into Unicode if you call
> the ANSI function. Thus if gf succeeds as you describe, it should succeed
> if you use the unicode api as well. In both cases a 8-bit binary string
> undergoes "cp2unicode" conversion.

If Vim defaults to the active codepage then conversion to Unicode would
do the same as using the ANSI function. Thus it's only a problem when
'encoding' is different from the active codepage. And when 'encoding'
is a Unicode variant we can use the "W" functions. Still, this means
all fopen() and stat() calls must be adjusted. When 'encoding' is not
the active codepage we could either leave the file name untranslated (as
it is now) or convert it to Unicode. Don't know which one would work
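
To make the mismatch concrete, here is a small Python sketch (not Vim code, and not the Win32 API). It only uses Python's codecs to mimic an 8-bit-to-Unicode conversion. The sample file names, the `to_unicode` helper and the choice of cp1252 as the active codepage are all assumptions made for illustration.

```python
# Assumption: cp1252 stands in for the system's active codepage.
ACTIVE_CODEPAGE = "cp1252"


def to_unicode(name_bytes: bytes, codepage: str) -> str:
    """Hypothetical helper: convert an 8-bit file name to Unicode."""
    return name_bytes.decode(codepage)


# A name whose bytes are really cp1251 (Cyrillic), but nobody told us.
cyrillic = "файл.txt".encode("cp1251")
intended = to_unicode(cyrillic, "cp1251")         # the name the user meant
misread = to_unicode(cyrillic, ACTIVE_CODEPAGE)   # a different Unicode name

# A name whose bytes are UTF-8, read once as UTF-8 and once as the
# active codepage. The second conversion succeeds but gives mojibake.
utf8_name = "café.txt".encode("utf-8")
as_utf8 = to_unicode(utf8_name, "utf-8")
as_active = to_unicode(utf8_name, ACTIVE_CODEPAGE)

# Leaving the bytes untranslated keeps them intact, whatever they mean.
untouched = bytes(cyrillic)

print(intended, misread, as_utf8, as_active, untouched == cyrillic)
```

The conversion with the wrong codepage does not raise an error here; it quietly produces a different name, which is why a lookup such as "gf" could miss a file that the untranslated bytes would have found.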
> > Your active codepage must be latin1 then. Vim gets the default from the
> > active codepage.
> My code page is cp1252. It's not latin1 (iso-8859-1). In practice, both
> are 8-bit-raw.

cp1252 and latin1 are not identical, but for practical use they can be
handled as the same encoding. Vim indeed uses this as the "raw" 8-bit
encoding that avoids messing up your characters when you don't know what
encoding it actually is.
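
As a rough check of where the two encodings differ, the following Python sketch compares them byte by byte. It relies on Python's own `cp1252` and `latin-1` codecs, which is an assumption about tooling and says nothing about how Vim or Windows implement the tables. The variable names are illustrative.

```python
differing = []   # bytes that decode to different characters
undefined = []   # bytes Python's cp1252 codec refuses to decode

for value in range(256):
    raw = bytes([value])
    latin1_char = raw.decode("latin-1")
    try:
        cp1252_char = raw.decode("cp1252")
    except UnicodeDecodeError:
        undefined.append(value)
        continue
    if cp1252_char != latin1_char:
        differing.append(value)

# latin-1 maps every byte value to a code point and back unchanged,
# which is what makes it usable as a "raw" 8-bit encoding.
all_bytes = bytes(range(256))
latin1_roundtrip = all_bytes.decode("latin-1").encode("latin-1") == all_bytes

print([hex(v) for v in differing])
print([hex(v) for v in undefined])
print(latin1_roundtrip)
```

In Python's tables the differences are confined to the range 0x80 to 0x9F; outside that range the two encodings agree, which fits the remark that they can be treated alike for practical use.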
hundred-and-one symptoms of being an internet addict:
194. Your business cards contain your e-mail and home page address.
/// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
/// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\
\\\ Project leader for A-A-P -- http://www.A-A-P.org ///
\\\ Help AIDS victims, buy here: http://ICCF-Holland.org/click1.html ///