Re: Filename encodings under Win32
Oct 13, 2003

Camillo wrote:
> > Vim should support UTF-8 in 9x, too.
> Of course, but with the necessary restrictions. Displaying unicode is a
> problem, as is entering filenames. Those functions are restricted to the
> ACP on Win9x.

On Windows NT/XP there are also restrictions, especially when using
non-NTFS filesystems. There was a discussion about this in the Linux
UTF-8 maillist a long time ago. There was no good universal solution
for handling filenames that they could come up with.
Vim could use Unicode functions for accessing files, but this will be a
huge change. Requires lots of testing. Main problem is when 'encoding'
is not a Unicode encoding, then conversions need to be done, which may
fail.
If you use filenames that cannot be represented in the active codepage,
you probably have problems with other programs. Thus sticking with the
active codepage functions isn't too bad. But then Vim needs to convert
from 'encoding' to the active codepage!
> It is a bugfix. Currently, when using UTF-8 on WinNT, vim is broken in (at
> least) the following regards:
> - Opening non-ascii filenames, regardless of codepage
>   å.txt internally becomes <e5>.txt
> - Saving filenames
>   å.txt is saved in UTF-8 format (Ã¥.txt) and displayed incorrectly in
>   title bar

The file names are handled as byte strings. Thus so long as you use the
right bytes it should work. Problem is when you are typing/editing with
a different encoding from the active codepage.
> - The default termencoding should be set intelligently, UTF-8 as
> termencoding breaks input of non-ascii.

Why would 'termencoding' be "utf-8"? This won't work, unless you are
using an xterm on MS-Windows. The default 'termencoding' is empty,
which means 'encoding' is used. There is no better default. When you
change 'encoding' you might have to change 'termencoding' as well, but
this depends on your situation.
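For illustration only (example values for a Western-European Windows setup, not Vim's defaults): a user whose console keyboard delivers cp1252 bytes, but who wants Vim to work in UTF-8 internally, would pair the two options like this:

```vim
" example settings, assuming a cp1252 console; not Vim's defaults
set encoding=utf-8        " Vim's internal encoding
set termencoding=cp1252   " what the keyboard/terminal actually sends
```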
> - The default fileencoding breaks when "going UTF-8", most probably a
> better behavior would be to default to the ACP always.

'fileencoding' is set when reading a file. Perhaps you mean
'fileencodings'? This one needs to be tweaked by the user, because it
depends on what kind of files you edit. Main problem is that an ASCII
file can be any encoding, Vim can't detect what it is, thus the user has
to specify what he wants Vim to do with it.
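As an example of such tweaking (the value is an illustration for a Western-European Windows setup, not a universal recommendation), the entries are tried left to right when a file is read:

```vim
" example: try a Unicode BOM first, then strict UTF-8, then the codepage
set fileencodings=ucs-bom,utf-8,cp1252
```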
> - Also, my vim (6.2) defaults to "latin1", not my current codepage. That
> would indicate that the ACP detection does not work.

Where does it use "latin1"? Not in 'encoding', I suppose.
> OK, the list above sounds like whining, but earlier I did suggest that the
> fixes are fairly straightforward.

Mostly it's quite a bit more complicated. Different users have different
situations; it is hard to think of solutions that work for most people.
> On WinNT, vim should use unicode apis, essentially benefitting
> automatically from NT native Unicode. This only involves one additional
> encoding/decoding step before calling the apis.

The problem is that conversions to/from Unicode only work when you know
the encoding of the text you are converting. The encoding isn't always
known. Vim sometimes uses "latin1", so that you at least get 8-bit
clean editing, even though the actual encoding is unknown.
> On Win9x, vim should use ANSI apis. The only thing missing is again the
> encoding/decoding, although it's trickier with the ANSI apis. There are
> many cases where a user would enter UTF-8 stuff that doesn't smoothly
> convert to the current CP. I think vim's current code should detect that

You can use a few Unicode functions on Win9x, we already do. I don't
see a reason to change this.
I'm in shape. Round IS a shape.
/// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
/// Creator of Vim - Vi IMproved -- http://www.Vim.org \\\
\\\ Project leader for A-A-P -- http://www.A-A-P.org ///
\\\ Help AIDS victims, buy here: http://ICCF-Holland.org/click1.html ///