994 Re: Filename encodings under Win32
- Oct 12, 2003
Glenn Maynard <glenn@...> wrote:
> On Sun, Oct 12, 2003 at 10:44:05PM +0200, Tony Mechelynck wrote:
> > As long as 'fileencoding', 'printencoding' and (most important)
> > 'termencoding' default (when empty) to whatever is the current
> > value of 'encoding', the latter must not (IMHO) be set to UTF-8 by
> > default.
> > (Let's spell it out) In my humble opinion, Vim should require as
> > little "tuning" as possible to handle the language interfaces the
> > same way as the operating system does, and this means that, when
> > the user sets nothing else in his startup and configuration files,
> > keyboard input, printer output and file creation should default to
> > whatever is set in the locale.
> This is a trivial fix, which I already proposed many months ago: the
> defaults in Windows should be the results of
> exe "set fileencodings=ucs-bom,utf-8,cp" . getacp() . ",latin1"
> exe "set fileencoding=cp" . getacp()
> and now adding:
> exe "set printencoding=cp" . getacp()
> Note that "getacp" is a function in a patch I sent which was lost or
> forgotten: it returns the ANSI codepage.
> (A slightly safer default would be to remove "utf-8" from the search,
> to prevent false matches.) I haven't found any problems with this;
> it's been my default for a long time and I actively edit UTF-8 and
> CP932 files.

Trivial or not, my opinion is that handling files and keypresses as per the
locale shouldn't be a "fix", it should be the (program) default. The "minor
fix" consists of making Unicode the (user's) default by means of a config
setting; but see below about that.
> > If the user wants to handle Unicode files, it is quite possible to
> > set gvim to do it, even in Win98 systems like mine; but this
> > requires, among other things, storing the previous value of
> > 'encoding' into 'termencoding' because the user cannot, by a mere
> > snap of the fingers, change his keyboard input from some national
> > encoding to Unicode.
> The input in a Windows window is well-defined; "termencoding" shouldn't
> even be needed in Windows. Depending on which messages are trapped,
> the input is always in the ANSI codepage or Unicode.

Sorry, but it is. AFAIK, leaving 'termencoding' empty when switching
'encoding' over from something else to Unicode produces dysfunctions in the
keyboard for all users whose actual keyboard encoding is other than 7-bit
ASCII -- roughly speaking, for all users with a keyboard for a language
other than English (even Dutchmen like Bram need, as a minimum, the
"lowercase e with diaeresis", which is over 128, and therefore receives a
different representation in UTF-8 and in other encodings -- the codepoint
number may be the same but it is not represented identically). That's why the
lines
    if &termencoding == ""
        let &termencoding = &encoding
    endif
have been put in my script set_utf8.vim (newly uploaded to vim.online),
before the actual switch of 'encoding' to utf-8. Thanks to this, any
accented keys (and my own keyboard has a lot of them) go on working
identically (i.e., transparently) after the switchover as they did before.
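
To make the point about the "lowercase e with diaeresis" concrete, here is a small sketch in Vim script. It assumes a Vim built with +multi_byte; iconv() is a standard Vim function, and the variable names are only for illustration. The code point is 235 (0xEB) in both Latin-1 and Unicode, but Latin-1 stores it as the single byte 0xEB while UTF-8 stores it as the two bytes 0xC3 0xAB.

```vim
" The byte 0xEB is the letter e-diaeresis in Latin-1.
let latin1_e = "\xeb"
" Convert that one character to UTF-8. Vim can convert between latin1
" and utf-8 itself, so this should not depend on the +iconv feature.
let utf8_e = iconv(latin1_e, "latin1", "utf-8")
" strlen() counts bytes, not characters, so the two lengths differ.
echo "latin1 bytes:" strlen(latin1_e) "utf-8 bytes:" strlen(utf8_e)
```

This is why a keyboard that goes on sending 0xEB needs 'termencoding' to keep naming the old encoding: read as UTF-8, a lone 0xEB is the lead byte of an incomplete three-byte sequence, not a character.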
Of course, making utf-8 the vim default for 'encoding' would break the above
code, with (AFAIK) no possibility of repair in mainline Vim (which hasn't
got the getacp() function -- and don't talk to me about a patch, I don't
want to use other than standard binaries; for one thing, I don't have a
compiler and I don't want to get one: messing about with nonstandard
compilations is definitely not my cup o'tea). It would break it, I mean,
unless the vim default for 'termencoding' would change from the empty string
(i.e. use whatever is the current global Vim 'encoding' at the time a key is
pressed) to the user's locale (as found in $LANG at startup). But let's keep
things simple, not break existing scripts, reduce Bram and other people's
workloads, and keep Vim's handling of encodings as it is (the only change
I'd like to see is to add a functioning 'printencoding' option to Windows
versions of gvim, even though they don't print through PostScript).
> However, if it's being used anyway for some reason, then the solution
> is the same:
> exe "set termencoding=cp" . getacp()
> The only reason I know of not to set "encoding" to "utf-8" is that Vim
> doesn't do proper conversions for Win32 calls.

Users who only edit files in a single 8-bit encoding don't need to bother
about Unicode. For others, it is a useful choice, but I maintain that it
should remain a choice, and, if the locale set in the operating system is
not a Unicode one, it should IMHO remain a conscious choice (or at least a
voluntary one, that need not stay conscious once it has been written into
> > used by (g)vim (namely, 'encoding', 'fileencoding', 'termencoding'
> > and 'printencoding', as well as a possible 8-bit encoding at the
> > end of 'fileencodings') should, as I believe they already do,
> > default directly or indirectly to whatever is set in the locale,
> > and that a possible switchover to Unicode should be left to the
> > voluntary and reasoned choice of the user.
> Switching "encoding" to "utf-8" should be transparent, once proper
> conversions for win32 calls are in place. Regular users don't care
> about what encoding their editor uses internally, any more than they
> care about what type of data structures they use.
> On the other hand, if utf-8 internally is fully supported, then utf-8
> can be the *only* internal encoding--which would make the rendering
> code much simpler and more robust. I remember finding lots of little
> errors in the renderer (eg. underlining glitches for double-width
> characters) that went away with utf-8, and I don't think Vim renders
> correctly at all if eg. "encoding" is set to "cp1252" and the ACP
> is CP932 (needs a double conversion).
> Glenn Maynard

UTF-8 is fully supported (well, almost fully: characterwise
bidirectionality, a Unicode property, isn't supported) internally by
multi-byte versions of gvim, but switching over "transparently" from
"locale-oriented" to "Unicode-oriented" working requires careful attention
to several options, foremost of which are 'termencoding' and
'fileencodings'. To help the ordinary Vim user make that switchover
"transparently" without (as we say in French) "getting his feet caught in
the carpet", I uploaded a few minutes ago a new script called set_utf8.vim:
go see it at http://vim.sourceforge.net/scripts/script.php?script_id=789 .
With it and a Unicode-enabled version of Vim (with no need for any special
patches), switching over from one's national locale to Unicode becomes a
one-liner (you may call it a "trivial fix"). The idea of that script is to
work as "transparently" as possible, e.g., to avoid messing up the existing
keyboard's or (if possible) printer's interpretation of accented characters.
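
For readers who want to see the order of operations, here is a rough sketch of such a switchover. It is NOT the contents of set_utf8.vim, which may well differ; the variable name s:old_enc and the choice of fallback order are my assumptions for illustration. It is meant to be sourced from a startup file before any file is loaded, since changing 'encoding' does not convert text already in buffers.

```vim
" Hypothetical sketch only; the real set_utf8.vim may differ.
if has("multi_byte") && &encoding !=? "utf-8"
    " Keep interpreting the keyboard in the encoding used so far.
    if &termencoding == ""
        let &termencoding = &encoding
    endif
    " Remember the locale's encoding before 'encoding' is changed.
    let s:old_enc = &encoding
    set encoding=utf-8
    " New files are still created in the old encoding unless the user
    " sets 'fileencoding' to something else.
    exe "setglobal fileencoding=" . s:old_enc
    " Detection order for existing files: BOM first, then UTF-8, and
    " the old encoding last.
    exe "set fileencodings=ucs-bom,utf-8," . s:old_enc
endif
```

The old encoding goes last in 'fileencodings' because an 8-bit encoding never fails to match, so anything listed after it would not be tried.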