Re: mbyte.c patch
On Thu, Jul 18, 2002 at 10:02:00PM +0200, Bram Moolenaar wrote:
> Ehm, this removes HanExtTextOut() completely. I'm not sure if that is a
> good idea, it wasn't added for fun. But it should certainly be called in
> a different way, and ImeGetTempComposition() can be deleted. The check
> for the sysfixed size should be done before calling HanExtTextOut(), and
> the call to ExtTextOut() inside HanExtTextOut() removed. This is a lot
> safer than deleting that code without knowing why it was there.
Well, there are two parts to it:
One's the fake-backslash code. I don't think that's worth keeping as
is, since it's very special-case, and like you said, most Japanese users
don't mind (expect, actually) the Yen sign. (I do have a replacement,
since *I* mind the Yen sign, but it's not urgent.)
The important one is the Korean IME code. I'm not sure if that's needed at
all. I'm *guessing* that originally, the partially-composed characters
weren't being displayed, and this was hacked in. However, it's wrong; the
right way is to tell the IME where the cursor is so it renders it. This is
being done now, due to the IME patch, so I believe this is completely obsolete.
Of course, that's a guess. Could we contact whoever wrote this patch
and ask them if the current Vim release is working for him in the IME?
(Searching the archive for Sung-Hoon Baek didn't help.)
> I suppose the flag should be "non_system_codepage", which is set when
> using an 'encoding' different from the active codepage. I suppose it's
> not useful for Unicode values of 'encoding' though.
I'd say "not_system_codepage". (Unicode is not a system codepage; it's not
> > I think these problems are ultimately because there are too many code
> > paths. I'd suggest always converting the string to Unicode at the start
> > of the function. (At least in NT, it shouldn't be a speed hit, since I
> That's quite a drastic change. I'm not really sure it's worth taking
> the risk.
It could be done piecemeal over a number of releases. However, if
encoding can default to UTF-8, then all that's really important is
getting UTF-8 rendering right in all cases, and that's something that
should be done anyway.
> The problem with using "utf-8" as a default is that a new file will get
> this encoding, which is probably not what people expect. I think a new
> file should probably use the active codepage as a default, unless the
> user has selected something else.
New files get the "fileencoding" encoding, right? Make fileencoding do
this, too, then.
> Also don't forget that what we call "latin1" is actually any 8-bit
> encoding. It doesn't have to be the right name, it only matters when
> doing conversions. Quite often we don't actually know what encoding is
> being used and fall back to using latin1. This applies more to Unix
> than MS-Windows though, since MS-Windows supplies a function to obtain
> the active codepage.
However, conversions are important. It's pretty confusing if Vim is
appearing to load and display a file correctly, but sends junk to the
clipboard because the conversion was done incorrectly.
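To make the clipboard case concrete, here is a minimal sketch in portable C, not Vim code. The function names (utf8_encode, latin1_to_ucs, cp1252_to_ucs) are made up for illustration, and the cp1252 table is reduced to the Euro sign only. It shows how the same byte turns into different UTF-8 depending on which 8-bit name the conversion assumes:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Encode one code point below U+10000 as UTF-8.
 * Returns the number of bytes written to out (1 to 3). */
size_t utf8_encode(unsigned cp, unsigned char *out)
{
    assert(cp < 0x10000);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* latin1: every byte maps to the code point with the same value. */
unsigned latin1_to_ucs(unsigned char b)
{
    return b;
}

/* cp1252 differs from latin1 only in 0x80..0x9F.  Simplification:
 * only the Euro sign (0x80 -> U+20AC) is mapped here; every other
 * byte falls through to the latin1 mapping. */
unsigned cp1252_to_ucs(unsigned char b)
{
    return b == 0x80 ? 0x20AC : b;
}
```

With the file really being cp1252, byte 0x80 displays as a Euro sign in an 8-bit 'encoding' either way, but converting it as "latin1" yields C2 80 (a control character) where E2 82 AC was meant.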
> Same issue. Instead of "latin1" the active codepage could be used, if
> it is an 8-bit encoding. Otherwise it must be "latin1", because we
> always need to fall back to an 8-bit encoding.
What about fencs="ucs-bom,utf-8,CP####,latin1"?
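Roughly, a list like "ucs-bom,utf-8,CP####,latin1" amounts to the fallback chain sketched below. This is an illustration in portable C, not how Vim implements it: detect_encoding, has_utf8_bom and is_valid_utf8 are hypothetical names, only the UTF-8 BOM is checked (no UTF-16/32), and the UTF-8 check is simplified (it does not reject every overlong form or surrogate).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* True if the buffer starts with the UTF-8 byte order mark EF BB BF. */
int has_utf8_bom(const unsigned char *s, size_t n)
{
    return n >= 3 && s[0] == 0xEF && s[1] == 0xBB && s[2] == 0xBF;
}

/* Simplified structural UTF-8 check: valid lead bytes followed by the
 * right number of continuation bytes. */
int is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;
    while (i < n) {
        unsigned char c = s[i];
        size_t extra, k;
        if (c < 0x80)                     extra = 0;
        else if (c >= 0xC2 && c <= 0xDF)  extra = 1;
        else if (c >= 0xE0 && c <= 0xEF)  extra = 2;
        else if (c >= 0xF0 && c <= 0xF4)  extra = 3;
        else return 0;                    /* stray continuation or bad lead */
        if (extra > n - i - 1)
            return 0;                     /* sequence truncated at the end */
        for (k = 1; k <= extra; k++)
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;
        i += extra + 1;
    }
    return 1;
}

/* Try the candidates in order: BOM, then UTF-8, then the 8-bit
 * fallback name supplied by the caller (e.g. the active codepage). */
const char *detect_encoding(const unsigned char *s, size_t n,
                            const char *fallback)
{
    if (has_utf8_bom(s, n))
        return "utf-8";
    if (is_valid_utf8(s, n))
        return "utf-8";
    return fallback;                      /* an 8-bit encoding never fails */
}
```

Since an 8-bit encoding accepts any byte sequence, whatever follows the first 8-bit entry in such a list is never reached, so the trailing "latin1" would only matter when CP#### is not an 8-bit codepage.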