668 Re: mbyte.c patch

  • Glenn Maynard
    Jul 18, 2002
      On Thu, Jul 18, 2002 at 10:02:00PM +0200, Bram Moolenaar wrote:
      > Ehm, this removes HanExtTextOut() completely. I'm not sure if that is a
      > good idea, it wasn't added for fun. But it should certainly be called in
      > a different way, and ImeGetTempComposition() can be deleted. The check
      > for the sysfixed size should be done before calling HanExtTextOut(), and
      > the call to ExtTextOut() inside HanExtTextOut() removed. This is a lot
      > safer than deleting that code without knowing why it was there.

      Well, there are two parts to it:

      One's the fake-backslash code. I don't think that's worth keeping as
      is, since it's very special-case, and like you said, most Japanese users
      don't mind (expect, actually) the Yen sign. (I do have a replacement,
      since *I* mind the Yen sign, but it's not urgent.)

      The important one is the Korean IME code. I'm not sure if that's needed at
      all. I'm *guessing* that originally, the partially-composed characters
      weren't being displayed, and this was hacked in. However, it's wrong; the
      right way is to tell the IME where the cursor is so that the IME renders
      the composition itself. The IME patch now does this, so I believe this
      code is completely obsolete.

      Of course, that's a guess. Could we contact whoever wrote this patch
      and ask whether the current Vim release works for them with the IME?
      (Searching the archive for Sung-Hoon Baek didn't help.)

      > I suppose the flag should be "non_system_codepage", which is set when
      > using an 'encoding' different from the active codepage. I suppose it's
      > not useful for Unicode values of 'encoding' though.

      I'd say "not_system_codepage". (Unicode is not a system codepage; it's not
      a non-system-codepage.)
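
As a rough sketch of what such a flag could test (not the actual Vim code): the helper below is hypothetical, takes the 'encoding' name as a lowercase string, and takes the active codepage number as a parameter standing in for whatever the system reports (on MS-Windows, presumably the value of GetACP()). Anything that is not a "cp<number>" name, Unicode encodings included, counts as "not the system codepage".

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only.
 * Returns true when "enc" does not name the active codepage.
 * "active_cp" is an assumed input: the numeric system codepage.
 * Assumes "enc" is already lowercase, e.g. "cp1252" or "utf-8".
 */
static bool not_system_codepage(const char *enc, unsigned long active_cp)
{
    char *end;
    unsigned long cp;

    /* Only names of the form "cp<digits>" can match a codepage. */
    if (strncmp(enc, "cp", 2) != 0 || enc[2] == '\0')
        return true;

    cp = strtoul(enc + 2, &end, 10);
    if (*end != '\0')
        return true;            /* trailing junk, e.g. "cpabc" */

    return cp != active_cp;
}
```

With this shape, "utf-8" is reported as not being the system codepage without implying it is some other codepage, which is the distinction the name is meant to carry.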

      > > I think these problems are ultimately because there are too many code
      > > paths. I'd suggest always converting the string to Unicode at the start
      > > of the function. (At least in NT, it shouldn't be a speed hit, since I
      > That's quite a drastic change. I'm not really sure it's worth taking
      > the risk.

      It could be done piecemeal over a number of releases. However, if
      encoding can default to UTF-8, then all that's really important is
      getting UTF-8 rendering right in all cases, and that's something that
      should be done anyway.

      > The problem with using "utf-8" as a default is that a new file will get
      > this encoding, which is probably not what people expect. I think a new
      > file should probably use the active codepage as a default, unless the
      > user has selected something else.

      New files get the "fileencoding" encoding, right? Make fileencoding do
      this, too, then.

      > Also don't forget that what we call "latin1" is actually any 8-bit
      > encoding. It doesn't have to be the right name, it only matters when
      > doing conversions. Quite often we don't actually know what encoding is
      > being used and fall back to using latin1. This applies more to Unix
      > than MS-Windows though, since MS-Windows supplies a function to obtain
      > the active codepage.

      However, conversions are important. It's pretty confusing if Vim is
      appearing to load and display a file correctly, but sends junk to the
      clipboard because the conversion was done incorrectly.

      > Same issue. Instead of "latin1" the active codepage could be used, if
      > it is an 8-bit encoding. Otherwise it must be "latin1", because we
      > always need to fall back to an 8-bit encoding.

      What about fencs="ucs-bom,utf-8,CP####,latin1"?
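
To make the fallback order concrete, here is a minimal sketch of how a list like that could be walked: check for a BOM, then accept UTF-8 only if the bytes are well-formed, then fall back to an 8-bit encoding. This is not Vim's implementation. `guess_fenc`, `is_valid_utf8`, and the `system_cp` / `system_cp_is_8bit` parameters are hypothetical; only UTF-8 and UTF-16 BOMs are checked; and a non-8-bit codepage is simply skipped in favour of latin1 rather than validated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if buf[0..len) is well-formed UTF-8: rejects truncated sequences,
 * overlong forms, surrogates, and values above U+10FFFF. */
static bool is_valid_utf8(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        unsigned long cp;
        size_t n;                       /* continuation bytes expected */

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; }
        else return false;              /* stray continuation or bad lead */

        if (len - i <= n)
            return false;               /* sequence runs past the end */
        for (size_t k = 1; k <= n; k++) {
            if ((buf[i + k] & 0xC0) != 0x80)
                return false;
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }
        if ((n == 1 && cp < 0x80) || (n == 2 && cp < 0x800)
                || (n == 3 && cp < 0x10000))
            return false;               /* overlong encoding */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += n + 1;
    }
    return true;
}

/* Hypothetical detector following the order "ucs-bom,utf-8,CP####,latin1".
 * The returned names are illustrative labels. */
static const char *guess_fenc(const unsigned char *buf, size_t len,
                              const char *system_cp, bool system_cp_is_8bit)
{
    /* ucs-bom: look for a byte order mark first. */
    if (len >= 3 && memcmp(buf, "\xEF\xBB\xBF", 3) == 0) return "utf-8";
    if (len >= 2 && memcmp(buf, "\xFF\xFE", 2) == 0) return "utf-16le";
    if (len >= 2 && memcmp(buf, "\xFE\xFF", 2) == 0) return "utf-16be";

    /* utf-8: accept only if every sequence is well-formed. */
    if (is_valid_utf8(buf, len))
        return "utf-8";

    /* CP####, then latin1: an 8-bit encoding accepts any byte string. */
    return system_cp_is_8bit ? system_cp : "latin1";
}
```

Note that when the active codepage is 8-bit, the trailing latin1 entry is never reached in this sketch, since the codepage already accepts everything.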

      Glenn Maynard