667 Re: mbyte.c patch

  • Bram Moolenaar
    Jul 18, 2002
      Glenn Maynard wrote:

      > On Wed, Jul 17, 2002 at 12:54:07PM +0200, Bram Moolenaar wrote:
      > > This indeed looks very broken. ImeGetTempComposition() will always
      > > return NULL. It already was like this in 6.0. I didn't hear specific
      > > complaints about this. I suppose this means we might as well remove
      > > this code.
      > Alright. Patch attached:
      > gui_w32.c:
      > Remove HanExtTextOut, bInComposition, ImeGetTempComposition,
      > DisplayCompStringOpaque.
      > Removed a now-unneeded pair of braces from gui_mch_draw_string. I
      > didn't unindent the block of code between it, since that'd bloat the
      > patch with stuff that looks like changes but isn't. (I wish patch
      > could handle this better.) I'll let you do that.

      Ehm, this removes HanExtTextOut() completely. I'm not sure if that is a
      good idea, it wasn't added for fun. But it should certainly be called in
      a different way, and ImeGetTempComposition() can be deleted. The check
      for the sysfixed size should be done before calling HanExtTextOut(), and
      the call to ExtTextOut() inside HanExtTextOut() removed. This is a lot
      safer than deleting that code without knowing why it was there.

      > Remaining problems in the renderer:
      > SBCS encodings should render with "is_funky_dbcs"; right now we'll get
      > garbage unless the encoding matches the system.

      I suppose the flag should be "non_system_codepage", which is set when
      using an 'encoding' different from the active codepage. I suppose it's
      not useful for Unicode values of 'encoding' though.
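
A minimal sketch of how such a flag might be computed, not the actual Vim code. The helper names (enc_is_unicode, enc_to_codepage) are hypothetical, the active codepage is passed in as a plain integer (on MS-Windows it would presumably come from GetACP()), and the encoding name is assumed to be already normalized to forms like "cp1252" or "utf-8".

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true for Unicode values of 'encoding'.
 * Assumes normalized names such as "utf-8" or "ucs-2le". */
static bool enc_is_unicode(const char *enc)
{
    return strncmp(enc, "utf-", 4) == 0 || strncmp(enc, "ucs-", 4) == 0;
}

/* Hypothetical helper: "cp1252" -> 1252; returns 0 for any other name. */
static int enc_to_codepage(const char *enc)
{
    char *end;
    long cp;

    if (strncmp(enc, "cp", 2) != 0 || !isdigit((unsigned char)enc[2]))
        return 0;
    cp = strtol(enc + 2, &end, 10);
    return (*end == '\0' && cp > 0 && cp <= 65535) ? (int)cp : 0;
}

/* The flag discussed above: set when 'encoding' names something other
 * than the active codepage. It is left unset for Unicode encodings,
 * where it is not meaningful. */
static bool non_system_codepage(const char *enc, int active_cp)
{
    if (enc_is_unicode(enc))
        return false;
    return enc_to_codepage(enc) != active_cp;
}
```

With an active codepage of 1252, this sketch sets the flag for "cp932" and leaves it unset for "cp1252" and "utf-8". A name such as "latin1" also sets the flag, because the sketch only recognizes "cp" names. Whether that is the desired behavior is an open question.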

      I can't review all the other remarks in detail. I hope the people who
      worked on this code can respond. There is a lot of trial-and-error stuff
      in here.

      > I think these problems are ultimately because there are too many code
      > paths. I'd suggest always converting the string to Unicode at the start
      > of the function. (At least in NT, it shouldn't be a speed hit, since I
      > believe the font API will convert to Unicode anyway.) The RL (RevOut)
      > special case needs to be done in Unicode before this will work. (Which
      > I have code for, but am holding off on in the interests of keeping the
      > patch down.)

      That's quite a drastic change. I'm not really sure it's worth taking
      the risk.

      > > Perhaps it goes wrong when the codepage is set wrong? This is
      > > especially for 8-bit codepages where you probably don't notice the
      > > mistake if you do use the right font. Conversion to Unicode will reveal
      > > the problem.
      > It'd cause problems with the clipboard, though.

      Unless the program on the other side has the same (reverse) problem.

      > Hmm. I think the default encodings could be improved a bit. For
      > example, if a user tries to load a file, and it fails, they're likely
      > to first change "encoding". Of course, changing that alone isn't
      > correct, but it may happen to work. Or, they may change just
      > "fileencodings", which is closer, but that probably won't work, since
      > you can't convert most other encodings to latin1. You have to change two
      > values, and it takes a bit of reading to figure out exactly what to do.
      > (Enough that a lot of users are likely to get it wrong.)

      Changing 'encoding' should be discouraged, because it invalidates text
      in other buffers, registers, etc. Best is to use something like:

      :edit ++enc=cp123 file

      Setting 'fileencodings' to a good value would also work. This mostly
      requires a Unicode value for 'encoding'.

      > First, I'd change the default "encodings" to UTF-8 in Windows. "latin1"
      > is only a reasonable default if the system encoding happens to be one
      > that's like latin1; UTF-8 is almost always a better default (especially
      > since most users should be able to leave it alone.)

      The problem with using "utf-8" as a default is that a new file will get
      this encoding, which is probably not what people expect. I think a new
      file should probably use the active codepage as a default, unless the
      user has selected something else.

      Also don't forget that what we call "latin1" is actually any 8-bit
      encoding. It doesn't have to be the right name, it only matters when
      doing conversions. Quite often we don't actually know what encoding is
      being used and fall back to using latin1. This applies more to Unix
      than MS-Windows though, since MS-Windows supplies a function to obtain
      the active codepage.

      > Second, I'd change the default Unicode fencs from "ucs-bom,utf-8,latin1"
      > to "ucs-bom,utf-8,CP####" in Windows. Again, latin1 is only a
      > reasonable default for latin1 users; setting it intelligently should
      > work for a lot more people without changes. (I think this should let
      > most people edit locally-encoded files without having to touch encoding-
      > related settings at all.)

      Same issue. Instead of "latin1" the active codepage could be used, if
      it is an 8-bit encoding. Otherwise it must be "latin1", because we
      always need to fall back to an 8-bit encoding.
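
The fallback rule above could be sketched roughly as follows. This is an illustration under stated assumptions, not the Vim implementation: default_fencs is a hypothetical name, and the caller is assumed to already know whether the active codepage is an 8-bit one (on MS-Windows this could plausibly be derived from GetCPInfo(), checking that MaxCharSize is 1).

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: build a default 'fileencodings' value for a
 * Unicode 'encoding'. The last entry must be an 8-bit encoding, so the
 * active codepage is used only when it is 8-bit; otherwise the list
 * falls back to "latin1". Returns the snprintf() result. */
static int default_fencs(int active_cp, bool cp_is_8bit,
                         char *buf, size_t buflen)
{
    if (cp_is_8bit && active_cp > 0)
        return snprintf(buf, buflen, "ucs-bom,utf-8,cp%d", active_cp);
    return snprintf(buf, buflen, "ucs-bom,utf-8,latin1");
}
```

For an 8-bit codepage such as 1252 this gives "ucs-bom,utf-8,cp1252". For a double-byte codepage such as 932 it keeps "ucs-bom,utf-8,latin1".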

      > > > I'll work on making fileio use MBtoWC for codepage<->Unicode conversion
      > > > when possible. Even if you leave encodings as is, this will help.
      > Attached.

      I'll look into this later.

      You have heard the saying that if you put a thousand monkeys in a room with a
      thousand typewriters and waited long enough, eventually you would have a room
      full of dead monkeys.
      (Scott Adams - The Dilbert principle)

      /// Bram Moolenaar -- Bram@... -- http://www.moolenaar.net \\\
      /// Creator of Vim -- http://vim.sf.net -- ftp://ftp.vim.org/pub/vim \\\
      \\\ Project leader for A-A-P -- http://www.a-a-p.org ///
      \\\ Help me helping AIDS orphans in Uganda - http://iccf-holland.org ///