
Re: 6.3a on Win32: better utf-8 support, but...

  • Bram Moolenaar
    Message 1 of 7 , May 7, 2004
      Camillo Särs wrote:

      > > Those are latin1 characters, in case someone was wondering.
      >
      > Well, yes, part of the latin-1 repertoire, although I used utf-8 while
      > composing. Your mailer apparently couldn't cope with that correctly.
      > At least it sent the utf-8 characters as latin-1 bytes. Just to make
      > sure, here they are again, in iso-8859-1 this time: "åäö.txt".

      I know they were in utf-8; I just meant to say the characters are
      included in the latin1 charset, thus you don't get the problems related
      to non-latin1 characters.

      > > non-latin1 characters still have the problem that they appear as
      > > question marks in the title for me. Don't know how to solve that.
      >
      > There are a few possible reasons:
      > - You are using some code page dependent function, i.e. characters
      > within the code page display OK, but the rest are invalid and get '?'.

      The SetWindowTextW() function is used, that should work OK.

      > - You are using the correct function, feeding it Unicode characters, but the
      > font used to display the title bar does not contain glyphs for those
      > characters. In this case, however, I think the Unicode glyph for "not
      > available" should be shown.
      > Displaying Unicode characters is tricky, because most Windows fonts only
      > contain a certain subset.

      I tried changing the font, but that didn't solve the problem. I used
      the same font that displays the characters OK inside Vim.

      > >>If I list a directory using ":edit [path]", the listing displays
      > >>"<e5><e4><f6>.txt"
      > >
      > > With listing, do you mean using CTRL-D?
      >
      > I meant listing as in "the directory listing displayed by vim in a window
      > when I say :edit [directory name]". The file names apparently are in the
      > local code page, which means that the directory listing is retrieved using
      > ANSI functions. To display correctly when the encoding is utf-8, you should
      > get the listing using Wide functions. This would give you utf-16, which is
      > easy to convert to utf-8. :)

      The explorer plugin uses the glob() function, which in turn uses the
      same functions used for completion. Thus it's still the same problem.
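The UTF-16 to UTF-8 step mentioned in the quoted text can be sketched in portable C. This is only an illustration, not Vim's conversion code: the function name utf16_to_utf8 and its buffer convention are assumptions, and unpaired surrogates are replaced with U+FFFD as one possible policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert in_len UTF-16 code units (native byte order) to a NUL-terminated
 * UTF-8 string. Returns the number of bytes written, not counting the NUL,
 * or (size_t)-1 if the output buffer is too small. */
size_t utf16_to_utf8(const uint16_t *in, size_t in_len, char *out, size_t out_cap)
{
    assert(in != NULL && out != NULL);
    size_t o = 0;

    for (size_t i = 0; i < in_len; i++) {
        uint32_t cp = in[i];

        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in_len
                && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            /* Valid surrogate pair: combine into one code point. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00u);
            i++;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;  /* unpaired surrogate: substitute U+FFFD */
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + need >= out_cap)
            return (size_t)-1;  /* always keep room for the NUL */

        if (need == 1) {
            out[o++] = (char)cp;
        } else if (need == 2) {
            out[o++] = (char)(0xC0 | (cp >> 6));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[o++] = (char)(0xE0 | (cp >> 12));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (char)(0xF0 | (cp >> 18));
            out[o++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (char)(0x80 | (cp & 0x3F));
        }
    }

    if (o >= out_cap)
        return (size_t)-1;
    out[o] = '\0';
    return o;
}
```

A file name returned by a Wide directory-listing call could be passed through a routine like this before being handed to code that expects utf-8.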

      > > The completion apparently is not multi-byte aware. That editing still
      > > works is because illegal bytes are accepted.
      >
      > Which from my perspective is a very good design - the failure mode still
      > makes editing possible.

      Some people argue this is a security risk, but I have never understood
      why.
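The "<e5><e4><f6>.txt" display discussed above follows from local-code-page bytes being read as utf-8: none of those three bytes starts a well-formed sequence, so each is shown in hex while editing carries on. A minimal sketch of that behaviour, assuming the hypothetical helpers utf8_seq_len and show_illegal_bytes (this is not Vim's own code, and it skips the overlong and surrogate checks a full validator would do):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Length of the UTF-8 sequence starting at s, or 0 if the bytes there are
 * illegal or truncated. Overlong forms and surrogates are not rejected. */
size_t utf8_seq_len(const unsigned char *s, size_t avail)
{
    size_t len;

    if (avail == 0)
        return 0;
    if (s[0] < 0x80)
        return 1;
    else if (s[0] >= 0xC2 && s[0] <= 0xDF)
        len = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF)
        len = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4)
        len = 4;
    else
        return 0;  /* stray continuation byte or invalid lead byte */

    if (len > avail)
        return 0;
    for (size_t i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)
            return 0;  /* expected a continuation byte */
    return len;
}

/* Copy valid UTF-8 through unchanged and render each illegal byte as <xx>.
 * Returns 0 on success, -1 if out is too small. */
int show_illegal_bytes(const unsigned char *in, size_t n, char *out, size_t cap)
{
    assert(in != NULL && out != NULL);
    size_t o = 0, i = 0;

    while (i < n) {
        size_t len = utf8_seq_len(in + i, n - i);
        if (len == 0) {
            if (cap - o < 5)  /* "<xx>" plus the NUL */
                return -1;
            snprintf(out + o, cap - o, "<%02x>", (unsigned)in[i]);
            o += 4;
            i += 1;
        } else {
            if (cap - o < len + 1)
                return -1;
            memcpy(out + o, in + i, len);
            o += len;
            i += len;
        }
    }
    if (o >= cap)
        return -1;
    out[o] = '\0';
    return 0;
}
```

Feeding it the latin1 bytes of "åäö.txt" reproduces the listing from the report, while the utf-8 encoding of the same name passes through untouched.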

      --
      hundred-and-one symptoms of being an internet addict:
      161. You get up before the sun rises to check your e-mail, and you
      find yourself in the very same chair long after the sun has set.

      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
      \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
      \\\ Buy at Amazon and help AIDS victims -- http://ICCF.nl/click1.html ///
    • Glenn Maynard
      Message 2 of 7 , May 7, 2004
        On Thu, May 06, 2004 at 06:26:01PM +0200, Bram Moolenaar wrote:
        > > Apparently some, but not all, code paths that should use "Wide"
        > > versions of the Win32 API have now been converted from ANSI to Wide.
        > > There is an interesting discrepancy left, however. In the following
        > > I'm using "åäö.txt" as a filename.

        If you want to test wide characters, you need to use characters that
        aren't included in your ANSI codepage, e.g. "日本語".
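A small sketch of why the choice of test string matters. It assumes a latin1-style code page and a hypothetical helper to_latin1_lossy; the real Windows ANSI conversion is done by the system and the Western code page differs from latin1 in a few positions, so treat this as an illustration only. Characters inside the code page survive the round trip, so an ANSI code path looks correct, while anything outside collapses to '?'.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map Unicode code points to latin1 bytes, writing '?' for anything the
 * code page cannot represent. out must have room for n bytes.
 * Returns the number of characters that were replaced. */
size_t to_latin1_lossy(const uint32_t *cps, size_t n, unsigned char *out)
{
    assert(cps != NULL && out != NULL);
    size_t replaced = 0;

    for (size_t i = 0; i < n; i++) {
        if (cps[i] <= 0xFF) {
            out[i] = (unsigned char)cps[i];  /* representable: no loss */
        } else {
            out[i] = '?';                    /* outside the code page */
            replaced++;
        }
    }
    return replaced;
}
```

With "åäö" nothing is replaced, so the test cannot tell an ANSI path from a Wide one; with "日本語" every character is replaced.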

        > Those are latin1 characters, in case someone was wondering. non-latin1
        > characters still have the problem that they appear as question marks in
        > the title for me. Don't know how to solve that.

        You have to create a wide window class.

        Instead of setting up WNDCLASS and calling RegisterClass, first set up
        a WNDCLASSW and call RegisterClassW.

        If it fails with ERROR_CALL_NOT_IMPLEMENTED, set up the regular WNDCLASS
        as usual (for Win9x).

        Finally, at the bottom of your WndProc, call DefWindowProcW instead of
        DefWindowProcA if RegisterClassW was used.

        This makes the window capable of displaying Unicode text in the titlebar;
        otherwise, even if you pass Unicode data in, the low-level internal stuff
        that actually draws the text will just print "?".

        That should be enough to make SetWindowTextW work. (Of course, a
        fallback on SetWindowTextA on ERROR_CALL_NOT_IMPLEMENTED is also needed.)

        It's not a lot of work, but as there are plenty of other places that don't
        use wide system calls, I didn't bother fixing it.
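The steps above can be sketched roughly as follows. This is an illustration of the described approach, not the Putty or Vim code: the names should_fall_back_to_ansi, register_demo_class, set_demo_title, DemoWndProc and the class name "DemoWideClass" are all made up for the example. The Win32 part is guarded by _WIN32, and the stand-in definition of ERROR_CALL_NOT_IMPLEMENTED exists only so the small decision helper compiles elsewhere.

```c
#include <assert.h>
#include <stdbool.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-in for non-Windows builds; 120 is the value from winerror.h. */
#define ERROR_CALL_NOT_IMPLEMENTED 120L
#endif

/* True when a failed wide call should be retried with its ANSI twin. */
bool should_fall_back_to_ansi(bool wide_ok, unsigned long last_error)
{
    return !wide_ok && last_error == ERROR_CALL_NOT_IMPLEMENTED;
}

#ifdef _WIN32
static bool g_wide_class = false;  /* set once RegisterClassW succeeds */

static LRESULT CALLBACK DemoWndProc(HWND hwnd, UINT msg, WPARAM wp, LPARAM lp)
{
    if (msg == WM_DESTROY) {
        PostQuitMessage(0);
        return 0;
    }
    /* The default handler has to match the flavour of the class. */
    return g_wide_class ? DefWindowProcW(hwnd, msg, wp, lp)
                        : DefWindowProcA(hwnd, msg, wp, lp);
}

/* Try the wide class first, then the regular one. Returns 0 on failure. */
ATOM register_demo_class(HINSTANCE inst)
{
    WNDCLASSW wcw = {0};
    wcw.lpfnWndProc   = DemoWndProc;
    wcw.hInstance     = inst;
    wcw.lpszClassName = L"DemoWideClass";

    ATOM atom = RegisterClassW(&wcw);
    if (atom != 0) {
        g_wide_class = true;
        return atom;
    }
    if (!should_fall_back_to_ansi(false, GetLastError()))
        return 0;  /* a real error, not a missing wide API */

    WNDCLASSA wca = {0};
    wca.lpfnWndProc   = DemoWndProc;
    wca.hInstance     = inst;
    wca.lpszClassName = "DemoWideClass";
    return RegisterClassA(&wca);
}

/* Set the title with the wide call, falling back to the ANSI call. */
BOOL set_demo_title(HWND hwnd, const wchar_t *wide, const char *ansi)
{
    assert(wide != NULL && ansi != NULL);
    if (SetWindowTextW(hwnd, wide))
        return TRUE;
    if (!should_fall_back_to_ansi(false, GetLastError()))
        return FALSE;
    return SetWindowTextA(hwnd, ansi);
}
#endif
```

When the wide registration succeeds, the window itself would presumably also be created with CreateWindowExW so that the whole path stays Unicode; the sketch leaves window creation and the message loop out.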

        --
        Glenn Maynard
      • Bram Moolenaar
        Message 3 of 7 , May 8, 2004
          Glenn Maynard wrote:

          > On Thu, May 06, 2004 at 06:26:01PM +0200, Bram Moolenaar wrote:
          > > > Apparently some, but not all, code paths that should use "Wide"
          > > > versions of the Win32 API have now been converted from ANSI to Wide.
          > > > There is an interesting discrepancy left, however. In the following
          > > > I'm using "åäö.txt" as a filename.
          >
          > If you want to test wide characters, you need to use characters that
          > aren't included in your ANSI codepage, eg. "日本語".
          >
          > > Those are latin1 characters, in case someone was wondering. non-latin1
          > > characters still have the problem that they appear as question marks in
          > > the title for me. Don't know how to solve that.
          >
          > You have to create a wide window class.
          >
          > Instead of setting up WNDCLASS and calling RegisterClass, first set up
          > a WNDCLASSW and call RegisterClassW.
          >
          > If it fails with ERROR_CALL_NOT_IMPLEMENTED, set up the regular WNDCLASS
          > as usual (for Win9x).
          >
          > Finally, at the bottom of your WndProc, call DefWindowProcW instead of
          > DefWindowProcA if RegisterClassW was used.
          >
          > This makes the window capable of displaying Unicode text in the titlebar;
          > otherwise, even if you pass Unicode data in, the low-level internal stuff
          > that actually draws the text will just print "?".
          >
          > That should be enough to make SetWindowTextW work. (Of course, a
          > fallback on SetWindowTextA on ERROR_CALL_NOT_IMPLEMENTED is also needed.)

          Great, that is the hint I needed. I'll try searching for a bit of
          example code, especially for handling the errors.

          > It's not a lot of work, but as there are plenty of other places that don't
          > use wide system calls, I didn't bother fixing it.

          Generally using utf-8 for 'encoding' is a good thing to do in the long
          term. I'm trying to remove all disadvantages, so that using utf-8 will
          become the generic solution to problems with encodings, on all systems
          in all environments.

          --
          hundred-and-one symptoms of being an internet addict:
          179. You wonder why your household garbage can doesn't have an
          "empty recycle bin" button.

        • Glenn Maynard
          Message 4 of 7 , May 8, 2004
            On Sat, May 08, 2004 at 01:45:13PM +0200, Bram Moolenaar wrote:
            > Great, that is the hint I needed. I'll try searching for a bit of
            > example code, especially for handling the errors.

            http://zewt.org/~glenn/window.c

            is the code I wrote for Putty. (see lines 450, 2569.) It hasn't been
            integrated upstream, though, so it doesn't have wide testing.

            > Generally using utf-8 for 'encoding' is a good thing to do in the long
            > term. I'm trying to remove all disadvantages, so that using utf-8 will
            > become the generic solution to problems with encodings, on all systems
            > in all environments.

            I agree, of course--I've wanted UTF-8 to become the default internal encoding
            for Vim in Windows for a while.

            (In this case, this isn't really a disadvantage of UTF-8, though; ACP strings
            do work ...)

            --
            Glenn Maynard