Re: Combining diacritical marks display as separate character

  • Tony Mechelynck
    Message 1 of 7, Mar 12, 2009
      On 12/03/09 09:09, Sven Siegmund wrote:
      > Hello, I have just installed gVim 7.2 on Windows XP SP3 and have set
      > utf-8 as the default encoding and a good unicode monospace font
      > (DejaVu Sans Mono) as the guifont.
      >
      > gVim 7.2 has problems rendering combining diacritical marks on
      > characters for which Unicode has no dedicated precomposed codepoint
      > carrying that diacritic. I can imagine why that is.
      >
      > When I try to type "n" and then the U+0302 combining circumflex "^" I
      > get "n^" displayed instead of "n̂" (n with a circumflex on it). I can
      > imagine why this happens: "n" with a combining "^" are technically two
      > characters, two Unicode codepoints. It's just OpenType features and the
      > font renderer of the OS (in Windows it is Uniscribe) which don't let
      > them display adjacently but overlap them.
      >
      > gVim does not use Uniscribe for rendering the font displayed. It is
      > more low-level. It has very rigid rules to display a given number of
      > characters/code-points per line and sticks to it. Hence it is forced
      > to display "n" with combined "^" as two separate characters.
      >
      > But then I wonder how you can use gVim to write scripts where such
      > combining of Unicode codepoints or reordering of letters (like in the
      > Devanagari script) or LTR-RTL changes happen. Is there a solution?
      >
      > Thanks for your answers.

      I don't have any problems with recent gvim versions (currently 7.2.141
      but it already worked last week) and GTK2 2.14.4-8.6.2 on openSUSE 11.1.
      -- Well, of course I can't reproduce your case exactly since I'm on
      Linux. I'm currently typing a Russian dictionary with lots of combining
      acute accents (U+0301), which Vim correctly displays over the preceding
      spacing Cyrillic vowel. However IIRC even when I was on W98 with Vim
      6.1 it could display combining characters correctly in Unicode, using a
      "Courier New" font -- that's when I started my frontpage
      http://users.skynet.be/antoine.mechelynck/ where you can see several
      scripts on a single page, one of them vocalized Arabic. Since then,
      Unicode rendering has gotten progressively better, not worse, over the years.

      Let me try n + U+0302 ... yep, I get the correct overprint, in my
      default font, which happens to be "Bitstream Vera Sans Mono", very
      similar to DejaVu IIUC.

      Current versions of gvim can display (by default) two combining
      characters on any spacing character, which is usually enough for Arabic,
      even IIUC Quranic Arabic, but not always for fully cantillated Hebrew;
      or (by a nondefault 'maxcombine' setting) up to 6 combining characters
      over a single spacing character, which is usually more than you'd need.
      All of this works (IIUC) only if 'encoding' is set to UTF-8. You can set this even if
      you don't tell Windows to use Unicode everywhere, provided that you set
      it near the top of your vimrc. See
      http://vim.wikia.com/wiki/Working_with_Unicode for details.
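
      A minimal vimrc sketch of the settings described above. It assumes a
      Vim built with +multi_byte; the value 6 is simply the maximum that
      'maxcombine' accepts, not a recommendation made in this thread.

      ```vim
      " Place this near the top of the vimrc.
      if has('multi_byte')
        " 'maxcombine' only has an effect when 'encoding' is utf-8.
        set encoding=utf-8
        " Default is 2 combining characters per spacing character; 6 is the maximum.
        set maxcombine=6
      endif
      ```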

      I'm not sure Vim does Devanagari.

      It can do Hebrew or Arabic but not with true bidi: what Vim does is give
      you the option of displaying any window in either all RTL or all LTR.
      You can even have the same file in split-windows, one of them LTR (with
      English OK but Arabic or Hebrew wrong) and the other RTL (with Hebrew
      and/or Arabic OK, including Arabic joining forms if 'arabicshape' is on
      which is the default, but English wrong).
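
      The split-window arrangement described above could be set up roughly
      like this. It is only a sketch, assuming a Vim compiled with the
      +rightleft feature; 'arabicshape' is left at its default.

      ```vim
      if has('rightleft')
        " Open a second window on the same buffer; the cursor moves into it.
        split
        " 'rightleft' is local to a window, so only the new window turns RTL.
        setlocal rightleft
        " Return to the original window, which stays LTR.
        wincmd p
      endif
      ```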


      Which exact version and patchlevel of gvim are you using? You might want
      to copy the first handful of lines from the output of ":version" (until
      the line with "Features included (+) or not (-)") -- see ":help :redir"
      about how to capture that kind of output. Also, when you type

      :echo has('multi_byte')

      what answer do you get? If it's zero, you're in trouble.

      Also, what is your _full_ 'guifont' setting? If it ends in cANSI, I
      think you're in trouble -- cDEFAULT is usually better IMHO.
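
      The checks asked for above can be gathered in one go. This is a
      sketch only: the variable name g:version_output and the choice of
      the first six lines are arbitrary assumptions, not part of Vim.

      ```vim
      " Capture the output of :version in a variable (see :help :redir).
      redir => g:version_output
      silent version
      redir END
      " Show only the first few lines, i.e. roughly the part before the feature list.
      echo join(split(g:version_output, "\n")[0:5], "\n")

      " 1 means the +multi_byte feature is compiled in, 0 means it is not.
      echo has('multi_byte')
      " Numeric version (702 for Vim 7.2) and a check for one particular patch.
      echo v:version
      echo has('patch141')
      " The full 'guifont' value, including any :cANSI or :cDEFAULT suffix.
      set guifont?
      ```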


      Best regards,
      Tony.
      --
      "Seven years and six months!" Humpty Dumpty repeated
      thoughtfully. "An uncomfortable sort of age. Now if you'd asked MY
      advice, I'd have said `Leave off at seven' -- but it's too late now."
      "I never ask advice about growing," Alice said indignantly.
      "Too proud?" the other enquired.
      Alice felt even more indignant at this suggestion. "I mean,"
      she said, "that one can't help growing older."
      "ONE can't, perhaps," said Humpty Dumpty; "but TWO can. With
      proper assistance, you might have left off at seven."
      -- Lewis Carroll

      --~--~---------~--~----~------------~-------~--~----~
      You received this message from the "vim_multibyte" maillist.
      For more information, visit http://www.vim.org/maillist.php
      -~----------~----~----~----~------~----~------~--~---
    • Ron Aaron
      Message 2 of 7, Mar 12, 2009
        On Mar 12, 11:53 am, Tony Mechelynck <antoine.mechely...@...>
        wrote:
        > I don't have any problems with recent gvim versions (currently 7.2.141
        > but it already worked last week) and GTK2 2.14.4-8.6.2 on openSUSE 11.1.

        I use it on Windows and Linux, and it works well on both.

        > It can do Hebrew or Arabic but not with true bidi: what Vim does is give
        > you the option of displaying any window in either all RTL or all LTR.
        > You can even have the same file in split-windows, one of them LTR (with
        > English OK but Arabic or Hebrew wrong) and the other RTL (with Hebrew
        > and/or Arabic OK, including Arabic joining forms if 'arabicshape' is on
        > which is the default, but English wrong).

        That is, in fact, what I regularly do. I open a bilingual (English
        and Hebrew) file, split the window, and have one be LTR and the other
        RTL. Then I use XeLaTeX to produce really nice output :)

      • Sven Siegmund
        Message 3 of 7, Mar 12, 2009
          Hello, thanks for the details,

          On Thu, Mar 12, 2009 at 10:53 AM, Tony Mechelynck
          <antoine.mechelynck@...> wrote:
          > Current versions of gvim can display (by default) two combining
          > characters on any spacing character, which is usually enough for Arabic,

          Yep, two combining marks are enough for me.

          > Which exact version and patchlevel of gvim are you using? You might want
          > to copy the first handful of lines from the output of ":version" (until
          > the line with "Features included (+) or not (-)") -- see ":help :redir"
          > about how to capture that kind of output. Also, when you type

          VIM - Vi IMproved 7.2 (2008 Aug 9, compiled Aug 9 2008 18:46:22)
          MS-Windows 32-bit GUI version with OLE support
          Compiled by Bram@KIBAALE
          Big version with GUI.

          >        :echo has('multi_byte')
          1

          > Also, what is your _full_ 'guifont' setting? If it ends in cANSI, I
          > think you're in trouble -- cDEFAULT is usually better IMHO.

          "unicode encoding:
          set enc=utf-8

          "set gui font
          set guifont=DejaVu_Sans_Mono:h11:cDEFAULT

          set nocompatible
          source $VIMRUNTIME/vimrc_example.vim
          ...
          ...
          ...

          I explored the problem further. There is something wrong with gvim
          interpreting deadkeys of the Windows-Keyboard layout. I could not type
          "n" with combined circumflex because I tried to map the combining
          circumflex on a dead key of my windows keyboard layout. When I map the
          combining circumflex to another key it works and it gets displayed
          well in gvim.

          I will explore the problems of remapping the dead keys of the windows
          keyboard layout later. So far I could not google anything about this
          issue in gvim in Windows.

          S.

        • Kenneth Reid Beesley
          Message 4 of 7, Mar 12, 2009
            On 12 Mar 2009, at 07:51, Sven Siegmund wrote:

            > [...]


            I'm using MacVim Snapshot 43, with DejaVu Sans Mono, and the handling
            of Unicode, including the rendering of letters with combining
            diacritical marks, is surprisingly good.

            n+0x0302

            displays perfectly for me, with a circumflex placed nicely above the
            'n'. I sometimes work with orthographies for Native American
            languages, which sometimes require two combining diacritics on the
            same letter, and MacVim again does well. This is one of the (several)
            reasons that I made the painful move from emacs to vim.

            Ken

            ******************************
            Kenneth R. Beesley, D.Phil.
            P.O. Box 540475
            North Salt Lake, UT
            84054 USA






          • Tony Mechelynck
            Message 5 of 7, Mar 12, 2009
              On 12/03/09 14:51, Sven Siegmund wrote:
              > Hello, thanks for the details,

              My pleasure.

              Beware: I'm going to send this email in UTF-8 because of the text I'll
              be typing into it.

              >
              > On Thu, Mar 12, 2009 at 10:53 AM, Tony Mechelynck
              > <antoine.mechelynck@...> wrote:
              [...]
              >> Which exact version and patchlevel of gvim are you using? You might want
              >> to copy the first handful of lines from the output of ":version" (until
              >> the line with "Features included (+) or not (-)") -- see ":help :redir"
              >> about how to capture that kind of output. Also, when you type
              > VIM - Vi IMproved 7.2 (2008 Aug 9, compiled Aug 9 2008 18:46:22)
              > MS-Windows 32-bit GUI version with OLE support
              > Compiled by Bram@KIBAALE
              > Big version with GUI.

              This means 7.2.0. I would recommend that you install a more recent
              bugfixed version, for instance (for Windows) one of Steve Hall's
              distributions at
              https://sourceforge.net/project/showfiles.php?group_id=43866&package_id=39721
              -- click the clipboard-like icon next to a download link to see when
              that build was compiled and what features are included.

              I'm not saying that a more recent build will necessarily cure _this_
              problem, but it is always worth doing, since it might cure _other_
              problems which you might be having. At
              http://ftp.vim.org/pub/vim/patches/7.2/README you can see a text file
              with a one-line description of every bugfix published so far for Vim 7.2
              -- and whenever a new bugfix gets published, that README file is updated
              at the same time.

              >
              >> :echo has('multi_byte')
              > 1

              Good. Nonzero means "feature is present".

              >
              >> Also, what is your _full_ 'guifont' setting? If it ends in cANSI, I
              >> think you're in trouble -- cDEFAULT is usually better IMHO.
              > "unicode encoding:
              > set enc=utf-8
              >
              > "set gui font
              > set guifont=DejaVu_Sans_Mono:h11:cDEFAULT

              This ought to be all right.

              >
              > set nocompatible
              > source $VIMRUNTIME/vimrc_example.vim
              > ...
              > ...
              > ...
              >
              > I explored the problem further. There is something wrong with gvim
              > interpreting deadkeys of the Windows-Keyboard layout. I could not type
              > "n" with combined circumflex because I tried to map the combining
              > circumflex on a dead key of my windows keyboard layout. When I map the
              > combining circumflex to another key it works and it gets displayed
              > well in gvim.

              Aha! To enter any Unicode codepoint by its Unicode codepoint number in
              Vim, use the method described at |i_CTRL-V_digit|. Or if you frequently
              use some particular codepoints, you might want to use a keymap -- either
              a preexisting one if you find one that suits you, or else you can build
              your own: it isn't very hard once you get the hang of it. The
              "accents.vim" and "esperanto.vim" keymaps (in $VIMRUNTIME/keymap/) are
              small examples showing how keymaps are built. The relevant help is at
              |keymap-file-format|.

              -- Note that if you build your own keymap it should NOT go into
              $VIMRUNTIME/keymap/ (where any upgrade may silently destroy it) but into
              either $VIM/vimfiles/keymap/ (if you want to be able to access it from
              any Windows login name) or $HOME/vimfiles/keymap/ (to restrict it to one
              login name, since every "user" has a different $HOME directory). Create
              the needed directory, and maybe its parent too, if they don't yet exist.
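
              A small keymap along those lines might look like the sketch
              below. The file name combining_utf-8.vim, the short name "comb"
              and the key sequences ^^ and '' are all made up for illustration;
              check the exact syntax against |keymap-file-format| before
              relying on it.

              ```vim
              " Hypothetical file: $HOME/vimfiles/keymap/combining_utf-8.vim
              " Short name shown in the status line while the keymap is active.
              let b:keymap_name = "comb"
              loadkeymap
              " {lhs}  {rhs}           [comment]
              ^^       <char-0x0302>   COMBINING CIRCUMFLEX ACCENT
              ''       <char-0x0301>   COMBINING ACUTE ACCENT
              ```

              With such a file in place, ":set keymap=combining" should
              activate it, and typing n followed by ^^ in Insert mode would
              then insert n plus U+0302. Because each sequence starts with an
              ordinary key, Vim waits briefly after the first ^ or ' to see
              whether the second one follows.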

              Of course Vim must see the keypress in order to act on it, and I suspect
              that Windows dead keys are retained by Windows (and not given to Vim)
              until you press something else (with which Windows, not Vim, will
              combine the "dead key"). And since "Unicode combining characters" must
              go _after_ the spacing character to which they apply, they are not
              really "dead keys" in the usual typewriter meaning of the expression.
              On my Belgian keyboard I hit "dead-circumflex" followed by c to get the
              _precombined_ Esperanto consonant ĉ (U+0109 LATIN SMALL LETTER C WITH
              CIRCUMFLEX), but in Vim I type c first and ^Vu0302 afterwards to get the
              _composite_ sequence ĉ [i.e. c (U+0063 LATIN SMALL LETTER C) followed
              by the combining circumflex (U+0302 COMBINING CIRCUMFLEX ACCENT)], which
              SeaMonkey 2.0b1pre erroneously does not overprint in the mail
              composition window -- I don't know about your mailer.
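
              The ^Vu0302 method can also be bound to an ordinary key, which
              is roughly what "mapping the combining circumflex to another
              key" amounts to. A sketch, assuming 'encoding' is utf-8; the
              choice of <F5> and the variable name are arbitrary.

              ```vim
              " In Insert mode, <F5> inserts U+0302 after the character just typed,
              " so typing  n <F5>  gives n with a circumflex over it.
              inoremap <F5> <C-V>u0302

              " The same composite built in a script: c followed by U+0302.
              let composite = 'c' . nr2char(0x0302)
              echo composite
              ```

              Using inoremap rather than imap keeps the mapping working even
              where CTRL-V itself has been remapped (for instance to "paste"
              by mswin.vim on Windows).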

              >
              > I will explore the problems of remapping the dead keys of the windows
              > keyboard layout later. So far I could not google anything about this
              > issue in gvim in Windows.
              >
              > S.

              As far as I know, everything, but _everything_ about Vim behaviour is
              in the help. (Obviously, the fine points of _Windows_ behaviour are not
              in the _Vim_ help.) To find your precious needle (any needle) in the Vim
              help^H^H^H^Hhaystack (which is admittedly a huge one), use the following
              starting points (magnets, if you will ;-) since sewing needles are
              usually made of steel):

              :help
              :help :help
              :help {subject}
              where {subject} means exactly open-brace, small-ess,
              small-you, small-bee, small-jay, small-eeh, small-cee,
              small-tee, close-brace. No fancy replacing (yet).
              :help :helpgrep

              which will explain progressively more complex methods of finding your
              way about the help.
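
              As one possible way of applying this to the topics of this
              thread (the search word is only an example):

              ```vim
              " Jump straight to the help tags mentioned earlier in this message.
              :help i_CTRL-V_digit
              :help keymap-file-format
              " Search every help file for a word.
              :helpgrep maxcombine
              " Open the quickfix window listing all the matches.
              :copen
              ```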



              Best regards,
              Tony.
              --
              Mustgo, n.:
              Any item of food that has been sitting in the refrigerator so
              long it has become a science project.
              -- Sniglets, "Rich Hall & Friends"

            • Tony Mechelynck
              Message 6 of 7, Mar 12, 2009
                On 12/03/09 11:56, Ron Aaron wrote:
                > On Mar 12, 11:53 am, Tony Mechelynck<antoine.mechely...@...>
                > wrote:
                >> I don't have any problems with recent gvim versions (currently 7.2.141
                >> but it already worked last week) and GTK2 2.14.4-8.6.2 on openSUSE 11.1.
                > I use it on Windows and Linux, and it works well on both.
                >
                >> It can do Hebrew or Arabic but not with true bidi: what Vim does is give
                >> you the option of displaying any window in either all RTL or all LTR.
                >> You can even have the same file in split-windows, one of them LTR (with
                >> English OK but Arabic or Hebrew wrong) and the other RTL (with Hebrew
                >> and/or Arabic OK, including Arabic joining forms if 'arabicshape' is on
                >> which is the default, but English wrong).
                > That is, in fact, what I regularly do. I open a bilingual (English
                > and Hebrew) file, split the window, and have one be LTR and the other
                > RTL. Then I use XeLaTex to produce really nice output :)

                What I use to produce really nice true-bidi output is my browser --
                SeaMonkey 2.0b1pre, but Firefox 3 (3.0 or 3.1, I'm not sure) uses
                the same rendering engine, and any "good" browser ought to do well
                (which is not to say that all of them do) for the kind of files
                which I use, namely HTML and plain text.


                Best regards,
                Tony.
                --
                There was a plane crash over mid-ocean, and only three survivors were
                left in the life-raft: the Pope, the President, and Mayor Daley.
                Unfortunately, it was a one-man life-raft, and quickly sinking, so they
                started debating who should be allowed to stay.

                The Pope pointed out that he was the spiritual leader of millions all
                over the world, the President explained that if he died then America
                would be stuck with the Vice-President, and so forth. Then Mayor Daley
                said, "Look! We're not solving anything like this! The only fair
                thing to do is to vote on it." So they did, and Mayor Daley won by 97
                votes.
