Re: Cursor/rendering position bug handling unicode devanagari characters

  • Paul.W Harvey
    Message 1 of 11 , Jul 3 5:43 PM
      Hi Bram,

      On 04/07/11 02:30, Bram Moolenaar wrote:
      > Please try this in an xterm, in utf-8 mode. If it's still wrong there
      > it might be a Vim bug. If it's OK in xterm it's probably a
      > gnome-terminal bug.

      The fault persists with xterm.

      - Created the problem line of text using gedit, saved test.txt
      - from gnome-terminal, xterm -r -en UTF-8 -e vim test.txt
      - Observed the previously reported behaviour in vim
      - from gnome-terminal, xterm -r -en UTF-8 -e nano a.txt
      - Observed proper behaviour in nano

      It's worth noting that the Kanji characters seem fine.

      I'm ignorant of writing systems in general, especially devanagari
      script, but it seems that vim is incorrectly rendering modifier/diacritic
      marks as separate chars instead of combining them into the
      modified/marked character.

      This would explain why the cursor falls short of the last few chars
      rendered when trying to go to end of line.

      Cheers

      --
      Paul.W.Harvey@..., Ph: (02) 6246 5105
      Informatics Technologist - www.taxonomy.org.au
      Centre for Australian National Biodiversity Research

      --
      You received this message from the "vim_dev" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php
    • Paul
      Message 2 of 11 , Jul 3 6:23 PM
        Hi Benjamin,

        On Jul 4, 6:51 am, "Benjamin R. Haskell" <v...@...> wrote:
        ... snip
        > I originally wrote this¹ up (on the web since email's the wrong place
        > for HTML and images) in response to a post «Complex Scripts in
        > Vim/gVim»², but figured it was appropriate here, since Devanāgarī is the
        > topic again.  The summary is that you're probably not going to be happy
        > with the way vim (g- or terminal) displays Devanāgarī at this point in
        > time, especially if you really have to read/write it (instead of just
        > viewing it as a sequence of characters for Unicode testing).

        So, this is a known issue. Thank you for these detailed answers.

        I wonder how nano is working, though - because even if the script
        isn't rendered properly, at least its cursor-character/rendering seems
        to track properly, whereas in vim I do not feel confident editing any
        lines containing Devanāgarī.

        Anyway, I've been using vim for 8 years now and this is the first bug
        that's ever affected me, so I appreciate the wonderful work you all
        do.

        Cheers

      • Benjamin R. Haskell
        Message 3 of 11 , Jul 3 8:33 PM
          On Sun, 3 Jul 2011, Paul wrote:

          > Hi Benjamin,
          >
          > On Jul 4, 6:51 am, "Benjamin R. Haskell" wrote:
          > ... snip
          >> I originally wrote this¹ up (on the web since email's the wrong place
          >> for HTML and images) in response to a post «Complex Scripts in
          >> Vim/gVim»², but figured it was appropriate here, since Devanāgarī is
          >> the topic again.  The summary is that you're probably not going to be
          >> happy with the way vim (g- or terminal) displays Devanāgarī at this
          >> point in time, especially if you really have to read/write it
          >> (instead of just viewing it as a sequence of characters for Unicode
          >> testing).
          >
          > So, this is a known issue. Thank you for these detailed answers.
          >
          > I wonder how nano is working, though - because even if the script
          > isn't rendered properly, at least its cursor-character/rendering seems
          > to track properly, whereas in vim I do not feel confident editing any
          > lines containing Devanāgarī.

          Is nano consistent for you under gnome-terminal? Did you try the sample
          from the previous thread¹?

          Neither it nor your pastebin sample² works for me in nano under
          gnome-terminal. For your sample, I get the "three extraneous circle
          things" (three combining characters that couldn't be rendered properly)
          in both nano and vim: apparently Unicode code points U+093e (Devanagari
          Vowel Sign AA), U+093f (Devanagari Vowel Sign I), and U+0940 (Devanagari
          Vowel Sign II).
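
As a quick cross-check of the three code points named above, here is a minimal Python sketch that looks them up in the Unicode Character Database. The helper name `describe` is made up for this illustration; only the standard `unicodedata` module is used.

```python
import unicodedata

# The three code points identified in the message above.
VOWEL_SIGNS = ["\u093e", "\u093f", "\u0940"]

def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

for ch in VOWEL_SIGNS:
    # Each of these reports category "Mc" (Mark, spacing combining).
    print(describe(ch))
```

All three are spacing combining marks (category `Mc`), so they attach to a base consonant but are not the zero-width `Mn` kind of mark; different programs may plausibly disagree about how many cells to give them.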

          It might be useful to know what fonts you've got installed that are
          covering these characters. A quick first pass:

          fc-list :lang=hi | sort

          --
          Best,
          Ben

          ¹: http://benizi.com/vim/devanagari/snippet.txt
          ²: http://pastebin.com/DMvM7Fx9

        • Paul.W Harvey
          Message 4 of 11 , Jul 4 12:42 AM
            On 04/07/11 23:33:16 -0400 (EDT), Benjamin R. Haskell wrote:
            > Is nano consistent for you under gnome-terminal? Did you try the sample
            > from the previous thread¹?

            I've attached a screenshot (also at http://imagebin.org/161344) showing
            that both editors actually corrupt the script in the same way visually,
            as you have observed.

            gedit displays fine.

            The difference with nano is that, unlike in vim, I can reliably position
            my cursor within the text, and both cursor movement and user input are
            rendered in an unsurprising way.

            Of course, the script is still rendered wrong in nano, I'm just talking
            about the mis-match between my user input and the rendered result.

            The simplest consistency test is just seeing where each editor thinks
            the "end of line" is. The screenshot shows nano's cursor at the end of
            the rendered line. Vim's idea of where the end of line is falls short
            of the rendered text.

            It's almost as if vim's behaviour indicates its "internal" understanding
            of the character grid (if that's what you call it) is correct (the
            cursor seems to move over the correct number of chars), but some of
            those chars are taking up more than one cell each when rendered.
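
The kind of mismatch described above can be illustrated with a small Python sketch. It is not Vim's, nano's, or xterm's actual width logic: the two counting functions and the sample string are hypothetical, chosen only to show how two programs with different assumptions about combining marks end up with different column counts for the same text. East Asian wide characters are ignored for simplicity.

```python
import unicodedata

def cells_marks_zero(text):
    """Count cells if every mark (Mn, Mc, Me) shares the cell of its base."""
    return sum(0 if unicodedata.category(c).startswith("M") else 1 for c in text)

def cells_spacing_marks_one(text):
    """Count cells if only Mn/Me marks are zero-width and Mc marks take a cell."""
    return sum(0 if unicodedata.category(c) in ("Mn", "Me") else 1 for c in text)

# Hypothetical sample: HA, VOWEL SIGN I, NA, VIRAMA, DA, VOWEL SIGN II, ';'
sample = "\u0939\u093f\u0928\u094d\u0926\u0940;"

print(cells_marks_zero(sample))         # 4
print(cells_spacing_marks_one(sample))  # 6
```

If an editor positions its cursor using the first count while the terminal draws using the second, the cursor would stop two cells short of the drawn end of line, which matches the symptom reported here.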

            ~$ fc-list :lang=hi | sort
            FreeSans:style=Medium,obyčejné,Mittel,µεσαία,Normal,Medio,Gemiddeld,odmiana zwykła,Обычный,navadno,Vừa
            gargi:style=Medium
            Lohit Hindi:style=Regular

            - Paul

          • Bram Moolenaar
            Message 5 of 11 , Jul 4 5:28 AM
              Paul W. Harvey wrote:

              > On 04/07/11 02:30, Bram Moolenaar wrote:
              > > Please try this in an xterm, in utf-8 mode. If it's still wrong there
              > > it might be a Vim bug. If it's OK in xterm it's probably a
              > > gnome-terminal bug.
              >
              > The fault persists with xterm.
              >
              > - Created the problem line of text using gedit, saved test.txt
              > - from gnome-terminal, xterm -r -en UTF-8 -e vim test.txt
              > - Observed the previously reported behaviour in vim
              > - from gnome-terminal, xterm -r -en UTF-8 -e nano a.txt
              > - Observed proper behaviour in nano
              >
              > It's worth noting that the Kanji characters seem fine.
              >
              > I'm ignorant of writing systems in general, especially devanagari
              > script, but it seems that vim is incorrectly rendering modifier/diacritic
              > marks as separate chars instead of combining them into the
              > modified/marked character.
              >
              > This would explain why the cursor falls short of the last few chars
              > rendered when trying to go to end of line.

              I have no idea why the modifier chars aren't handled correctly, is there
              something special about them?

              One last thing to check is the 'ambiwidth' option. That's an annoying
              Unicode feature, requiring a manual setting.

              --
              hundred-and-one symptoms of being an internet addict:
              245. You use Real Audio to listen to a radio station from a distant
              city rather than turn on your stereo system.

              /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
              /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
              \\\ an exciting new programming language -- http://www.Zimbu.org ///
              \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

            • Paul.W Harvey
              Message 6 of 11 , Jul 4 6:23 PM
                On 04/07/11 22:28, Bram Moolenaar wrote:
                > I have no idea why the modifier chars aren't handled correctly, is there
                > something special about them?

                Okay, I've found some fascinating doc on this subject:
                http://sites.google.com/site/ocropus/old-documentation/documentation/indic-scripts

                हि is indeed two separate unicode chars, they are:
                * 'ह' devanagari letter ha http://graphemica.com/0939
                * 'ि' devanagari vowel sign i http://graphemica.com/093F

                > One last thing to check is the 'ambiwidth' option. That's an annoying
                > Unicode feature, requiring a manual setting.

                I tried settings of double/single, but it still displayed 'हि' as two
                separate characters rather than one.
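
One possible reason 'ambiwidth' makes no difference: Vim documents that option as applying to characters whose East Asian Width class is Ambiguous. The Python sketch below, using only the standard `unicodedata` module, prints that class for the two code points of the syllable discussed above; the variable name `syllable` is just for illustration.

```python
import unicodedata

# HA followed by VOWEL SIGN I, written with escapes to avoid font issues.
syllable = "\u0939\u093f"

for ch in syllable:
    # Prints the code point, its name, and its East Asian Width class.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.east_asian_width(ch))

# No precomposed form exists, so NFC normalization leaves two code points.
print(len(unicodedata.normalize("NFC", syllable)))  # 2
```

Both code points report class `N` (Neutral) rather than `A` (Ambiguous), which is consistent with the double/single setting having no visible effect on this text.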

                Probably it's "too hard" to make vim render this sequence as a single
                character (unless you guys have developers that use and care about this
                script); but perhaps it might be possible to get reliable cursor
                behaviour inside devanagari strings.

                - PH

              • Charles Campbell
                Message 7 of 11 , Jul 6 1:43 PM
                  > On 04/07/11 23:33:16 -0400 (EDT), Benjamin R. Haskell wrote:
                  > > Is nano consistent for you under gnome-terminal? Did you try the
                  > > sample from the previous thread¹?
                  >
                  > I've attached a screenshot (also at http://imagebin.org/161344)
                  > showing that both editors actually corrupt the script in the same way
                  > visually, as you have observed.
                  >
                  > gedit displays fine.
                  >
                  > The difference with nano is that, unlike in vim, I can reliably
                  > position my cursor within the text, and both cursor movement and user
                  > input are rendered in an unsurprising way.
                  >
                  > Of course, the script is still rendered wrong in nano, I'm just
                  > talking about the mis-match between my user input and the rendered
                  > result.
                  >
                  > The simplest consistency test is just seeing where each editor thinks
                  > the "end of line" is. The screenshot shows nano's cursor at the end of
                  > the rendered line. Vim's idea of where the end of line is falls short
                  > of the rendered text.
                  >
                  > It's almost as if vim's behaviour indicates its "internal"
                  > understanding of the character grid (if that's what you call it) is
                  > correct (the cursor seems to move over the correct number of chars),
                  > but some of those chars are taking up more than one cell each when
                  > rendered.
                  >
                  > ~$ fc-list :lang=hi | sort
                  > FreeSans:style=Medium,obyčejné,Mittel,µεσαία,Normal,Medio,Gemiddeld,odmiana zwykła,Обычный,navadno,Vừa
                  > gargi:style=Medium
                  > Lohit Hindi:style=Regular

                  Have you tried this with gvim? It's still rendered improperly (but
                  consistently!); however, using "$" sends the cursor to the semicolon at
                  the end of the line. I do see that the cursor does not go to the end of
                  line with vim running under either gnome-terminal or xterm.

                  Regards,
                  Chip Campbell
