Re: Real displayed width of a character

  • Tony Mechelynck
    Message 1 of 4, Oct 24, 2008
      On 24/10/08 16:22, Jehan Pagès wrote:
      > Hi all,
      >
      > I have a question about "displayed width" (and not encoding length!) of
      > a character. How does vim "decide" the width of a character, in terms of
      > number of columns? Does it use some function like "wcwidth" (POSIX
      > function)? Some home-made similar function?
      >
      > The reason I ask this is that some characters sometimes would be single
      > or double column depending on the used font. Moreover Unicode, as far as
      > I could read, does not explicitly give a preferred size for characters,
      > with the exception of some characters (mostly East-Asian), which are in
      > dedicated Unicode planes (full-width and half-width characters). This is
      > explained in this Technical Report for instance (the only paper from the
      > Unicode Consortium I found which dealt with character width as
      > the main topic; elsewhere I could only find allusions, or small notes, as
      > though it was implicit)
      > http://unicode.org/reports/tr11/
      >
      > An extract from this:
      > "
      > Except for a few characters, which are explicitly called out as
      > fullwidth or halfwidth in the Unicode Standard, characters are not
      > duplicated based on distinction in width. Some characters, such as the
      > ideographs, are always wide; others are always narrow; and some can be
      > narrow or wide, depending on the context. The Unicode character property
      > East_Asian_Width provides a default classification of characters, which
      > an implementation can use to decide at runtime whether to treat a
      > character as narrow or wide.
      > "
      >
      > Even though it is focused on East-Asian characters, I could find some
      > other characters which have very different sizes in different fonts. For
      > instance I found a few fonts with '@' being double size compared to
      > "typical" western characters (A-Z 0-9, etc.). This is also true for the
      > European money character (euro: €), or even the Latin characters œ or
      > æ (used in French among other places). I would even say that this seems
      > logical as these characters are formed by including 2 characters in each
      > other... So being double size seems normal to me, isn't it?
      > Unfortunately a function like wcwidth considers it must be "one column
      > wide", and apparently the function used by vim too (being the same or
      > another). Then I must find a font which has these characters but the
      > same width as the rest (so mono or close). If I don't, the characters
      > are "cut" by vim.
      >
      > Would you have an idea about this? Couldn't vim be improved in such a
      > way it would consider the font really used? This seems complicated as
      > the font is defined in the Terminal Emulator, not in vim itself. And I
      > could not find yet if there is some way to advertise the used font
      > in any terminal protocol (VT100 or else). But then what if there was an
      > option in vim where the user could explicitly tell "I am using this
      > font". So that when vim displays characters and then asks the terminal to
      > "jump" to this or that column, it can calculate the right place to go,
      > without cutting text?
      > Thanks.
      >
      > Jehan

      Fullwidth characters always occupy two screen columns. Sometimes an
      empty column can be added in the last screen column if a fullwidth
      character would otherwise start in it.

      Halfwidth characters always occupy one screen column, except the hard
      tab (U+0009 HORIZONTAL TAB), which occupies one or more columns depending
      on 'tabstop', 'list' and 'listchars'. Strictly speaking, the tab is a
      "control character" anyway.

      Ambiguous-width characters are treated as fullwidth or halfwidth
      depending on the setting of the global 'ambiwidth' option.

      See:
      :help 'ambiwidth'
      :help 'tabstop'
      :help 'list'
      :help 'listchars'
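
As a rough model of the three rules above, here is a minimal Python sketch. It is not how Vim is implemented: `display_width` and its parameters are hypothetical names that merely mirror the 'ambiwidth' and 'tabstop' options, Python's `unicodedata` module stands in for Vim's internal tables (the two may classify a few characters differently, depending on Unicode version), 'list' is assumed to be off, and combining and other control characters are ignored.

```python
import unicodedata


def display_width(ch, col=0, ambiwidth="single", tabstop=8):
    """Screen columns taken by `ch` when it starts at 0-based column `col`."""
    if ch == "\t":
        # With 'list' off, a tab extends to the next multiple of tabstop.
        return tabstop - (col % tabstop)
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        # Wide and Fullwidth characters always take two columns.
        return 2
    if eaw == "A":
        # Ambiguous characters follow the ambiwidth-like setting.
        return 2 if ambiwidth == "double" else 1
    # Narrow, Halfwidth and Neutral characters take one column.
    return 1
```

With this model, `display_width("\u201c")` gives one column and `display_width("\u201c", ambiwidth="double")` gives two, because U+201C is classified as ambiguous, while a tab starting at column 3 with the default `tabstop` of 8 spans five columns.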


      Note also that proportional fonts (fonts where m is much wider than i or
      l, not to mention Arabic final sad vs. isolated alif) are ugly in GTK2
      versions of gvim and cannot be used in any other versions, or in Console
      Vim.


      Best regards,
      Tony.
      --
      Although we modern persons tend to take our electric lights, radios,
      mixers, etc., for granted, hundreds of years ago people did not have
      any of these things, which is just as well because there was no place
      to plug them in. Then along came the first Electrical Pioneer,
      Benjamin Franklin, who flew a kite in a lightning storm and received a
      serious electrical shock. This proved that lightning was powered by the
      same force as carpets, but it also damaged Franklin's brain so severely
      that he started speaking only in incomprehensible maxims, such as "A
      penny saved is a penny earned." Eventually he had to be given a job
      running the post office.
      -- Dave Barry, "What is Electricity?"

      --~--~---------~--~----~------------~-------~--~----~
      You received this message from the "vim_multibyte" maillist.
      For more information, visit http://www.vim.org/maillist.php
      -~----------~----~----~----~------~----~------~--~---
    • Mansing
      Message 2 of 4, Oct 24, 2008
        Wow!  For ages, I knew not to ask this question.  Now with
        :set ambiwidth=double
        my Chinese /open/ quotation mark ( “ code=0x201c ) is displayed
        correctly, without colliding with the next character.  Strange that
        the /close/ quotation mark ( ” code=0x201d ) has always been displayed
        well regardless of the ambiwidth setting?!

        mt 081025


        Tony Mechelynck wrote:
        > On 24/10/08 16:22, Jehan Pagès wrote:
        >> Hi all,
        >>
        >> I have a question about "displayed width" (and not encoding length!) of
        >> a character. How does vim "decide" the width of a character, in terms of
        >> number of columns? . . .
        >>
        >> Jehan
        >
        > . . .
        >
        > Ambiguous-width characters are treated as fullwidth or halfwidth
        > depending on the setting of the global 'ambiwidth' option.
        >
        > . . .
        > Tony.


      • Tony Mechelynck
        Message 3 of 4, Oct 24, 2008
          On 25/10/08 01:40, Mansing wrote:
          > Wow! For ages, I knew not to ask this question. Now with
          >
          > :set ambiwidth=double
          >
          > my Chinese /open/ quotation mark ( “ code=0x201c ) is displayed
          > correctly --without colliding with the next character. Strange that,
          > the /close/ quotation mark ( ” code=0x201d ) has always been displayed
          > well regardless of the ambiwidth setting?!
          >
          > mt 081025

          Hm. Here these characters are displayed with the same (narrow) glyph as
          a plain double quote in Bitstream Vera Sans Mono, but with FZFangSong
          U+201C is a 66 quote occupying the right half of its wide glyph while
          U+201D is a 99 quote in the left half of _its_ wide glyph (well, maybe I
          should say right-top and left-top quarters), so that with
          ambiwidth=single U+201C is overprinted on the next character while it's
          only the blank right half of U+201D which is overprinted on _its_ follower.

          Best regards,
          Tony.
          --
          Meader's Law:
          Whatever happens to you, it will previously have happened to
          everyone you know, only more so.
