Re: Multibyte bugs

  • Tony Mechelynck
    Message 1 of 8, Apr 11, 2010
      On 11/04/10 16:33, Bram Moolenaar wrote:
      [...]
      > It's weird that digraphs are defined for an area that doesn't have
      > characters assigned to it. I wonder what happened here. Perhaps this
      > changed at some point in time? If we know the reason we may want to
      > drop all the digraphs for 0xexxx.
      [...]

      My guess is that when that RFC was drafted in 1992, some of the charsets
      they wanted to list used a few characters which, at that time, weren't
      clearly assigned to one Unicode codepoint, and that the RFC authors
      arbitrarily (and maybe temporarily) placed these characters in a
      "private use area", which is the only place where "characters not yet
      assigned a Unicode codepoint" may go. This is only a guess, however. I'm
      not sure how many people are reading this (extremely low-volume) ML, but
      maybe someone knows the history of those mnemonics from RFC 1345 better
      than you and I do? If someone with that knowledge is reading this,
      please speak up.

      IMHO it makes no sense to have digraphs in Vim for "private use"
      characters. I propose to drop any of them that cannot be usefully
      reassigned to some "official" Unicode codepoint elsewhere. E000 to E028
      means forty-one codepoints, it ought not to be a big problem.
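
The count above can be checked with a few lines of Python. This is only a minimal sketch: the helper names `count_codepoints` and `all_private_use` are made up for illustration, and it relies on Python's standard `unicodedata` module reporting general category "Co" (private use) for these codepoints.

```python
import unicodedata

# Range quoted in the message: U+E000 through U+E028, inclusive.
FIRST, LAST = 0xE000, 0xE028


def count_codepoints(first, last):
    """Number of codepoints in the inclusive range [first, last]."""
    return last - first + 1


def all_private_use(first, last):
    """True if every codepoint in the range has general category 'Co'."""
    return all(
        unicodedata.category(chr(cp)) == "Co"
        for cp in range(first, last + 1)
    )


if __name__ == "__main__":
    print(count_codepoints(FIRST, LAST))
    print(all_private_use(FIRST, LAST))
```

The range is inclusive at both ends, which is why 0x28 (forty) becomes forty-one.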


      Best regards,
      Tony.
      --
      LAUNCELOT: At last! A call! A cry of distress ...
      (he draws his sword, and turns to CONCORDE)
      Concorde! Brave, Concorde ... you shall not have died in vain!
      CONCORDE: I'm not quite dead, sir ...
      "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

      --
      You received this message from the "vim_multibyte" maillist.
      For more information, visit http://www.vim.org/maillist.php

      To unsubscribe, reply using "remove me" as the subject.
    • Bram Moolenaar
      Message 2 of 8, Apr 11, 2010
        Tony Mechelynck wrote:

        > On 11/04/10 16:33, Bram Moolenaar wrote:
        > [...]
        > > It's weird that digraphs are defined for an area that doesn't have
        > > characters assigned to it. I wonder what happened here. Perhaps this
        > > changed at some point in time? If we know the reason we may want to
        > > drop all the digraphs for 0xexxx.
        > [...]
        >
        > My guess is that when that RFC was drafted in 1992, some of the charsets
        > they wanted to list used a few characters which, at that time, weren't
        > clearly assigned to one Unicode codepoint, and that the RFC authors
        > arbitrarily (and maybe temporarily) placed these characters in a
        > "private use area", which is the only place where "characters not yet
        > assigned a Unicode codepoint" may go. This is only a guess, however. I'm
        > not sure how many people are reading this (extremely low-volume) ML, but
        > maybe someone knows the history of those mnemonics from RFC 1345 better
        > than you and I do? If someone with that knowledge is reading this,
        > please speak up.
        >
        > IMHO it makes no sense to have digraphs in Vim for "private use"
        > characters. I propose to drop any of them that cannot be usefully
        > reassigned to some "official" Unicode codepoint elsewhere. E000 to E028
        > means forty-one codepoints, it ought not to be a big problem.

        Searching revealed a few proposals for these character ranges. And
        this page has a confusing summary:
        http://en.wikibooks.org/wiki/Unicode/Character_reference/E000-EFFF
        It says "private use" but it does have a table with characters.

        Let's remove these digraphs. I can't imagine anyone is using them.

        --
        Clothes make the man. Naked people have little or no influence on society.
        -- Mark Twain (Samuel Clemens) (1835-1910)

        /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
        /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
        \\\ download, build and distribute -- http://www.A-A-P.org ///
        \\\ help me help AIDS victims -- http://ICCF-Holland.org ///

      • Tony Mechelynck
        Message 3 of 8, Apr 12, 2010
          On 11/04/10 17:33, Bram Moolenaar wrote:
          >
          > Tony Mechelynck wrote:
          [...]
          >> IMHO it makes no sense to have digraphs in Vim for "private use"
          >> characters. I propose to drop any of them that cannot be usefully
          >> reassigned to some "official" Unicode codepoint elsewhere. E000 to E028
          >> means forty-one codepoints, it ought not to be a big problem.
          >
          > Searching revealed a few proposals for these character ranges. And
          > this page has a confusing summary:
          > http://en.wikibooks.org/wiki/Unicode/Character_reference/E000-EFFF
          > "private use" but it does have a table with characters.

          Yes; in my browser and with my usual font most (but not all) of them are
          CJK fullwidth ideograms and full-width counterparts of halfwidth math
          symbols etc. A few are (halfwidth) Latin accented letters which even
          exist in Latin1 i.e. below U+0100 !!! For instance (in my browser)
          U+E023 to U+E081 look like duplicates of ASCII 0x21 to 0x7E in the same
          order. Note however the last sentence immediately before the table:

          «The repertoire seen with your computer's font will most likely not be
          the same as with other computers or fonts.»

          And indeed I see a different glyph for those codepoints in gvim with my
          usual 'guifont', which is not the same as my browser's usual serif and
          sans-serif fonts.

          >
          > Let's remove these digraphs. I can't imagine anyone is using them.
          >

          Neither can I.


          Best regards,
          Tony.
          --
          LAUNCELOT leaps into SHOT with a mighty cry and runs the GUARD
          through and hacks him to the floor. Blood. Swashbuckling music
          (perhaps). LAUNCELOT races through into the castle screaming.
          SECOND SENTRY: Hey!
          "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD
