
Inputting the newer unicode characters

  • Eze
    Message 1 of 13, Aug 30, 2007
      Greetings!

      Does anybody know how to input, for instance, U+1D434 in vim (a "math"
      uppercase "A")? By using the sequence CTRL-V u 1D434 vim
      understandably only reads 1D43, getting "ᵃ4".

      Many thanks in advance.

      Cheers,

      Eze


      --~--~---------~--~----~------------~-------~--~----~
      You received this message from the "vim_multibyte" maillist.
      For more information, visit http://www.vim.org/maillist.php
      -~----------~----~----~----~------~----~------~--~---
    • Kenneth Beesley
      Message 2 of 13, Aug 30, 2007
        Eze,

        I'm a vim beginner, but I'm very interested in vim and Supplementary
        Unicode chars.
        Corrections/updates from experts would be very welcome. Really.

        Here's my amateur understanding:

        1. To do brute-force hex entry, using hex code-point values, you've
        got two variants:

        Ctrl-V uHHHH       for Unicode characters in the Basic Multilingual Plane
        Ctrl-V UHHHHHHHH   for Unicode characters in the Supplementary Area

        (note the uppercase 'U', followed by 8 hex digits), so for your
        example, try

        Ctrl-V U0001D434

        I seem to recall that this longer U-based option for Supplementary
        chars was not well documented, or the documentation was not easy
        to find.

        2. That (I believe) will put the Math Uppercase A into the buffer,
        and you can write the buffer out to file (e.g. in UTF-8)
        successfully, but that doesn't mean that vim can display/render it.
        The last I heard (months ago) was that vim was generally unable to
        render Supplementary chars, even if you specify a font that contains
        the glyphs you need in the supplementary area. The situation on
        Linux may be different. Updates/corrections from vim experts would
        be welcome.

        3. If brute-force char entry using the Ctrl-V trick becomes tedious,
        you could define a vim 'keymap' for your purposes. A keymap, when
        active, intercepts an input keystroke (or a sequence of input
        keystrokes) and maps it to an output Unicode char (or a sequence of
        Unicode chars) to be placed in the buffer. E.g. you might define
        a new keymap containing the line

        A <Char-0x1D434>

        (each line in a keymap has a sequence of input chars/keystrokes,
        white space, and a sequence of output chars) or perhaps

        <C-A> <Char-0x1D434>

        where <C-x> means Ctrl-x, <A-x> means Alt-x, and <S-x> means Shift-x.
        The <S-x> form is not normally needed for alphabetic inputs, where
        you can just write X instead of <S-x>.

        I seem to recall that the format for vim keymaps is well documented,
        except for the <A-x>, <C-x> and <S-x> notations.
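
If a keymap needs more than a handful of lines, it may be easier to generate the file than to type it. The Python sketch below is only an illustration of that idea: the keymap name `mathit`, the helper `build_keymap`, and the choice of plain capital letters as input keys are assumptions made for the example. The two header lines follow the layout described under `:help keymap-file-format`.

```python
import string

# Code point of the "math" uppercase A from the original question.
MATH_ITALIC_CAPITAL_A = 0x1D434


def build_keymap(name="mathit"):
    """Return the text of a Vim keymap file mapping A-Z to U+1D434..U+1D44D.

    The keymap name is hypothetical; pick whatever suits your setup.
    """
    lines = [
        f'let b:keymap_name = "{name}"',  # short name Vim can show for the active keymap
        "loadkeymap",                     # everything after this line is a mapping
    ]
    for offset, letter in enumerate(string.ascii_uppercase):
        # One mapping per line: input keys, white space, output character.
        lines.append(f"{letter} <char-0x{MATH_ITALIC_CAPITAL_A + offset:X}>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_keymap(), end="")
```

Saved as `keymap/mathit.vim` in a directory on 'runtimepath', the result should be selectable with `:set keymap=mathit`. Mapping the plain letters means ordinary capitals cannot be typed while the keymap is active, so a prefix key sequence may be the better choice in practice.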

        Please keep us informed.

        Thanks,

        Ken





      • Eze
        Message 3 of 13, Aug 30, 2007
          Thanks a lot, Ken! I have learned quite a lot from your post: your
          understanding goes far beyond mine. I'm very interested to see how
          soon unicode will be fully supported, if ever. It seems reasonable,
          XML and metadata aside, but the rendering issues seem to be more
          complicated than one would like.

          By the way, if you're a beginner, I'm still in kindergarten!

          Best,

          Eze


        • Kenneth Beesley
          Message 4 of 13, Aug 30, 2007
            Eze,

            Well, I'm glad it was helpful.

            I'd like to switch over to vim, but I work a lot with exotic Unicode
            characters in the supplementary area. When I last looked into vim,
            and experimented with keymaps, I found that I could easily enter any
            Unicode char, and save the results to file---and the Unicode chars
            in the file were correct. But as long as I couldn't _see_ my
            character glyphs rendered on the screen, vim wasn't acceptable as an
            editor. All I could see were boxes (or question marks--I can't
            remember which).

            It shouldn't be too hard for vim to display such characters, at least
            for straightforward alphabetic scripts like Shavian and Deseret
            (which are in the Supplementary space), as long as you specify a
            suitable font, but somewhere down inside the system there is (or
            was) a limitation that allows vim to display glyphs only for
            characters in the Basic Multilingual Plane. There are relatively few
            of us who care about Unicode, let alone Supplementary characters, so
            fixing this problem is probably not high priority.

            Another problem (for me) with vim is that it needs a monowidth font.
            On Linux, I think vim can use any font, but it still works best with
            a monowidth font. In today's computing, that's an awkward
            restriction.

            Please keep me informed if you find that there has been progress, or
            if I
            am plain wrong on any of these issues.

            Ken




          • Tony Mechelynck
            Message 5 of 13, Aug 30, 2007
              Eze wrote:
              > Greetings!
              >
              > Does anybody know how to input, for instance, U+1D434 in vim (a "math"
              > uppercase "A")? By using the sequence CTRL-V u 1D434 vim
              > understandably only reads 1D43, getting "ᵃ4".
              >
              > Many thanks in advance.
              >
              > Cheers,
              >
              > Eze


              See ":help i_CTRL-V_digit"

              Ctrl-V u is for Unicode codepoints in the BMP (i.e., U+0000 to U+FFFF). After
              that, use an uppercase U (i.e., shift-u):

              Ctrl-V U 0001D434

              (the initial zeros may be omitted if the sequence is followed by a keypress
              other than [0-9A-Fa-f].)
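
The difference between the two forms can be modelled in a few lines of Python. This is a simplified sketch, not Vim's actual input code: the function name `ctrl_v_hex` and its arguments are made up for illustration, and it assumes that whatever follows the consumed hex digits is inserted as ordinary text.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def ctrl_v_hex(typed, kind):
    """Model Ctrl-V u (at most 4 hex digits) and Ctrl-V U (at most 8).

    `typed` is the keys pressed after the u/U; `kind` is "u" or "U".
    Returns the text that would end up in the buffer under this model.
    """
    max_digits = {"u": 4, "U": 8}[kind]
    digits = ""
    for key in typed:
        # Stop at the digit limit or at the first non-hex key.
        if len(digits) == max_digits or key not in HEX_DIGITS:
            break
        digits += key
    if not digits:
        return typed  # nothing to convert in this simplified model
    rest = typed[len(digits):]  # leftover keys are inserted literally
    return chr(int(digits, 16)) + rest


if __name__ == "__main__":
    print(ctrl_v_hex("1D434", "u"))     # U+1D43 followed by a literal "4"
    print(ctrl_v_hex("0001D434", "U"))  # the single character U+1D434
    print(ctrl_v_hex("1D434 ", "U"))    # leading zeros omitted, ended by a space
```

Under this model the lowercase form stops after four digits, which matches the "ᵃ4" reported in the original question, while the uppercase form reads the whole code point.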

              A limitation of current versions of Vim is that codepoints above U+FFFF are
              displayed as question marks (with the proper width). The data is entered
              correctly into the file, and ga and g8 show the correct values. Someone
              (Edward L. Fox IIRC) said he'd look into it but I haven't heard from him about
              it recently.
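
When the screen shows only a question mark, the values reported by ga and g8 can be cross-checked outside Vim. The Python sketch below computes the code point and the UTF-8 bytes for a character; the helper name `describe` is hypothetical, and the exact layout of Vim's own ga/g8 output is not reproduced here.

```python
import unicodedata


def describe(ch):
    """Return code point details for one character, for comparison with ga and g8."""
    cp = ord(ch)
    return {
        "name": unicodedata.name(ch),
        "decimal": cp,
        "hex": f"{cp:x}",
        # UTF-8 bytes as space-separated hex pairs
        "utf8": " ".join(f"{b:02x}" for b in ch.encode("utf-8")),
        # True for U+0000..U+FFFF, the range Ctrl-V u can reach
        "in_bmp": cp <= 0xFFFF,
    }


if __name__ == "__main__":
    for key, value in describe(chr(0x1D434)).items():
        print(f"{key}: {value}")
```

For U+1D434 this gives decimal 119860 and the four UTF-8 bytes f0 9d 90 b4, so a file containing that byte sequence holds the right character even if the display does not show it.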



              Best regards,
              Tony.
              --
              Reality is just a convenient measure of complexity.
              -- Alvy Ray Smith

            • Tony Mechelynck
              Message 6 of 13, Aug 30, 2007
                Kenneth Beesley wrote:
                [...]
                > 2. That (I believe) will put the Math Uppercase A into the buffer,
                > and you can write the buffer out to
                > file (e.g. in UTF-8) successfully, that doesn't mean
                > that vim can display/render it. The last I heard (months ago) was
                > that vim was generally unable
                > to render Supplementary chars, even if you specify a font that
                > contains the glyphs you need in the
                > supplementary area. The situation on Linux may be different.
                > Updates/corrections from vim experts would be welcome.
                [...]

                It's the same on Linux: AFAIK, that's a platform-independent limitation of
                current (and past) versions of gvim. IIRC, Edward L. Fox had said he'd look
                into it but I don't know how far he got.


                Best regards,
                Tony.
                --
                A wanton young lady from Wimley
                Reproached for not acting quite primly
                Said, "Heavens above!
                I know sex isn't love,
                But it's such an entrancing facsimile."

              • Eze
                Message 7 of 13, Aug 31, 2007
                  Thanks a lot to you both for your insights. Ken, if I may ask, what
                  exactly do you use to see/work with/input unicode characters? All
                  information will be appreciated, such as linux distribution, desktop
                  manager, text editor, etcetera.

                  Best regards,

                  Eze


                • Tony Mechelynck
                  Message 8 of 13, Aug 31, 2007
                    Eze wrote:
                    > Thanks a lot to you both for your insights. Ken, if I may ask, what
                    > exactly do you use to see/work with/input unicode characters? All
                    > information will be appreciated, such as linux distribution, desktop
                    > manager, text editor, etcetera.
                    >
                    > Best regards,
                    >
                    > Eze

                    I don't know what Ken does, but I use gvim to input any unicode codepoints,
                    and any browser (Firefox, SeaMonkey, or, depending on platform, Konqueror, IE,
                    Safari, etc.) to visualise those outside the BMP.


                    Best regards,
                    Tony.
                    --
                    Boob's Law:
                    You always find something in the last place you look.

                  • Kenneth Beesley
                    Message 9 of 13, Sep 4, 2007
                      Hi Eze,

                      For my Unicode editing needs, I try to survey the field once or twice
                      a year. It's been a while since I last looked, so my information is
                      probably out of date. I can't keep up with all the Unicode-editing
                      options.

                      My Unicode-editing needs are somewhat unusual. I occasionally need to
                      type Arabic script, and I definitely need Unicode combining diacritics
                      and supplementary characters. I insist on being able to write my own
                      input methods, and I'd like a solution that works in OS X, Linux and
                      perhaps even Windows. I haven't found a perfect solution yet for my
                      needs.

                      On the Mac, which I use most often, TextEdit (supplied with OS X) does
                      a much better than average job of _rendering_ the Unicode characters
                      that you type. It has a built-in set of default fonts that so far have
                      rendered almost anything that I've wanted to type, including Shavian
                      and Deseret. Combining diacritics are (to the extent that I've tested
                      them) handled acceptably, even rather well--this is a weak point in
                      many other allegedly Unicode-savvy editors. However, if you are used
                      to a full-featured text editor like vim or emacs, then TextEdit hardly
                      seems like a text editor at all. Too limited in commands and overall
                      functionality. I'm glad that TextEdit is available, but I use it
                      reluctantly.

                      TextEdit can use Apple Input Methods, many of which are supplied, and
                      you can (with some difficulty) define your own so that you can type in
                      Arabic, Cyrillic, Greek, Shavian, Deseret or whatever using your own
                      favorite keyboard mapping or input method. I'm a firm believer that
                      you ought to be able to define your own personal input methods (or
                      keyboard-layout emulations) so that you can do it Your Way, even if
                      dozens of input methods are already available. There are (or were)
                      some bugs in the interpretation of Apple Input Methods, and fixing
                      them seems to be very low priority at Apple. I need to recheck the
                      status.

                      I need to take another look at the commercial text editors available
                      for OS X.

                      I also work a lot with Unicode in XML, and I have purchased a license
                      for the oXygen XML editor. oXygen is Java-based and so can use Java
                      Input Methods, which are much better documented and easier to define
                      than Apple Input Methods. oXygen can also be used to edit plain-text
                      Unicode files. It renders Unicode to the extent that Java Swing text
                      widgets render Unicode, which is pretty well. Installing new Unicode
                      TrueType/OpenType fonts inside your Java installation, to allow the
                      rendering of exotic characters, can be a challenge for the casual
                      user.

                      In addition to the commercial oXygen, there are a few other Java-based
                      text editors that you might explore. I need to look at them again.
                      Typically such editors are based on Java Swing text widgets, can use
                      TrueType or OpenType fonts, and Java Input Methods. You can define
                      your own Java Input Methods, but it'll be hard if you're not a hacker.
                      The freely available kmap_ime.jar and kmap_ime_gui.jar are
                      Java-Input-Method wrappers that allow you to use input methods
                      expressed as Yudit-style .kmap files as if they were Java Input
                      Methods. (Yudit .kmap files are very similar in format and semantics
                      to the vim keymap files.)

                      The Yudit editor is notable for its flexible handling of fonts,
                      rendering Unicode, and allowing you to define your own input methods
                      easily, but like TextEdit it hardly seems like a text editor at all to
                      someone used to emacs or vim.

                      Traditionally I've used emacs, but emacs does not use Unicode
                      internally, instead providing what I find to be an awkward and very
                      incomplete way of mapping between its MULE-encoded internal
                      representation and Unicode files on input/output. In practice, the set
                      of input methods available for emacs is MULE-based and closed. emacs
                      has seriously dragged its feet on Unicode implementation.

                      When it comes to Unicode implementation, vim is (in my opinion) much
                      more promising than emacs. Vim seems to do an excellent internal job
                      of reading, editing, and writing Unicode. Vim keymaps, for typing in
                      Unicode chars, are _very_ easy to define or modify, and they fit my
                      needs perfectly. The remaining problems (from my point of view) with
                      vim are these:

                      1. Failure to render Unicode characters from the supplementary area
                      (I can't edit a screen full of question marks)
                      2. The limitation to fixed-width fonts (A profound
                      nuisance/limitation. Vim on Linux can use variable-width fonts, but
                      it still works much better with fixed-width fonts.)

                      On Linux, consider Java-based solutions such as oXygen. In Gnome
                      there's gedit, but (the last time I looked) the definition and
                      addition of new input methods for gedit was poorly documented and
                      required some background hacking. I managed it once, but it's not
                      acceptably easy or acceptably documented, in my opinion.

                      I'm not acquainted with KDE (the alternative to Gnome in Linux). Is
                      anyone out there acquainted with the kedit editor?

                      I'm not acquainted with Microsoft/PC solutions.

                      I need to look at OpenOffice solutions.

                      Corrections/Comments/Suggestions would be Very Welcome.

                      I don't have an axe to grind--I just need to edit Unicode (including
                      Arabic, Cyrillic, Supplementary Characters, Combining Diacritics) and
                      I insist on being able to write my own input methods. I'd like a
                      solution (with input methods) that works across multiple operating
                      systems. I'd like to use TrueType/OpenType fonts, without a
                      fixed-width limitation, and be able to use virtual fonts that combine
                      glyphs from a set of user-designated real fonts. And I want a
                      full-featured user-interface like that in vim or emacs.

                      I would welcome pointers to other Unicode-editing solutions that I
                      may have overlooked.

                      Ken






                    • Kenneth Beesley
                      Message 10 of 13, Sep 4, 2007
                        Tony,

                        If I were just typing in a Supplementary character here and there,
                        or even an isolated word, I would use a similar solution.

                        However, I'm editing (proofreading) chapter-length texts consisting of
                        supplementary characters, and when I open such a text in vim and
                        see nothing but a screenful of question marks, you can imagine
                        my disappointment.

                        Best wishes,

                        Ken





                      • Nico Weber
                        Message 11 of 13, Sep 23, 2007
                          Hi,

                          > I'd like to switch over to vim, but I work a lot with exotic Unicode
                          > characters
                          > in the supplementary area. When I last looked into vim, and
                          > experimented
                          > with keymaps, I found that I could easily enter any Unicode char, and
                          > save
                          > the results to file---and the Unicode chars in the file were
                          > correct. But as
                          > long as I couldn't _see_ my character glyphs rendered on the screen,
                          > vim wasn't
                          > acceptable as an editor. All I could see were boxes (or question
                          > marks--I can't
                          > remember which).

                          This should be fixed with the current svn version, at least for gvim
                          (if you have the necessary fonts).

                          Nico

                        • Kenneth Beesley
                          Message 12 of 13, Sep 23, 2007
                            Nico,

                            This is great news. Many thanks for the message.

                            Ken




                          • Tony Mechelynck
                            Message 13 of 13, Sep 23, 2007
                              Nico Weber wrote:
                              > Hi,
                              >
                              >> I'd like to switch over to vim, but I work a lot with exotic Unicode
                              >> characters
                              >> in the supplementary area. When I last looked into vim, and
                              >> experimented
                              >> with keymaps, I found that I could easily enter any Unicode char, and
                              >> save
                              >> the results to file---and the Unicode chars in the file were
                              >> correct. But as
                              >> long as I couldn't _see_ my character glyphs rendered on the screen,
                              >> vim wasn't
                              >> acceptable as an editor. All I could see were boxes (or question
                              >> marks--I can't
                              >> remember which).
                              >
                              > This should be fixed with the current svn version, at least for gvim
                              > (if you have the necessary fonts).
                              >
                              > Nico

                              Yes: even for people who don't use SVN (but CVS, A-A-P, ftp, whatever), it is
                              patch 7.1.116, and works for me.


                              Best regards,
                              Tony.
                              --
                              "Water? Never touch the stuff! Fish fuck in it."
                              -- W. C. Fields
