Bug report: mapping fails with a few characters (e.g. « :imap ’ foo » fails)

  • thomas
    Message 1 of 8, Mar 1, 2007
      Hi

      Mapping seems to be buggy with some characters.
      For instance:

      :imap ’ foo

      does not work (the apostrophe is U+2019). If the mapped string
      contains this apostrophe but does not begin with it, there is no
      problem. For instance, this works:

      :imap x’ foo

      But it is impossible to map a string beginning with this apostrophe,
      and the <Char-0x2019> construct does not help. This is weird, because
      there is no problem with mapping the usual apostrophe U+0027. Other
      paradoxes can be found: it is possible to map the no-break space
      (U+00A0) but not its thin version (U+202F), the usual minus sign
      (U+002D) but not the en-dash (U+2013) and the em-dash (U+2014).

      I first thought vim had a problem with mapping multibyte characters
      but it actually deals well with most of them. Any explanation, why the
      mapping does not work with some characters?

      Thanks for your help

      Thomas
    • A.J.Mechelynck
      Message 2 of 8, Mar 1, 2007
        thomas wrote:
        > Hi
        >
        > Mapping seems to be buggy with some characters.
        > For instance:
        >
        > :imap ’ foo
        >
        > does not work (the apostrophe is U+2019). If the mapped string
        > contains this apostrophe but does not begin with it, there is no
        > problem. For instance, this works:
        >
        > :imap x’ foo
        >
        > But it is impossible to map a string beginning with this apostrophe,
        > and the <Char-0x2019> construct does not help. This is weird, because
        > there is no problem with mapping the usual apostrophe U+0027. Other
        > paradoxes can be found: it is possible to map the no-break space
        > (U+00A0) but not its thin version (U+202F), the usual minus sign
        > (U+002D) but not the en-dash (U+2013) and the em-dash (U+2014).
        >
        > I first thought vim had a problem with mapping multibyte characters
        > but it actually deals well with most of them. Any explanation, why the
        > mapping does not work with some characters?
        >
        > Thanks for your help
        >
        > Thomas
        >

        What is 'encoding' set to? Using multibyte characters (e.g. in a mapping) will
        only work if 'encoding' (which defines how characters are represented
        internally in Vim memory) is set to an appropriate multibyte setting
        beforehand, for instance like this:

        if has("multi_byte")
          " if 'encoding' is already Unicode, there is no need to set it again
          if &enc !~? '^u'
            if &tenc == ""
              " avoid clobbering the keyboard encoding
              let &tenc = &enc
            endif
            set enc=utf-8
          endif
          setglobal bomb fenc=latin1 " defaults for new files
          scriptencoding utf-8
          " map U+2019 (right single quotation mark) in Insert mode
          imap <Char-0x2019> foo
        else
          echomsg "Warning: No multibyte support compiled-in!"
        endif



        Best regards,
        Tony.
        --
        "A radioactive cat has eighteen half-lives."
      • thomas
        Message 3 of 8, Mar 2, 2007
          2007/3/2, A.J.Mechelynck <antoine.mechelynck@...>:
          > > Mapping seems to be buggy with some characters.
          > > For instance:
          > >
          > > :imap ’ foo
          > >
          > > does not work (the apostrophe is U+2019).
          >
          > What is 'encoding' set to? Using multibyte characters (e.g. in a mapping) will
          > only work if 'encoding' (which defines how characters are represented
          > internally in Vim memory) is set to an appropriate multibyte setting
          > beforehand

          The encoding is set to utf-8. My point is, mapping works with some
          multibyte characters, but not all of them. For example:

          :imap ∀ foo

          ... that is, mapping the "forall" symbol (U+2200), works.

          Since I can map some multibyte characters, there is in my opinion no
          issue with the encoding.
          The question is: why is it possible to map U+2200 but not U+2019?

          Regards,
          thomas
        • A.J.Mechelynck
          Message 4 of 8, Mar 2, 2007
            thomas wrote:
            > 2007/3/2, A.J.Mechelynck <antoine.mechelynck@...>:
            >> > Mapping seems to be buggy with some characters.
            >> > For instance:
            >> >
            >> > :imap ’ foo
            >> >
            >> > does not work (the apostrophe is U+2019).
            >>
            >> What is 'encoding' set to? Using multibyte characters (e.g. in a
            >> mapping) will
            >> only work if 'encoding' (which defines how characters are represented
            >> internally in Vim memory) is set to an appropriate multibyte setting
            >> beforehand
            >
            > The encoding is set to utf-8. My point is, mapping works with some
            > multibyte characters, but not all of them. For example:
            >
            > :imap ∀ foo
            >
            > ... that is, mapping the "forall" symbol (U+2200), works.
            >
            > Since I can map some multibyte characters, there is in my opinion no
            > issue with the encoding.
            > The question is: why is it possible to map U+2200 but not U+2019?
            >
            > Regards,
            > thomas
            >

            Hm. It just might be a bug, but Bram would be better able than me to check this.

            I can map <Char-0x2019> but I cannot really test that it works (I can just see
            that it is represented correctly in the list of mappings), because that
            character is not on my keyboard.

            U+2019 is encoded as E2 80 99 while U+2200 is E2 88 80. I wonder if the
            presence of a 0x80 in the middle might cause a bug in gvim.
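
The two byte sequences quoted above can be checked outside Vim. The following is a minimal sketch in Python, not something from the thread; the helper name `utf8_hex` is made up for illustration.

```python
def utf8_hex(codepoint: int) -> str:
    """Return the UTF-8 encoding of a codepoint as space-separated hex bytes."""
    return " ".join(f"{byte:02X}" for byte in chr(codepoint).encode("utf-8"))


# The two codepoints compared in the message above.
for cp in (0x2019, 0x2200):
    print(f"U+{cp:04X} -> {utf8_hex(cp)}")
```

Both encodings begin with E2, and only U+2019 has 0x80 as its second byte. That is consistent with the hypothesis, but it does not by itself show where the problem lies in Vim.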

            Did you try the code snippet in my previous post? If it works, then we can
            start digging into why your previous method didn't work. If it doesn't, we
            should have a clear-cut testcase for others to try and reproduce.


            Best regards,
            Tony.
            --
            hundred-and-one symptoms of being an internet addict:
            80. At parties, you introduce your spouse as your "service provider."
          • thomas
            Message 5 of 8, Mar 2, 2007
              2007/3/2, A.J.Mechelynck <antoine.mechelynck@...>:
              > I can map <Char-0x2019> but I cannot really test that it works (I can just see
              > that it is represented correctly in the list of mappings), because that
              > character is not on my keyboard.

              You do not need to type this character to test the mapping; you can
              simply paste it with the mouse while in Insert mode. (By the
              way, there is a useful package in Debian simply called "unicode" which
              makes it easy to look up Unicode characters from the shell
              prompt.)

              > U+2019 is encoded as E2 80 99 while U+2200 is E2 88 80. I wonder if the
              > presence of a 0x80 in the middle might cause a bug in gvim.

              The narrow no-break space is e2 80 af; the en-dash, e2 80 93; the
              em-dash, e2 80 94; the hyphen, e2 80 90; and the mapping of these
              characters fails, so you might have a valid explanation, Tony.
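
The pattern in the characters reported so far can be tabulated the same way. The sketch below is illustrative only: the table of outcomes simply restates what was reported in this thread, and `second_byte_is_0x80` is a hypothetical helper, not anything in Vim.

```python
# Outcomes as reported by the posters in this thread (not re-tested here).
REPORTED = {
    0x0027: "works",  # apostrophe
    0x002D: "works",  # hyphen-minus
    0x00A0: "works",  # no-break space
    0x2200: "works",  # for all
    0x2010: "fails",  # hyphen
    0x2013: "fails",  # en dash
    0x2014: "fails",  # em dash
    0x2019: "fails",  # right single quotation mark
    0x202F: "fails",  # narrow no-break space
}


def second_byte_is_0x80(codepoint: int) -> bool:
    """True if the codepoint's UTF-8 form is three bytes with 0x80 in the middle."""
    encoded = chr(codepoint).encode("utf-8")
    return len(encoded) == 3 and encoded[1] == 0x80


for cp, outcome in REPORTED.items():
    hex_bytes = " ".join(f"{b:02x}" for b in chr(cp).encode("utf-8"))
    print(f"U+{cp:04X}  {hex_bytes:<8}  {outcome}  middle 0x80: {second_byte_is_0x80(cp)}")
```

For every character listed, the reported failure coincides with a 0x80 second byte. This is a correlation over nine samples, not a diagnosis.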

              > Did you try the code snippet in my previous post?

              Yes. It did not report any error, the apostrophe appears in the list
              of mappings but when I type it, it is not replaced with "foo".

              Thanks
              Thomas
            • Bram Moolenaar
              Message 6 of 8, Mar 2, 2007
                Thomas wrote:

                > Mapping seems to be buggy with some characters.
                > For instance:
                >
                > :imap ’ foo
                >
                > does not work (the apostrophe is U+2019). If the mapped string
                > contains this apostrophe but does not begin with it, there is no
                > problem. For instance, this works:
                >
                > :imap x’ foo
                >
                > But it is impossible to map a string beginning with this apostrophe,
                > and the <Char-0x2019> construct does not help. This is weird, because
                > there is no problem with mapping the usual apostrophe U+0027. Other
                > paradoxes can be found: it is possible to map the no-break space
                > (U+00A0) but not its thin version (U+202F), the usual minus sign
                > (U+002D) but not the en-dash (U+2013) and the em-dash (U+2014).
                >
                > I first thought vim had a problem with mapping multibyte characters
                > but it actually deals well with most of them. Any explanation, why the
                > mapping does not work with some characters?

                How can you tell if the mapping works or not?

                You can see what a key actually produces with CTRL-V <key>. So when
                you type

                :imap CTRL-V <key> foo

                where CTRL-V is one key and <key> is the mapped key, does the
                mapping still not work?

                --
                hundred-and-one symptoms of being an internet addict:
                237. You tattoo your email address on your forehead.

                /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
                /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
                \\\ download, build and distribute -- http://www.A-A-P.org ///
                \\\ help me help AIDS victims -- http://ICCF-Holland.org ///
              • A.J.Mechelynck
                Message 7 of 8, Mar 2, 2007
                  thomas wrote:
                  > 2007/3/2, A.J.Mechelynck <antoine.mechelynck@...>:
                  >> I can map <Char-0x2019> but I cannot really test that it works (I can
                  >> just see
                  >> that it is represented correctly in the list of mappings), because that
                  >> character is not on my keyboard.
                  >
                  > You do not need to type this character to test the mapping; you can
                  > simply paste it with the mouse while in Insert mode. (By the
                  > way, there is a useful package in Debian simply called "unicode" which
                  > makes it easy to look up Unicode characters from the shell
                  > prompt.)
                  >
                  >> U+2019 is encoded as E2 80 99 while U+2200 is E2 88 80. I wonder if the
                  >> presence of a 0x80 in the middle might cause a bug in gvim.
                  >
                  > The narrow no-break space is e2 80 af; the en-dash, e2 80 93; the
                  > em-dash, e2 80 94; the hyphen, e2 80 90; and the mapping of these
                  > characters fails, so you might have a valid explanation, Tony.
                  >
                  >> Did you try the code snippet in my previous post?
                  >
                  > Yes. It did not report any error, the apostrophe appears in the list
                  > of mappings but when I type it, it is not replaced with "foo".
                  >
                  > Thanks
                  > Thomas
                  >

                  After testing all AltGr combinations, I find that AltGr-Shift-b is ’ U+2019,
                  your apostrophe-like symbol, and I can reproduce your problem:

                  Bug report
                  ==========
                  Summary: 3-byte UTF-8 codepoints whose middle byte is 0x80 are not recognised
                  to invoke a mapping.

                  Description:
                  When 'encoding' is UTF-8, if codepoints whose UTF-8 representation includes
                  0x80 in the second of three bytes (such as U+2018, upper-6 quote; U+2019,
                  upper-9 quote; U+201C double upper-6 quote; U+201D double upper-9 quote) are
                  used as the {lhs} of a mapping, these mappings will appear in the list, but
                  they will not be recognised when typed at the keyboard.

                  Vim Version affected:
                  VIM - Vi IMproved 7.0 (2006 May 7, compiled Feb 27 2007 23:05:12)
                  Included patches: 1-204
                  Compiled by antoine.mechelynck@...
                  Huge version with GTK2-GNOME GUI. Features included (+) or not (-):
                  [...]

                  Reproducible: Every time.
                  Steps to reproduce:
                  1. Find a computer whose keyboard can produce at least one character in
                  the range U+2000 to U+203F. (On openSUSE Linux 10.2 with KDE and the
                  fr_BE "azerty" keyboard, AltGr-v, AltGr-V, AltGr-b and AltGr-B produce such
                  codepoints.)
                  2. Make sure that your version of gvim has +multi_byte and that 'encoding' is
                  set to UTF-8.
                  3. Define a mapping with such a codepoint in the {lhs}. Example:

                  :map! ’ foo

                  where the {lhs} is U+2019, upper-9 single quote.
                  4. List the mappings (":map!" with no argument). Notice that the mapping
                  defined at step 3 is listed.
                  5. In Insert mode, hit the key corresponding to the {lhs} of the new mapping
                  (here: AltGr-Shift-b to trigger ’).

                  Actual behaviour: The character is inserted literally; the mapping is not invoked.

                  Expected behaviour: The mapping should have been invoked.
                  ==============
                  End bug report
                  ==============
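
The relationship between the summary ("middle byte is 0x80") and the range named in step 1 (U+2000 to U+203F) can be worked out mechanically. The sketch below is an illustration in Python, not part of the report, and `middle_byte_0x80_codepoints` is a hypothetical helper name. It enumerates every codepoint whose three-byte UTF-8 form has 0x80 as its second byte. Only the E2 80 xx block was actually tested in this thread; whether the other blocks misbehave in Vim is not established here.

```python
def middle_byte_0x80_codepoints():
    """List codepoints whose UTF-8 encoding is three bytes with 0x80 in the middle."""
    found = []
    # U+0800..U+FFFF is exactly the range that UTF-8 encodes in three bytes.
    for cp in range(0x0800, 0x10000):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogates have no UTF-8 encoding
        if chr(cp).encode("utf-8")[1] == 0x80:
            found.append(cp)
    return found


candidates = middle_byte_0x80_codepoints()

# Restrict to the block whose lead byte is E2, the one exercised in the thread.
e2_block = [cp for cp in candidates if 0x2000 <= cp <= 0x2FFF]
print(f"E2 80 xx covers U+{e2_block[0]:04X} to U+{e2_block[-1]:04X}")
print(f"{len(candidates)} codepoints in total have 0x80 as the middle byte")
```

The E2 80 xx block is exactly the range given in step 1, and the four quotation marks named in the description (U+2018, U+2019, U+201C, U+201D) all fall inside it.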

                  Best regards,
                  Tony.
                  --
                  Have you ever wondered what makes Californians so calm? Besides drugs,
                  I mean. The answer is hot tubs. A hot tub is a redwood container
                  filled with water that you sit in naked with members of the opposite
                  sex, none of whom is necessarily your spouse. After a few hours in
                  their hot tubs, Californians don't give a damn about earthquakes or
                  mass murderers. They don't give a damn about anything, which is why
                  they are able to produce "Laverne and Shirley" week after week.
                  -- Dave Barry, "The Taming of the Screw"
                • A.J.Mechelynck
                  Message 8 of 8, Mar 2, 2007
                    Bram Moolenaar wrote:
                    [...]
                    > How can you tell if the mapping works or not?
                    >
                    > You can see what a key actually produces with CTRL-V <key> . So when
                    > you type
                    > :imap CTRL-V <key> foo
                    >
                    > Where CTRL-V is one key and <key> is the mapped key.
                    > Does the mapping still not work?
                    >

                    When I type

                    :map! “ -

                    where the {lhs}, U+201C (double high 6 quote), is produced on my keyboard by
                    AltGr-v, then

                    :map!

                    lists the mapping with “ (the character in question) in the {lhs}, but hitting
                    AltGr-v in Insert mode inserts “ (the {lhs}), not - (the {rhs}).

                    Using Ctrl-V before the key when defining the mapping makes no change: hitting
                    the key still doesn't invoke the mapping, but ":map! “" (again, with or
                    without Ctrl-V) lists it.


                    Best regards,
                    Tony.
                    --
                    The grand leap of the whale up the Fall of Niagara is esteemed, by all
                    who have seen it, as one of the finest spectacles in nature.
                    -- Benjamin Franklin.