Re: BUG: Unicode characters in commands
Sep 29, 2008

I'll add it to the todo list. Don't expect a solution soon...

Matt Wozniski wrote:
> On Sun, Sep 28, 2008 at 4:35 PM, Tony Mechelynck wrote:
> >> On Sun, Sep 28, 2008 at 9:40 AM, John Hughes wrote:
> >>> I am trying to write a command that substitutes some Ascii characters
> >>> with a Unicode character. The following substitution works when
> >>> entered directly:
> >>> :%s/\.\.\./…/eg
> >>> However, when defined as a command, it does not work:
> >>> :com Ellipsis %s/\.\.\./…/eg
> >>> The command :Ellipsis converts
> >>> ...
> >>> into
> >>> â<80><fe>X¦
> >>> Why is this? Is there any way of using Unicode characters in
> >>> substitute commands?
> > I'm using gvim 7.2.21, huge build with Gnome2 GUI and 'encoding' set to
> > UTF-8. Just like the OP, I see the following:
> > - Typing the :s command at the command-line works OK.
> > - Defining that :s command as a user-command text, then running that
> > user command, replaces every set of three dots by â<80><fe>X¦ (5
> > characters including two invalid UTF-8 sequences, 7 bytes viz. C3 A2 80
> > FE 58 C2 A6).
> > - Recalling that command definition with ":command Ellipsis" displays
> > the ellipsis character as an ellipsis.
> > - The ellipsis is U+2026, in UTF-8 0xE2 0x80 0xA6. Notice that 80 and A6
> > appear (though not consecutively) as part of the replace-text actually
> > used, and that E2 is C3 A2 which also appears. This makes me suspect
> > that Vim is applying a spurious Latin1-to-UTF8 conversion to what is
> > already UTF-8 (with something wrong, maybe buffer-overflow, happening in
> > the middle). Another possibility would be using a "character length"
> > instead of a "byte length", or vice-versa, at some point in the
> > user-command execution.
> I can confirm this. It looks to me like it's not a spurious
> Latin1-UTF8 conversion, but an internally-escaped string that's not
> un-escaped before being used. Sourcediving, it seems that
> mb_unescape() is called to un-escape any multibyte characters when
> displaying the command, but that mb_unescape() is never called before
> the command is passed to do_cmdline() to be executed. That seems to
> explain why it's displayed properly but executed incorrectly. I don't
> completely follow all of the string escaping being done here, though,
> so Bram knows for sure. I've cross-posted to the vim-dev list
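
The byte values reported above can be checked with a small sketch. It is
an illustration only, not Vim's code: it assumes that the internal escaping
replaces the byte 0x80 with the three bytes 0x80 0xFE 0x58 (which is the
sequence visible in the reported output), and the function names
escape_special() and unescape_special() are hypothetical.

```python
# Illustration only; not Vim source code.
# Assumption: the internal escape turns byte 0x80 into 0x80 0xFE 0x58,
# matching the "80 FE 58" seen in the reported replacement text.
SPECIAL = 0x80
ESCAPED = bytes([0x80, 0xFE, 0x58])


def escape_special(data: bytes) -> bytes:
    """Hypothetical helper: escape every 0x80 byte as 0x80 0xFE 0x58."""
    out = bytearray()
    for b in data:
        if b == SPECIAL:
            out += ESCAPED
        else:
            out.append(b)
    return bytes(out)


def unescape_special(data: bytes) -> bytes:
    """Hypothetical helper: undo escape_special()."""
    return data.replace(ESCAPED, bytes([SPECIAL]))


utf8 = "\u2026".encode("utf-8")           # U+2026 in UTF-8: e2 80 a6
escaped = escape_special(utf8)            # e2 80 fe 58 a6
restored = unescape_special(escaped)      # back to e2 80 a6

# The seven bytes reported in the thread.
observed = bytes.fromhex("c3a280fe58c2a6")

# E2 and A6, read as Latin1 and re-encoded as UTF-8, give the C3 A2 and
# C2 A6 pairs that surround the escape sequence in the observed bytes.
first = bytes([0xE2]).decode("latin-1").encode("utf-8")   # c3 a2
last = bytes([0xA6]).decode("latin-1").encode("utf-8")    # c2 a6

print(escaped.hex(" "), "|", observed.hex(" "))
```

Under that assumption, the observed text is what one would expect if the
escaped form (E2 80 FE 58 A6) reached the substitute command without being
un-escaped first, with the two stray high bytes then shown as Latin1
characters. This is consistent with both observations in the thread, but it
does not prove where in Vim the missing un-escape step is.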
--
hundred-and-one symptoms of being an internet addict:
116. You are living with your boyfriend who networks your respective
computers so you can sit in separate rooms and email each other
/// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ download, build and distribute -- http://www.A-A-P.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///