Re: multibyte in patterns
- Benji Fisher wrote:
> so it looks pretty good to me. The second =~ test is a little strange,
> but should probably work this way for backward compatibility.
Thanks for testing. It's a matter of taste whether foo =~ bar should
result in TRUE or FALSE. Let's just leave it as it is until someone has
a good reason why it should be different.
> On the question of changing "\x" or adding "\u":
> * Since vim is a *text* editor, I am not convinced that it should be
> able to enter invalid bytes into my document. (I admit that
> :put=\"\xe4\" does not count as entering a character *easily*.) Perhaps
> it would be better to make "\x" act like the new "\u" after all.
There are always exceptions, e.g. when 'encoding' is not properly set or
when intentionally creating illegal bytes. I don't think we have a good
reason to forbid inserting any byte value.
> * By habit and because of legacy scripts, people will continue to use
> "\x". I assume that the new "\u" will be recommended for most purposes
> (and the docs will mention this). It will take a while for people to
> adjust. Again, this argues for using "\x" to insert valid bytes, and
> adding a new construct for arbitrary bytes.
Existing scripts that use "\xab" to insert valid UTF-8 bytes should keep
on working; that's another reason why changing the meaning of "\xab" is
a bad idea.
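
To illustrate the byte-versus-character distinction discussed here, a minimal sketch, assuming a Vim whose string syntax includes the "\u" escape. The variable names are made up for the example.

```vim
" "\x" stores exactly the byte given, regardless of 'encoding'.
let g:raw_byte = "\xab"

" "\u" stores the character U+00AB according to the current 'encoding'.
let g:guillemet = "\u00ab"

" strlen() counts bytes: this is always 1.
echo strlen(g:raw_byte)

" This is 2 when 'encoding' is utf-8 (bytes 0xC2 0xAB), 1 with latin1.
echo strlen(g:guillemet)
```

With 'encoding' set to utf-8, the lone byte in g:raw_byte is not a valid character on its own, which is the case the "\x" versus "\u" question is about.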
> Final question: I want my script to be able to insert "«" without
> forcing users to adopt the latest patched vim. (I am thinking of the
> LaTeX suite.) Instead of
> :let foo = "\uab"
> with this patch, should
> :let foo = iconv("\xab", "latin1", &enc)
> have the same effect? It seems to work, as far as I can tell.
If iconv() is supported it should work, so long as 'encoding' does
support a character to represent the latin1 "\xab" character (not all
8-bit encodings have it).
hundred-and-one symptoms of being an internet addict:
269. You receive an e-mail from the wife of a deceased president, offering
to send you twenty million dollar, and you are not even surprised.
/// Bram Moolenaar -- Bram@... -- http://www.moolenaar.net \\\
/// Creator of Vim - Vi IMproved -- http://www.vim.org \\\
\\\ Project leader for A-A-P -- http://www.a-a-p.org ///
\\\ Lord Of The Rings helps Uganda - http://iccf-holland.org/lotr.html ///
- Antoine J. Mechelynck wrote:
> Benji Fisher <benji@...> wrote:
>> Final question: I want my script to be able to insert "«" without
>> forcing users to adopt the latest patched vim. (I am thinking of the
>> LaTeX suite.) Instead of
>> :let foo = "\uab"
>> with this patch, should
>> :let foo = iconv("\xab", "latin1", &enc)
>> have the same effect? It seems to work, as far as I can tell.
> have you tried it with encodings for which there is no equivalent for that
> latin-1 character? (Iconv fails: what happens then?)
No, I have only tried it with utf-8 and latin1. What other
encodings should I try?
> Best wishes -- and a happy New Year
Thanks!
- Benji Fisher <benji@...> wrote:
> Antoine J. Mechelynck wrote:
[...]
> > have you tried it with encodings for which there is no equivalent for
> > that latin-1 character? (Iconv fails: what happens then?)
> No, I have only tried it with utf-8 and latin1. What other
> encodings should I try?
As many as possible, of course; but this is not really an answer. Maybe you
could start, if you have them, with Central-European and Turkish encodings,
then if it works OK, with more esoteric ones like Greek, Cyrillic, Big5,
sjis, euc-kr,... and wouldn't digraphs << and >> need to be switched around
for right-to-left languages like Hebrew, Farsi and Arabic? -- As you see,
I'm thinking of what the plugin would need in order to be as general as
possible, for as many users as possible. Also, as could be inferred from
Bram's post of a few minutes ago, maybe there ought to be a fallback if
iconv() fails for any reason, and in particular if !has("iconv")...
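
One possible shape for such a fallback, as a rough sketch only: the function name LeftGuillemet() and the ASCII stand-in "<<" are invented for the example, and the failure checks rely on what ":help iconv()" describes (an empty string when conversion fails completely, "?" for characters that cannot be converted).

```vim
" Hypothetical helper: return the left guillemet in the current 'encoding',
" or an ASCII stand-in when it cannot be produced.
function! LeftGuillemet() abort
  " With latin1 the byte 0xAB already is the character; no conversion needed.
  if &encoding ==# 'latin1'
    return "\xab"
  endif
  " Assumption taken from the Vim help: without the +iconv feature only
  " latin1 <-> UTF-8 conversion is available, so other targets are skipped.
  if has('iconv') || &encoding ==# 'utf-8'
    let l:converted = iconv("\xab", 'latin1', &encoding)
    " Treat an empty result or a lone '?' as a failed conversion.
    if l:converted !=# '' && l:converted !=# '?'
      return l:converted
    endif
  endif
  " Last resort: plain ASCII that every encoding can represent.
  return '<<'
endfunction
```

A script could then use ":let foo = LeftGuillemet()" in place of the bare iconv() call. Whether "<<" is an acceptable substitute, and how right-to-left languages should be handled, is left open here.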
> > Best wishes -- and a happy New Year
> > Tony.
> --Benji Fisher