substitute() with zero width pattern breaks multi-byte character.

  • Yukihiro Nakadaira
    Message 1 of 4 , Jun 4, 2014
      substitute() with zero width pattern breaks multi-byte character.

      Steps to reproduce:
        $ vim -u NONE
        :set encoding=utf-8
        :echo substitute("\u00e1", '\zs', 'x', 'g')
        x<c3>x<a1>x

      Please check the following patch.


      diff -r bed71c37618c src/eval.c
      --- a/src/eval.c    Thu May 29 14:36:29 2014 +0200
      +++ b/src/eval.c    Wed Jun 04 20:44:48 2014 +0900
      @@ -24848,8 +24848,11 @@
               if (zero_width == regmatch.startp[0])
               {
                   /* avoid getting stuck on a match with an empty string */
      -            *((char_u *)ga.ga_data + ga.ga_len) = *tail++;
      -            ++ga.ga_len;
      +            i = MB_PTR2LEN(tail);
      +            mch_memmove((char_u *)ga.ga_data + ga.ga_len, tail,
      +                                    (size_t)i);
      +            ga.ga_len += i;
      +            tail += i;
                   continue;
               }
               zero_width = regmatch.startp[0];
      diff -r bed71c37618c src/testdir/test69.in
      --- a/src/testdir/test69.in    Thu May 29 14:36:29 2014 +0200
      +++ b/src/testdir/test69.in    Wed Jun 04 20:44:48 2014 +0900
      @@ -180,6 +180,13 @@
       byteidxcomp
       
       STARTTEST
      +/^substitute
      +:let y = substitute('123', '\zs', 'a', 'g')    | put =y
      +ENDTEST
      +
      +substitute
      +
      +STARTTEST
       :g/^STARTTEST/.,/^ENDTEST/d
       :1;/^Results/,$wq! test.out
       ENDTEST
      diff -r bed71c37618c src/testdir/test69.ok
      --- a/src/testdir/test69.ok    Thu May 29 14:36:29 2014 +0200
      +++ b/src/testdir/test69.ok    Wed Jun 04 20:44:48 2014 +0900
      @@ -160,3 +160,7 @@
       [0, 1, 3, 4, -1]
       [0, 1, 2, 4, 5, -1]
       
      +
      +substitute
      +a1a2a3a
      +
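
The effect of the eval.c change can be illustrated outside Vim with a minimal sketch. It is not Vim's code: `utf8_len()` and `insert_at_boundaries()` are hypothetical helpers, UTF-8 is assumed as the encoding, and the loop only imitates what `substitute(s, '\zs', sub, 'g')` does when every match is empty. Stepping one byte at a time splits the two-byte sequence for U+00E1; stepping by the character's byte length keeps it whole.

```c
#include <stddef.h>
#include <string.h>

/* Byte length of the UTF-8 sequence starting at p.
 * Returns 1 for ASCII and for malformed or truncated sequences,
 * so the caller always makes progress. (Hypothetical helper.) */
static size_t utf8_len(const unsigned char *p)
{
    size_t n, i;

    if (*p < 0x80)
        n = 1;
    else if ((*p & 0xE0) == 0xC0)
        n = 2;
    else if ((*p & 0xF0) == 0xE0)
        n = 3;
    else if ((*p & 0xF8) == 0xF0)
        n = 4;
    else
        return 1;                       /* stray continuation byte */

    for (i = 1; i < n; i++)
        if ((p[i] & 0xC0) != 0x80)
            return 1;                   /* truncated or malformed */
    return n;
}

/* Write sub before every unit of src and once more at the end.
 * step_by_char == 0: advance one byte per empty match (pre-patch behaviour).
 * step_by_char != 0: advance one whole character (post-patch behaviour).
 * dst must be large enough for the result, including the NUL. */
static void insert_at_boundaries(char *dst, const char *src,
                                 const char *sub, int step_by_char)
{
    const unsigned char *tail = (const unsigned char *)src;
    size_t sublen = strlen(sub);
    size_t len = 0;

    while (*tail != '\0')
    {
        size_t n = step_by_char ? utf8_len(tail) : 1;

        memcpy(dst + len, sub, sublen);  /* the replacement text */
        len += sublen;
        memcpy(dst + len, tail, n);      /* skip past the empty match */
        len += n;
        tail += n;
    }
    memcpy(dst + len, sub, sublen);      /* empty match at end of string */
    len += sublen;
    dst[len] = '\0';
}
```

Calling `insert_at_boundaries(out, "\xc3\xa1", "x", 0)` reproduces the broken `x<c3>x<a1>x` from the report, while passing `1` as the last argument keeps the character intact between the two `x` bytes.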

      --
      --
      You received this message from the "vim_dev" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php

      ---
      You received this message because you are subscribed to the Google Groups "vim_dev" group.
      To unsubscribe from this group and stop receiving emails from it, send an email to vim_dev+unsubscribe@....
      For more options, visit https://groups.google.com/d/optout.
    • Christ van Willegen
      Message 2 of 4 , Jun 4, 2014
        Hi,

        On Wed, Jun 4, 2014 at 1:47 PM, Yukihiro Nakadaira
        <yukihiro.nakadaira@...> wrote:
        > substitute() with zero width pattern breaks multi-byte character.
        >
        > Please check the following patch.
        >
        > diff -r bed71c37618c src/eval.c
        > --- a/src/eval.c Thu May 29 14:36:29 2014 +0200
        > +++ b/src/eval.c Wed Jun 04 20:44:48 2014 +0900
        > @@ -24848,8 +24848,11 @@
        >          if (zero_width == regmatch.startp[0])
        >          {
        >              /* avoid getting stuck on a match with an empty string */
        > -            *((char_u *)ga.ga_data + ga.ga_len) = *tail++;
        > -            ++ga.ga_len;
        > +            i = MB_PTR2LEN(tail);
        > +            mch_memmove((char_u *)ga.ga_data + ga.ga_len, tail,
        > +                                    (size_t)i);
        > +            ga.ga_len += i;
        > +            tail += i;
        >              continue;
        >          }

        Shouldn't you check for #ifdef FEAT_MBYTE somewhere in this patch?

        Christ van Willegen
        --
        09 F9 11 02 9D 74 E3 5B D8 41 56 C5 63 56 88 C0

      • Yukihiro Nakadaira
        Message 3 of 4 , Jun 4, 2014
          On Wed, Jun 4, 2014 at 9:21 PM, Christ van Willegen <cvwillegen@...> wrote:
          > Hi,
          >
          > On Wed, Jun 4, 2014 at 1:47 PM, Yukihiro Nakadaira
          > <yukihiro.nakadaira@...> wrote:
          > > substitute() with zero width pattern breaks multi-byte character.
          > >
          > > Please check the following patch.
          > >
          > > diff -r bed71c37618c src/eval.c
          > > --- a/src/eval.c    Thu May 29 14:36:29 2014 +0200
          > > +++ b/src/eval.c    Wed Jun 04 20:44:48 2014 +0900
          > > @@ -24848,8 +24848,11 @@
          > >          if (zero_width == regmatch.startp[0])
          > >          {
          > >              /* avoid getting stuck on a match with an empty string */
          > > -            *((char_u *)ga.ga_data + ga.ga_len) = *tail++;
          > > -            ++ga.ga_len;
          > > +            i = MB_PTR2LEN(tail);
          > > +            mch_memmove((char_u *)ga.ga_data + ga.ga_len, tail,
          > > +                                    (size_t)i);
          > > +            ga.ga_len += i;
          > > +            tail += i;
          > >              continue;
          > >          }
          >
          > Shouldn't you check for #ifdef FEAT_MBYTE somewhere in this patch?

          I think MB_PTR2LEN macro is enough for it.
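
The reasoning behind that reply can be sketched as follows. As far as I can tell, Vim of that era defines `MB_PTR2LEN` so that it expands to the constant 1 when `FEAT_MBYTE` is not compiled in, which would make an extra `#ifdef` at the call site unnecessary. The snippet below is only an illustration of that pattern under that assumption: every `DEMO_`/`demo_` name is hypothetical, and the length function handles UTF-8 lead bytes only.

```c
#include <stddef.h>

/* DEMO_FEAT_MBYTE is a hypothetical stand-in for FEAT_MBYTE.
 * Build with -DDEMO_FEAT_MBYTE to enable the multi-byte branch. */
#ifdef DEMO_FEAT_MBYTE
static int demo_has_mbyte = 1;          /* stand-in for a runtime flag */

/* Simplified UTF-8 length from the lead byte only (no validation). */
static int demo_utf_ptr2len(const unsigned char *p)
{
    if (*p < 0x80)           return 1;
    if ((*p & 0xE0) == 0xC0) return 2;
    if ((*p & 0xF0) == 0xE0) return 3;
    if ((*p & 0xF8) == 0xF0) return 4;
    return 1;
}
# define DEMO_MB_PTR2LEN(p) (demo_has_mbyte ? demo_utf_ptr2len(p) : 1)
#else
/* Without multi-byte support every character is exactly one byte. */
# define DEMO_MB_PTR2LEN(p) 1
#endif

/* Copy one character from src to dst and return the number of bytes
 * copied. The call site needs no #ifdef of its own: the macro already
 * collapses to 1 in a single-byte build. */
static size_t demo_copy_one_char(unsigned char *dst, const unsigned char *src)
{
    size_t i;
    size_t n = (size_t)DEMO_MB_PTR2LEN(src);

    for (i = 0; i < n; i++)
        dst[i] = src[i];
    return n;
}
```

Built without `-DDEMO_FEAT_MBYTE`, `demo_copy_one_char()` always copies a single byte, which matches what the pre-patch code did; built with it, the same call site copies the whole character.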
           
          --
          Yukihiro Nakadaira - yukihiro.nakadaira@...

        • Bram Moolenaar
          Message 4 of 4 , Jun 4, 2014
            Yukihiro Nakadaira wrote:

            > substitute() with zero width pattern breaks multi-byte character.
            >
            > Steps to reproduce:
            > $ vim -u NONE
            > :set encoding=utf-8
            > :echo substitute("\u00e1", '\zs', 'x', 'g')
            > x<c3>x<a1>x
            >
            > Please check the following patch.

            Thanks!

            --
            hundred-and-one symptoms of being an internet addict:
            266. You hear most of your jokes via e-mail instead of in person.

            /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
            /// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
            \\\ an exciting new programming language -- http://www.Zimbu.org ///
            \\\ help me help AIDS victims -- http://ICCF-Holland.org ///
