Loading ...
Sorry, an error occurred while loading the content.

[vim-multibyte] Re: multibyte patch

Expand Messages
  • Bram Moolenaar
    ... I understand the problem now. How about this patch to fix it: ... *************** ... { for (;;) { + #ifdef MULTI_BYTE + if (is_dbcs && dir 0 &&
    Message 1 of 11 , Mar 24, 2000
    • 0 Attachment
      Sung-Hyun Nam wrote:

      > Guess all the char is multibyte char.
      >
      > XY XY XY
      >
      > Guess cursor is on the first X. Now type 'f '.
      > by 'col += dir', it moves first 'Y'.
      > Because vim checks 'dir < 0' for trail byte,
      > Now cursor moves to ' ' by IsLeadByte().
      > by 'col += dir', cursor moves to 2nd 'X' (skips first ' ').
      > by IsLeadByte, cursor moves to second 'Y'.
      > by col += dir, cursor moves to second ' '.
      > and break by p[col]==c.

      I understand the problem now. How about this patch to fix it:

      *** search.c~ Sat Jan 15 21:12:19 2000
      --- search.c Fri Mar 24 11:08:45 2000
      ***************
      *** 1090,1095 ****
      --- 1090,1099 ----
      {
      for (;;)
      {
      + #ifdef MULTI_BYTE
      + if (is_dbcs && dir > 0 && IsLeadByte(p[col]))
      + ++col; /* advance two bytes for multibyte char */
      + #endif
      if ((col += dir) < 0 || col >= len)
      return FALSE;
      #ifdef MULTI_BYTE
      ***************
      *** 1098,1107 ****
      #endif
      if (p[col] == c)
      break;
      - #ifdef MULTI_BYTE
      - if (is_dbcs && dir > 0 && IsLeadByte(p[col]))
      - ++col; /* skip multibyte's trail byte */
      - #endif
      }
      }
      if (type)
      --- 1102,1107 ----

      The idea is that, when moving forward, when the cursor is currently on a
      leading byte, one extra column is skipped to avoid the trailing byte. When
      moving backwards, a trailing byte is detected and skipped (this didn't change).

      --
      MAN: Fetchez la vache!
      GUARD: Quoi?
      MAN: Fetchez la vache!
      "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

      /-/-- Bram Moolenaar --- Bram@... --- http://www.moolenaar.net --\-\
      \-\-- Vim: http://www.vim.org ---- ICCF Holland: http://www.vim.org/iccf --/-/
    • Park Chong-Dae
      ... How about this patch? Chong-Dae Park ... +++ search.c Fri Mar 24 20:11:25 2000 @@ -1090,18 +1090,16 @@ { for (;;) { - if ((col += dir) =
      Message 2 of 11 , Mar 24, 2000
      • 0 Attachment
        On Fri, Mar 24, 2000 at 11:52:16AM +0100, Bram Moolenaar wrote:
        > The idea is that, when moving forward, when the cursor is currently on a
        > leading byte, one extra column is skipped to avoid the trailing byte. When
        > moving backwards, a trailing byte is detected and skipped (this didn't change).

        How about this patch?

        Chong-Dae Park
      • Sung-Hyun Nam
        ... It seems work fine. namsh
        Message 3 of 11 , Mar 24, 2000
        • 0 Attachment
          Bram Moolenaar wrote:
          >
          > Sung-Hyun Nam wrote:
          >
          > > Guess all the char is multibyte char.
          > >
          > > XY XY XY
          > >
          > > Guess cursor is on the first X. Now type 'f '.
          > > by 'col += dir', it moves first 'Y'.
          > > Because vim checks 'dir < 0' for trail byte,
          > > Now cursor moves to ' ' by IsLeadByte().
          > > by 'col += dir', cursor moves to 2nd 'X' (skips first ' ').
          > > by IsLeadByte, cursor moves to second 'Y'.
          > > by col += dir, cursor moves to second ' '.
          > > and break by p[col]==c.
          >
          > I understand the problem now. How about this patch to fix it:

          It seems work fine.

          namsh
        • Taro Muraoka
          Hello, ... I understand too. And test Bram s patch, it works very well. This bug is a my mistake. Sorry all. ... Taro Muraoka
          Message 4 of 11 , Mar 24, 2000
          • 0 Attachment
            Hello,

            > Sung-Hyun Nam wrote:
            >
            > > Guess all the char is multibyte char.
            > >
            > > XY XY XY
            > >
            > > Guess cursor is on the first X. Now type 'f '.
            > > by 'col += dir', it moves first 'Y'.
            > > Because vim checks 'dir < 0' for trail byte,
            > > Now cursor moves to ' ' by IsLeadByte().
            > > by 'col += dir', cursor moves to 2nd 'X' (skips first ' ').
            > > by IsLeadByte, cursor moves to second 'Y'.
            > > by col += dir, cursor moves to second ' '.
            > > and break by p[col]==c.
            >
            > I understand the problem now. How about this patch to fix it:

            I understand too. And test Bram's patch, it works very well.
            This bug is a my mistake. Sorry all.
            ----
            Taro Muraoka <koron@...>
          • Sung-Hyun Nam
            ... Your patch yet has the problem that Bram pointed out... I think, col could be advanced beyond of the line for MULTI_BYTE. namsh
            Message 5 of 11 , Mar 24, 2000
            • 0 Attachment
              Park Chong-Dae wrote:
              >
              > On Fri, Mar 24, 2000 at 11:52:16AM +0100, Bram Moolenaar wrote:
              > > The idea is that, when moving forward, when the cursor is
              > > currently on a leading byte, one extra column is skipped to avoid
              > > the trailing byte. When moving backwards, a trailing byte is
              > > detected and skipped (this didn't change).
              >
              > How about this patch?

              Your patch yet has the problem that Bram pointed out...
              I think, 'col' could be advanced beyond of the line for MULTI_BYTE.

              namsh
            • Park Chong-Dae
              ... Oops.. I forgot the problem. However Bram s code is more efficient than mine. Because IsLeadByte() is faster than IsTrailByte(). So I make another patch.
              Message 6 of 11 , Mar 24, 2000
              • 0 Attachment
                On Sat, Mar 25, 2000 at 12:57:35AM +0900, Sung-Hyun Nam wrote:
                > Your patch yet has the problem that Bram pointed out...
                > I think, 'col' could be advanced beyond of the line for MULTI_BYTE.

                Oops.. I forgot the problem.
                However Bram's code is more efficient than mine. Because IsLeadByte() is
                faster than IsTrailByte(). So I make another patch.
                This is REAL multibyte patch for searchc().
                Now, you can search DBCS using f or F correctly.

                Chong-Dae Park
              • Bram Moolenaar
                ... This looks OK to me. But let s hear it from people actually using double-byte characters. ... I think this can be changed to: if (type) { /* backup to
                Message 7 of 11 , Mar 24, 2000
                • 0 Attachment
                  Park Chong-Dae wrote:

                  > Oops.. I forgot the problem.
                  > However Bram's code is more efficient than mine. Because IsLeadByte() is
                  > faster than IsTrailByte(). So I make another patch.
                  > This is REAL multibyte patch for searchc().
                  > Now, you can search DBCS using f or F correctly.

                  This looks OK to me. But let's hear it from people actually using double-byte
                  characters.

                  One simplification that appears to be possible:

                  > if (type)
                  > + {
                  > col -= dir;
                  > +#ifdef MULTI_BYTE
                  > + if (is_dbcs && IsTrailByte(p, &p[col]))
                  > + col -= dir; /* skip multibyte's trail byte */
                  > +#endif
                  > + }

                  I think this can be changed to:

                  if (type)
                  {
                  /* backup to before the character (possibly double-byte) */
                  col -= dir;
                  #ifdef MULTI_BYTE
                  if (char_bytes == 2)
                  col -= dir;
                  #endif
                  }

                  This avoids a IsTrailByte() call, which can be slow. But please check that
                  this is correct.

                  --
                  He was not in the least bit scared to be mashed into a pulp
                  Or to have his eyes gouged out and his elbows broken;
                  To have his kneecaps split and his body burned away
                  And his limbs all hacked and mangled, brave Sir Robin.
                  "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

                  /-/-- Bram Moolenaar --- Bram@... --- http://www.moolenaar.net --\-\
                  \-\-- Vim: http://www.vim.org ---- ICCF Holland: http://www.vim.org/iccf --/-/
                • Park Chong-Dae
                  ... I think it does not work. Suppose this setting. XY AB Guess you are on the first X . When typing the t command, it moves to space first. Now type ==
                  Message 8 of 11 , Mar 25, 2000
                  • 0 Attachment
                    On Fri, Mar 24, 2000 at 09:31:18PM +0100, Bram Moolenaar wrote:
                    > This looks OK to me. But let's hear it from people actually using double-byte
                    > characters.
                    >
                    > One simplification that appears to be possible:
                    >
                    > > if (type)
                    > > + {
                    > > col -= dir;
                    > > +#ifdef MULTI_BYTE
                    > > + if (is_dbcs && IsTrailByte(p, &p[col]))
                    > > + col -= dir; /* skip multibyte's trail byte */
                    > > +#endif
                    > > + }
                    >
                    > I think this can be changed to:
                    >
                    > if (type)
                    > {
                    > /* backup to before the character (possibly double-byte) */
                    > col -= dir;
                    > #ifdef MULTI_BYTE
                    > if (char_bytes == 2)
                    > col -= dir;
                    > #endif
                    > }
                    >
                    > This avoids a IsTrailByte() call, which can be slow. But please check that
                    > this is correct.

                    I think it does not work.

                    Suppose this setting.

                    XY AB

                    Guess you are on the first 'X'.
                    When typing the 't ' command, it moves to space first.
                    Now "type == TURE", it moves to opposite direction(in this case, backward).
                    The cursor goes to Y position, where is the second byte of "XY".
                    (But char_bytes == 1). This occurs in forward search only. In backward search
                    case('T' cmannd cases: the opposite movement is now forward direction),
                    your code would work faster. However this routine is called only once for each
                    call. So I choose the simple case. I think the performance is not important
                    here.

                    For the performance, mb_byte1(), mb_byte2(), and IsTrailByte() should be
                    avoided. In my patch submitted yesterday(screen.c patch for screen_line()),
                    I can remove mb_isbyte1() call. I'll post the patch if the submitted one
                    does not have any problem. (My new patch would be just a performance tuning.)

                    --
                    Chong-Dae Park
                    --
                    Han Solo: That's 'cause droids don't pull people's arms out of their sockets
                    when they lose. Wookiees are known to do that.
                    C-3PO: I see your point, sir. I suggest a new strategy, R2! let the Wookiee
                    win.
                    - from "Star Wars: A New Hope"
                  • Bram Moolenaar
                    ... Thanks for checking. ... Well, when using long lines this can be important anyway. And it s easy to make it like this: if (type) { /* backup to before the
                    Message 9 of 11 , Mar 25, 2000
                    • 0 Attachment
                      Park Chong-Dae wrote:

                      > > This avoids a IsTrailByte() call, which can be slow. But please check that
                      > > this is correct.
                      >
                      > I think it does not work.

                      Thanks for checking.

                      > XY AB
                      >
                      > Guess you are on the first 'X'.
                      > When typing the 't ' command, it moves to space first.
                      > Now "type == TURE", it moves to opposite direction(in this case, backward).
                      > The cursor goes to Y position, where is the second byte of "XY".
                      > (But char_bytes == 1). This occurs in forward search only. In backward search
                      > case('T' cmannd cases: the opposite movement is now forward direction),
                      > your code would work faster. However this routine is called only once for each
                      > call. So I choose the simple case. I think the performance is not important
                      > here.

                      Well, when using long lines this can be important anyway. And it's easy to
                      make it like this:

                      if (type)
                      {
                      /* backup to before the character (possibly double-byte) */
                      col -= dir;
                      #ifdef MULTI_BYTE
                      if (is_dbcs
                      && ((dir < 0 && char_bytes == 2)
                      ||(dir > 0 && IsTrailByte(p, &p[col]))))
                      col -= dir;
                      #endif
                      }

                      Again, please check that this works correctly.

                      > For the performance, mb_byte1(), mb_byte2(), and IsTrailByte() should be
                      > avoided. In my patch submitted yesterday(screen.c patch for screen_line()),
                      > I can remove mb_isbyte1() call. I'll post the patch if the submitted one
                      > does not have any problem. (My new patch would be just a performance tuning.)

                      Performance for screen_line() is very important. It's used for all display
                      updating.

                      --
                      Microsoft's definition of a boolean: TRUE, FALSE, MAYBE
                      "Embrace and extend"...?

                      /-/-- Bram Moolenaar --- Bram@... --- http://www.moolenaar.net --\-\
                      \-\-- Vim: http://www.vim.org ---- ICCF Holland: http://www.vim.org/iccf --/-/
                    • Park Chong-Dae
                      ... It seems work correctly. ... If performance is most important, how about to make IsLeadByte as a MACRO function? In WIN32 or BROKEN_LOCALE, It could be
                      Message 10 of 11 , Mar 25, 2000
                      • 0 Attachment
                        On Sat, Mar 25, 2000 at 07:54:01PM +0100, Bram Moolenaar wrote:
                        > Well, when using long lines this can be important anyway. And it's easy to
                        > make it like this:
                        >
                        > if (type)
                        > {
                        > /* backup to before the character (possibly double-byte) */
                        > col -= dir;
                        > #ifdef MULTI_BYTE
                        > if (is_dbcs
                        > && ((dir < 0 && char_bytes == 2)
                        > ||(dir > 0 && IsTrailByte(p, &p[col]))))
                        > col -= dir;
                        > #endif
                        > }
                        >
                        > Again, please check that this works correctly.

                        It seems work correctly.

                        > > For the performance, mb_byte1(), mb_byte2(), and IsTrailByte() should be
                        > > avoided. In my patch submitted yesterday(screen.c patch for screen_line()),
                        > > I can remove mb_isbyte1() call. I'll post the patch if the submitted one
                        > > does not have any problem. (My new patch would be just a performance tuning.)
                        >
                        > Performance for screen_line() is very important. It's used for all display
                        > updating.

                        If performance is most important, how about to make IsLeadByte as a MACRO
                        function? In WIN32 or BROKEN_LOCALE, It could be substituted as a MACRO.
                        And IsLeadByte is called most frequently in the MULTI_BYTE environment.

                        PS: Is there a safe and efficient way to guess IsLeadByte
                        without calling mblen()?


                        --
                        Chong-Dae Park
                        --
                        "I am a HAL 9000 computer, Production Number 3. I became operational
                        at the HAL Plant in Urbana, Illinois on January 12, 1997."
                        - spoken by HAL in the book 2001: A Space Odyssey, Arthur C. Clarke
                      • Bram Moolenaar
                        ... Would be possible. However, IsLeadByte() only takes one argument, and it does do a bit of work. Thus the overhead of the function call shouldn t be that
                        Message 11 of 11 , Mar 26, 2000
                        • 0 Attachment
                          Park Chong-Dae wrote:

                          > > Performance for screen_line() is very important. It's used for all display
                          > > updating.
                          >
                          > If performance is most important, how about to make IsLeadByte as a MACRO
                          > function? In WIN32 or BROKEN_LOCALE, It could be substituted as a MACRO.
                          > And IsLeadByte is called most frequently in the MULTI_BYTE environment.

                          Would be possible. However, IsLeadByte() only takes one argument, and it does
                          do a bit of work. Thus the overhead of the function call shouldn't be that
                          much. Just try to keep the number of calls to IsLeadByte() low.

                          --
                          FATHER: One day, lad, all this will be yours ...
                          PRINCE: What - the curtains?
                          "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

                          /-/-- Bram Moolenaar --- Bram@... --- http://www.moolenaar.net --\-\
                          \-\-- Vim: http://www.vim.org ---- ICCF Holland: http://www.vim.org/iccf --/-/
                        Your message has been successfully submitted and would be delivered to recipients shortly.