Loading ...
Sorry, an error occurred while loading the content.

[vim-multibyte] Re: multibyte patch

Expand Messages
  • Park Chong-Dae
    ... How about this patch? Chong-Dae Park ... +++ search.c Fri Mar 24 20:11:25 2000 @@ -1090,18 +1090,16 @@ { for (;;) { - if ((col += dir) =
    Message 1 of 11 , Mar 24, 2000
    • 0 Attachment
      On Fri, Mar 24, 2000 at 11:52:16AM +0100, Bram Moolenaar wrote:
      > The idea is that, when moving forward, when the cursor is currently on a
      > leading byte, one extra column is skipped to avoid the trailing byte. When
      > moving backwards, a trailing byte is detected and skipped (this didn't change).

      How about this patch?

      Chong-Dae Park
    • Taro Muraoka
      Hello, ... I understand too. And test Bram s patch, it works very well. This bug is a my mistake. Sorry all. ... Taro Muraoka
      Message 2 of 11 , Mar 24, 2000
      • 0 Attachment
        Hello,

        > Sung-Hyun Nam wrote:
        >
        > > Guess all the char is multibyte char.
        > >
        > > XY XY XY
        > >
        > > Guess cursor is on the first X. Now type 'f '.
        > > by 'col += dir', it moves first 'Y'.
        > > Because vim checks 'dir < 0' for trail byte,
        > > Now cursor moves to ' ' by IsLeadByte().
        > > by 'col += dir', cursor moves to 2nd 'X' (skips first ' ').
        > > by IsLeadByte, cursor moves to second 'Y'.
        > > by col += dir, cursor moves to second ' '.
        > > and break by p[col]==c.
        >
        > I understand the problem now. How about this patch to fix it:

        I understand too. And test Bram's patch, it works very well.
        This bug is a my mistake. Sorry all.
        ----
        Taro Muraoka <koron@...>
      • Park Chong-Dae
        ... Oops.. I forgot the problem. However Bram s code is more efficient than mine. Because IsLeadByte() is faster than IsTrailByte(). So I make another patch.
        Message 3 of 11 , Mar 24, 2000
        • 0 Attachment
          On Sat, Mar 25, 2000 at 12:57:35AM +0900, Sung-Hyun Nam wrote:
          > Your patch yet has the problem that Bram pointed out...
          > I think, 'col' could be advanced beyond of the line for MULTI_BYTE.

          Oops.. I forgot the problem.
          However Bram's code is more efficient than mine. Because IsLeadByte() is
          faster than IsTrailByte(). So I make another patch.
          This is REAL multibyte patch for searchc().
          Now, you can search DBCS using f or F correctly.

          Chong-Dae Park
        • Bram Moolenaar
          ... This looks OK to me. But let s hear it from people actually using double-byte characters. ... I think this can be changed to: if (type) { /* backup to
          Message 4 of 11 , Mar 24, 2000
          • 0 Attachment
            Park Chong-Dae wrote:

            > Oops.. I forgot the problem.
            > However Bram's code is more efficient than mine. Because IsLeadByte() is
            > faster than IsTrailByte(). So I make another patch.
            > This is REAL multibyte patch for searchc().
            > Now, you can search DBCS using f or F correctly.

            This looks OK to me. But let's hear it from people actually using double-byte
            characters.

            One simplification that appears to be possible:

            > if (type)
            > + {
            > col -= dir;
            > +#ifdef MULTI_BYTE
            > + if (is_dbcs && IsTrailByte(p, &p[col]))
            > + col -= dir; /* skip multibyte's trail byte */
            > +#endif
            > + }

            I think this can be changed to:

            if (type)
            {
            /* backup to before the character (possibly double-byte) */
            col -= dir;
            #ifdef MULTI_BYTE
            if (char_bytes == 2)
            col -= dir;
            #endif
            }

            This avoids a IsTrailByte() call, which can be slow. But please check that
            this is correct.

            --
            He was not in the least bit scared to be mashed into a pulp
            Or to have his eyes gouged out and his elbows broken;
            To have his kneecaps split and his body burned away
            And his limbs all hacked and mangled, brave Sir Robin.
            "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

            /-/-- Bram Moolenaar --- Bram@... --- http://www.moolenaar.net --\-\
            \-\-- Vim: http://www.vim.org ---- ICCF Holland: http://www.vim.org/iccf --/-/
          • Park Chong-Dae
            ... I think it does not work. Suppose this setting. XY AB Guess you are on the first X . When typing the t command, it moves to space first. Now type ==
            Message 5 of 11 , Mar 25, 2000
            • 0 Attachment
              On Fri, Mar 24, 2000 at 09:31:18PM +0100, Bram Moolenaar wrote:
              > This looks OK to me. But let's hear it from people actually using double-byte
              > characters.
              >
              > One simplification that appears to be possible:
              >
              > > if (type)
              > > + {
              > > col -= dir;
              > > +#ifdef MULTI_BYTE
              > > + if (is_dbcs && IsTrailByte(p, &p[col]))
              > > + col -= dir; /* skip multibyte's trail byte */
              > > +#endif
              > > + }
              >
              > I think this can be changed to:
              >
              > if (type)
              > {
              > /* backup to before the character (possibly double-byte) */
              > col -= dir;
              > #ifdef MULTI_BYTE
              > if (char_bytes == 2)
              > col -= dir;
              > #endif
              > }
              >
              > This avoids a IsTrailByte() call, which can be slow. But please check that
              > this is correct.

              I think it does not work.

              Suppose this setting.

              XY AB

              Guess you are on the first 'X'.
              When typing the 't ' command, it moves to space first.
              Now "type == TURE", it moves to opposite direction(in this case, backward).
              The cursor goes to Y position, where is the second byte of "XY".
              (But char_bytes == 1). This occurs in forward search only. In backward search
              case('T' cmannd cases: the opposite movement is now forward direction),
              your code would work faster. However this routine is called only once for each
              call. So I choose the simple case. I think the performance is not important
              here.

              For the performance, mb_byte1(), mb_byte2(), and IsTrailByte() should be
              avoided. In my patch submitted yesterday(screen.c patch for screen_line()),
              I can remove mb_isbyte1() call. I'll post the patch if the submitted one
              does not have any problem. (My new patch would be just a performance tuning.)

              --
              Chong-Dae Park
              --
              Han Solo: That's 'cause droids don't pull people's arms out of their sockets
              when they lose. Wookiees are known to do that.
              C-3PO: I see your point, sir. I suggest a new strategy, R2! let the Wookiee
              win.
              - from "Star Wars: A New Hope"
            • Bram Moolenaar
              ... Thanks for checking. ... Well, when using long lines this can be important anyway. And it s easy to make it like this: if (type) { /* backup to before the
              Message 6 of 11 , Mar 25, 2000
              • 0 Attachment
                Park Chong-Dae wrote:

                > > This avoids a IsTrailByte() call, which can be slow. But please check that
                > > this is correct.
                >
                > I think it does not work.

                Thanks for checking.

                > XY AB
                >
                > Guess you are on the first 'X'.
                > When typing the 't ' command, it moves to space first.
                > Now "type == TURE", it moves to opposite direction(in this case, backward).
                > The cursor goes to Y position, where is the second byte of "XY".
                > (But char_bytes == 1). This occurs in forward search only. In backward search
                > case('T' cmannd cases: the opposite movement is now forward direction),
                > your code would work faster. However this routine is called only once for each
                > call. So I choose the simple case. I think the performance is not important
                > here.

                Well, when using long lines this can be important anyway. And it's easy to
                make it like this:

                if (type)
                {
                /* backup to before the character (possibly double-byte) */
                col -= dir;
                #ifdef MULTI_BYTE
                if (is_dbcs
                && ((dir < 0 && char_bytes == 2)
                ||(dir > 0 && IsTrailByte(p, &p[col]))))
                col -= dir;
                #endif
                }

                Again, please check that this works correctly.

                > For the performance, mb_byte1(), mb_byte2(), and IsTrailByte() should be
                > avoided. In my patch submitted yesterday(screen.c patch for screen_line()),
                > I can remove mb_isbyte1() call. I'll post the patch if the submitted one
                > does not have any problem. (My new patch would be just a performance tuning.)

                Performance for screen_line() is very important. It's used for all display
                updating.

                --
                Microsoft's definition of a boolean: TRUE, FALSE, MAYBE
                "Embrace and extend"...?

                /-/-- Bram Moolenaar --- Bram@... --- http://www.moolenaar.net --\-\
                \-\-- Vim: http://www.vim.org ---- ICCF Holland: http://www.vim.org/iccf --/-/
              • Park Chong-Dae
                ... It seems work correctly. ... If performance is most important, how about to make IsLeadByte as a MACRO function? In WIN32 or BROKEN_LOCALE, It could be
                Message 7 of 11 , Mar 25, 2000
                • 0 Attachment
                  On Sat, Mar 25, 2000 at 07:54:01PM +0100, Bram Moolenaar wrote:
                  > Well, when using long lines this can be important anyway. And it's easy to
                  > make it like this:
                  >
                  > if (type)
                  > {
                  > /* backup to before the character (possibly double-byte) */
                  > col -= dir;
                  > #ifdef MULTI_BYTE
                  > if (is_dbcs
                  > && ((dir < 0 && char_bytes == 2)
                  > ||(dir > 0 && IsTrailByte(p, &p[col]))))
                  > col -= dir;
                  > #endif
                  > }
                  >
                  > Again, please check that this works correctly.

                  It seems work correctly.

                  > > For the performance, mb_byte1(), mb_byte2(), and IsTrailByte() should be
                  > > avoided. In my patch submitted yesterday(screen.c patch for screen_line()),
                  > > I can remove mb_isbyte1() call. I'll post the patch if the submitted one
                  > > does not have any problem. (My new patch would be just a performance tuning.)
                  >
                  > Performance for screen_line() is very important. It's used for all display
                  > updating.

                  If performance is most important, how about to make IsLeadByte as a MACRO
                  function? In WIN32 or BROKEN_LOCALE, It could be substituted as a MACRO.
                  And IsLeadByte is called most frequently in the MULTI_BYTE environment.

                  PS: Is there a safe and efficient way to guess IsLeadByte
                  without calling mblen()?


                  --
                  Chong-Dae Park
                  --
                  "I am a HAL 9000 computer, Production Number 3. I became operational
                  at the HAL Plant in Urbana, Illinois on January 12, 1997."
                  - spoken by HAL in the book 2001: A Space Odyssey, Arthur C. Clarke
                • Bram Moolenaar
                  ... Would be possible. However, IsLeadByte() only takes one argument, and it does do a bit of work. Thus the overhead of the function call shouldn t be that
                  Message 8 of 11 , Mar 26, 2000
                  • 0 Attachment
                    Park Chong-Dae wrote:

                    > > Performance for screen_line() is very important. It's used for all display
                    > > updating.
                    >
                    > If performance is most important, how about to make IsLeadByte as a MACRO
                    > function? In WIN32 or BROKEN_LOCALE, It could be substituted as a MACRO.
                    > And IsLeadByte is called most frequently in the MULTI_BYTE environment.

                    Would be possible. However, IsLeadByte() only takes one argument, and it does
                    do a bit of work. Thus the overhead of the function call shouldn't be that
                    much. Just try to keep the number of calls to IsLeadByte() low.

                    --
                    FATHER: One day, lad, all this will be yours ...
                    PRINCE: What - the curtains?
                    "Monty Python and the Holy Grail" PYTHON (MONTY) PICTURES LTD

                    /-/-- Bram Moolenaar --- Bram@... --- http://www.moolenaar.net --\-\
                    \-\-- Vim: http://www.vim.org ---- ICCF Holland: http://www.vim.org/iccf --/-/
                  Your message has been successfully submitted and would be delivered to recipients shortly.