
[vim-multibyte] new eval feature for multibyte

  • Taro Muraoka
    Message 1 of 4 , Mar 12, 2000
      Hello, vim-multibyte.

      I have added two new eval functions. They serve as an interface to
      the IsLeadByte and IsTrailByte C functions, and make it easier for
      multibyte users to write scripts that handle multibyte text.

      I made patches for src/eval.c and runtime/doc/eval.txt, plus a
      sample script for Japanese folding.
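
For readers unfamiliar with double-byte encodings, the lead/trail distinction that these functions expose can be sketched in C as follows. This is only an illustration, assuming the Shift_JIS byte ranges; the names sjis_is_lead_byte and sjis_is_trail_byte are hypothetical and are not the IsLeadByte/IsTrailByte code used by Vim.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: nonzero if c can start a two-byte Shift_JIS
 * character (lead-byte ranges 0x81-0x9F and 0xE0-0xFC). */
static int sjis_is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical helper: nonzero if s[index] is the second byte of a
 * two-byte character.  The string is walked from its start, one
 * character at a time, until the position is reached. */
static int sjis_is_trail_byte(const char *s, size_t index)
{
    size_t i = 0;

    assert(s != NULL);
    while (s[i] != '\0' && i < index)
    {
        if (sjis_is_lead_byte((unsigned char)s[i]) && s[i + 1] != '\0')
        {
            if (i + 1 == index)
                return 1;       /* index is the byte right after a lead byte */
            i += 2;             /* skip the whole two-byte character */
        }
        else
            i += 1;             /* single-byte character */
    }
    return 0;
}
```

A trail byte cannot be recognized from its value alone, because Shift_JIS trail-byte values overlap both ASCII and the lead-byte ranges. That is why the second helper scans from the beginning of the string instead of inspecting a single byte.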
      ----
      Taro Muraoka <koron@...>
    • Taro Muraoka
      Message 2 of 4 , Mar 12, 2000
        > Hello, vim-multibyte.
        >
        > I have added two new eval functions. They serve as an interface to
        > the IsLeadByte and IsTrailByte C functions, and make it easier for
        > multibyte users to write scripts that handle multibyte text.
        >
        > I made patches for src/eval.c and runtime/doc/eval.txt, plus a
        > sample script for Japanese folding.
        > ----
        > Taro Muraoka <koron@...>

        Sorry, my previous message was sent without the patch.
        ----
        Taro Muraoka <koron@...>



        Explanation: Eval interface to the IsLeadByte and IsTrailByte functions. These
        make it easier for multibyte users to write scripts that handle multibyte text.
        Files: src/eval.c runtime/doc/eval.txt runtime/macros/jfold.vim
        --------

        *** ./src.orig/eval.c Thu Jan 13 05:14:54 2000
        --- ./src/eval.c Wed Feb 09 14:24:48 2000
        ***************
        *** 180,185 ****
        --- 180,187 ----
        static void f_hlID __ARGS((VAR argvars, VAR retvar));
        static void f_hostname __ARGS((VAR argvars, VAR retvar));
        static void f_isdirectory __ARGS((VAR argvars, VAR retvar));
        + static void f_isleadbyte __ARGS((VAR argvars, VAR retvar));
        + static void f_istrailbyte __ARGS((VAR argvars, VAR retvar));
        static void f_input __ARGS((VAR argvars, VAR retvar));
        static void f_last_buffer_nr __ARGS((VAR argvars, VAR retvar));
        static void f_libcall __ARGS((VAR argvars, VAR retvar));
        ***************
        *** 1870,1875 ****
        --- 1872,1879 ----
        {"hostname", 0, 0, f_hostname},
        {"input", 1, 1, f_input},
        {"isdirectory", 1, 1, f_isdirectory},
        + {"isleadbyte", 1, 2, f_isleadbyte},
        + {"istrailbyte", 1, 2, f_istrailbyte},
        {"last_buffer_nr", 0, 0, f_last_buffer_nr},/* obsolete */
        {"libcall", 3, 3, f_libcall},
        {"line", 1, 1, f_line},
        ***************
        *** 3335,3340 ****
        --- 3339,3407 ----
              VAR retvar;
          {
              retvar->var_val.var_number = mch_isdir(get_var_string(&argvars[0]));
        + }
        +
        + /*
        +  * "isleadbyte()" function
        +  */
        +     static void
        + f_isleadbyte(argvars, retvar)
        +     VAR argvars;
        +     VAR retvar;
        + {
        + #ifdef MULTI_BYTE
        +     char *p;
        +     int len, offset = 0;
        +
        +     if (!is_dbcs)
        +     {
        +         retvar->var_val.var_number = 0;
        +         return;
        +     }
        +     p = get_var_string(&argvars[0]);
        +     len = STRLEN(p) - 1;
        +     if (argvars[1].var_type != VAR_UNKNOWN)
        +         offset = (int)get_var_number(&argvars[1]);
        +     if (offset > len)
        +         offset = len;
        +     if (offset < 0)  /* also covers the empty string, where len is -1 */
        +         offset = 0;
        +     retvar->var_val.var_number =
        +         IsLeadByte(get_var_string(&argvars[0])[offset]);
        + #else
        +     retvar->var_val.var_number = 0;
        + #endif
        + }
        +
        + /*
        +  * "istrailbyte()" function
        +  */
        +     static void
        + f_istrailbyte(argvars, retvar)
        +     VAR argvars;
        +     VAR retvar;
        + {
        + #ifdef MULTI_BYTE
        +     char *p;
        +     int len, offset;
        +
        +     if (!is_dbcs)
        +     {
        +         retvar->var_val.var_number = 0;
        +         return;
        +     }
        +     p = get_var_string(&argvars[0]);
        +     len = offset = STRLEN(p) - 1;
        +     if (argvars[1].var_type != VAR_UNKNOWN)
        +         offset = (int)get_var_number(&argvars[1]);
        +     if (offset > len)
        +         offset = len;
        +     if (offset < 0)  /* also covers the empty string, where len is -1 */
        +         offset = 0;
        +     retvar->var_val.var_number = IsTrailByte(p, p + offset);
        + #else
        +     retvar->var_val.var_number = 0;
        + #endif
          }

          /*
        *** runtime/doc.orig/eval.txt Sun Jan 16 22:12:54 2000
        --- runtime/doc/eval.txt Tue Feb 22 16:59:06 2000
        ***************
        *** 478,483 ****
        --- 478,485 ----
        hostname() String name of the machine vim is running on
        input( {prompt}) String get input from the user
        isdirectory( {directory}) Number TRUE if {directory} is a directory
        + isleadbyte( {str} [, {index}]) Number TRUE if byte {index} of {str} is a lead byte
        + istrailbyte( {str} [, {index}]) Number TRUE if byte {index} of {str} is a trail byte
        libcall( {lib}, {func}, {arg} String call {func} in library {lib}
        line( {expr}) Number line nr of cursor, last line or mark
        line2byte( {lnum}) Number byte count of line {lnum}
        ***************
        *** 986,991 ****
        --- 988,1015 ----
        the name {directory} exists. If {directory} doesn't exist, or
        isn't a directory, the result is FALSE. {directory} is any
        expression, which is used as a String.
        +
        + *isleadbyte()*
        + isleadbyte({str} [,{index}])
        + The result is a Number, which is TRUE when the byte at
        + {index} in {str} is the lead byte of a multibyte
        + character. If the byte is not a lead byte, the result is
        + FALSE. If 'fileencoding' is not a DBCS encoding, the
        + result is always FALSE. This test uses the IsLeadByte() function.
        + When {index} is omitted, the first byte of {str} is
        + tested.
        + {only with |+multi_byte|, otherwise always returns FALSE}
        +
        + *istrailbyte()*
        + istrailbyte({str} [,{index}])
        + The result is a Number, which is TRUE when the byte at
        + {index} in {str} is the trail byte of a multibyte
        + character. If the byte is not a trail byte, the result is
        + FALSE. If 'fileencoding' is not a DBCS encoding, the
        + result is always FALSE. This test uses the IsTrailByte()
        + function. When {index} is omitted, the last byte of
        + {str} is tested.
        + {only with |+multi_byte|, otherwise always returns FALSE}

        *libcall()*
        libcall({libname}, {funcname}, {argument})
        diff -crN runtime/macros.orig/jfold.vim runtime/macros/jfold.vim
        *** runtime/macros.orig/jfold.vim Thu Jan 01 09:00:00 1970
        --- runtime/macros/jfold.vim Thu Feb 10 09:29:06 2000
        ***************
        *** 0 ****
        --- 1,144 ----
        + " jfold.vim - Japanese folding script
        + "
        + " Maintainer: Taro Muraoka <koron@...>
        + " Last change: 09:29:05 10-Feb-2000.
        + "
        + " ':source jfold.vim' and in VISUAL MODE type 'x'... folding!!
        +
        + let $jf_no_top = ')]},.'
        + let $jf_no_top = $jf_no_top.')〕]}〉》」』】'
        + let $jf_no_top = $jf_no_top.'ぁぃぅぇぉっゃゅょァィゥェォッャュョー'
        + let $jf_no_top = $jf_no_top.'。、,.'
        + "let $jf_no_top = $jf_no_top.''
        + let $jf_no_end = '([{'
        + let $jf_no_end = $jf_no_end.'(〔[{〈《「『【'
        + "let $jf_no_end = $jf_no_end.''
        + let $jf_wordtop = "^[0-9a-zA-Z_(\\[{'\"]"
        + let $jf_wordend = "[0-9a-zA-Z_)\\]}'\",.?!]$"
        + let $jf_loop = 3
        +
        + "
        + " JapaneseFoldTabooRule(str, len)
        + "   Return the length the line must have after applying the taboo rules.
        + "
        + func! JapaneseFoldTabooRule(str, len)
        +   let len = a:len
        +   let flag = 0
        +   let i = 0
        +   while i < $jf_loop
        +     " no-end rule: these characters must not end a line
        +     let ch = MultibyteGetChar(a:str, len - 1)
        +     if flag || $jf_no_end =~ ch
        +       let flag = 0
        +       let len = len - strlen(ch)
        +       if len < 1
        +         let len = 1
        +         return len
        +       endif
        +     endif
        +     " no-top rule: these characters must not start a line
        +     let ch = MultibyteGetChar(a:str, len)
        +     if $jf_no_top =~ ch
        +       if len <= &textwidth
        +         let len = len + strlen(ch)
        +       else
        +         let flag = 1
        +       endif
        +     endif
        +     let i = i + 1
        +   endw
        +   return len
        + endfunc
        +
        + "
        + " MultibyteGetChar(str, offset)
        + "   Get a (single-byte or multibyte) character from str[offset]. The offset
        + "   may point at either the lead or the trail byte of a multibyte character.
        + "
        + func! MultibyteGetChar(str, offset)
        +   " does offset point at a trail byte?
        +   if a:offset > 0 && istrailbyte(a:str, a:offset)
        +     return strpart(a:str, a:offset - 1, 2)
        +   " or at a lead byte?
        +   elseif isleadbyte(a:str, a:offset)
        +     return strpart(a:str, a:offset, 2)
        +   else
        +     return strpart(a:str, a:offset, 1)
        +   endif
        + endfunc
        +
        + "
        + " MultibyteFold(startline, endline)
        + "   Fold the text between startline and endline using the configured
        + "   folding rule. The default rule is defined by JapaneseFoldTabooRule, so
        + "   this works as Japanese folding.
        + "
        + func! MultibyteFold(startline, endline)
        +   " check text width
        +   if &textwidth <= 0
        +     echohl ErrorMsg
        +     echo 'Invalid textwidth (='.&textwidth.')'
        +     echohl None
        +     return
        +   endif
        +   " check fold range
        +   if a:startline > a:endline
        +     echohl ErrorMsg
        +     echo 'Invalid range'
        +     echohl None | return
        +   endif
        +   " get the visually selected text as a single-line string
        +   let text = ''
        +   let i = a:startline
        +   while i <= a:endline
        +     let str = getline(i)
        +     " remove leading and trailing white space
        +     let str = substitute(str, '^\s\+', '', '')
        +     let str = substitute(str, '\s\+$', '', '')
        +     " if both sides of the join look like words, separate them with a space
        +     if str =~ $jf_wordtop && text =~ $jf_wordend
        +       let str = " " . str
        +     endif
        +     let text = text . str
        +     let i = i + 1
        +   endw
        +   " remove the selected area and insert the newly folded text
        +   execute 'normal :'.a:startline.','.a:endline."d\<CR>"
        +   let len = 0
        +   let i = a:startline - 1
        +   while len < strlen(text)
        +     if len >= &textwidth
        +       " English word folding: do not cut in the middle of a word
        +       let ch = MultibyteGetChar(text, len - 1)
        +       if ch =~ '^\w$' && MultibyteGetChar(text, len) =~ '^\w$'
        +         while MultibyteGetChar(text, len - 1) =~ '^\w$'
        +           let len = len - 1
        +         endw
        +       endif
        +       " apply the extra folding rule
        +       let len = JapaneseFoldTabooRule(text, len)
        +       " insert one new line without trailing white space
        +       call append(i,
        +             \substitute(strpart(text, 0, len), '\s\+$', '', ''))
        +       let i = i + 1
        +       let text = strpart(text, len, strlen(text))
        +       " do not start the new line with white space
        +       let text = substitute(text, '^\s\+', '', '')
        +       let len = 0
        +     endif
        +     " advance by one character (not one byte)
        +     if isleadbyte(text, len)
        +       let len = len + 2
        +     else
        +       let len = len + 1
        +     endif
        +   endw
        +   if len > 0
        +     " insert the remaining text
        +     call append(i, text)
        +   endif
        + endfunc
        +
        + " hot key is 'x' in VISUAL MODE
        + vnoremap x <ESC>:call MultibyteFold(line("'<"), line("'>"))<CR>
        + " vi:ts=8 sts=2 sw=2 tw=0
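
The central step of the folding script above is advancing one character (not one byte) at a time, so that a line is never cut between a lead byte and its trail byte. The same idea can be sketched in C as follows. This is a minimal sketch, assuming Shift_JIS lead-byte ranges; sjis_is_lead_byte and sjis_cut_offset are hypothetical names. Unlike jfold.vim, which cuts at the first character boundary at or past 'textwidth', this variant never exceeds the given width, and it does not implement the taboo rules.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if c can start a two-byte Shift_JIS
 * character (lead-byte ranges 0x81-0x9F and 0xE0-0xFC). */
static int sjis_is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/* Hypothetical helper: return the largest byte offset <= width at which
 * text can be cut without splitting a two-byte character. */
static size_t sjis_cut_offset(const char *text, size_t width)
{
    size_t len;
    size_t pos = 0;

    assert(text != NULL);
    len = strlen(text);
    while (pos < len)
    {
        /* A lead byte followed by another byte forms one two-byte character. */
        size_t step = (sjis_is_lead_byte((unsigned char)text[pos])
                       && pos + 1 < len) ? 2 : 1;

        if (pos + step > width)
            break;              /* the next character would not fit */
        pos += step;
    }
    return pos;
}
```

For a string of two double-byte characters followed by one ASCII letter, a width of 3 yields a cut offset of 2, because the second double-byte character would otherwise be split.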
      • Bram Moolenaar
        Message 3 of 4 , Mar 20, 2000
          Taro Muraoka wrote:

          > I have added two new eval functions. They serve as an interface to
          > the IsLeadByte and IsTrailByte C functions, and make it easier for
          > multibyte users to write scripts that handle multibyte text.

          Although these functions are probably useful, I would like to make them more
          generic. In Vim version 6.0 more multi-byte encodings will be supported,
          which use more than two bytes for a character. UTF-8 uses a variable number
          of bytes for a character.

          What would be a more generic function? Perhaps one function that returns the
          index of a multi-byte character:
          0 if not on a multi-byte character
          1 if on the first byte of a multi-byte character
          2 if on the second byte of a multi-byte character
          etc.

          Would that be useful?
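
The proposed return values (0, 1, 2, ...) can be illustrated for UTF-8 with a short C sketch. This is only one possible reading of the proposal, not an actual Vim function: the names utf8_seq_len and utf8_byte_index are hypothetical, and the sketch treats stray or truncated sequences as single bytes without checking for overlong encodings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of the UTF-8 sequence introduced by lead
 * byte c; 1 for ASCII, 0 for a continuation byte or an invalid lead. */
static size_t utf8_seq_len(unsigned char c)
{
    if (c < 0x80)
        return 1;
    if ((c & 0xE0) == 0xC0)
        return 2;
    if ((c & 0xF0) == 0xE0)
        return 3;
    if ((c & 0xF8) == 0xF0)
        return 4;
    return 0;
}

/* Hypothetical helper: 0 if s[index] is not part of a multi-byte
 * character, otherwise the 1-based position of that byte within its
 * character (1 = first byte, 2 = second byte, ...). */
static int utf8_byte_index(const char *s, size_t index)
{
    size_t i = 0;

    assert(s != NULL);
    while (s[i] != '\0')
    {
        size_t n = utf8_seq_len((unsigned char)s[i]);
        size_t k = 1;

        /* Count the continuation bytes (10xxxxxx) that actually follow. */
        while (k < n && ((unsigned char)s[i + k] & 0xC0) == 0x80)
            k++;
        if (n < 2 || k < n)
        {
            /* ASCII, stray continuation byte, or truncated sequence. */
            if (i == index)
                return 0;
            i += 1;
            continue;
        }
        if (index < i + n)
            return (int)(index - i) + 1;
        i += n;
    }
    return 0;                   /* index is past the end of the string */
}
```

With a single function of this shape, a script can test `== 1` for a lead byte and `> 1` for any trailing byte, which covers what the two DBCS-specific functions do while also handling characters longer than two bytes.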

          --
          hundred-and-one symptoms of being an internet addict:
          189. You put your e-mail address in the upper left-hand corner of envelopes.

          /-/-- Bram Moolenaar --- Bram@... --- http://www.moolenaar.net --\-\
          \-\-- Vim: http://www.vim.org ---- ICCF Holland: http://www.vim.org/iccf --/-/
        • Taro Muraoka
          Message 4 of 4 , Mar 20, 2000
            Bram Moolenaar:

            > What would be a more generic function? Perhaps one function that returns the
            > index of a multi-byte character:
            > 0 if not on a multi-byte character
            > 1 if on the first byte of a multi-byte character
            > 2 if on the second byte of a multi-byte character
            > etc.
            >
            > Would that be useful?

            Yes, that seems useful.
            ----
            Taro Muraoka koron@...