
[vim-multibyte] Re: new eval feature for multibyte

  • Taro Muraoka
    Message 1 of 4 , Mar 12, 2000
      > Hello, vim-multibyte.
      >
      > I have added two new eval functions. They work as interfaces to
      > the IsLeadByte and IsTrailByte C functions, and make it easier for
      > multibyte users to write scripts that handle multibyte text.
      >
      > I made patches for src/eval.c and runtime/doc/eval.txt, plus a
      > sample script for Japanese folding.
      > ----
      > Taro Muraoka <koron@...>

      Sorry, I made a mistake when sending the previous message.
      ----
      Taro Muraoka <koron@...>



      Explanation: Eval interface for the IsLeadByte and IsTrailByte functions.
      These make it easier for multibyte users to write scripts that handle
      multibyte text.
      Files: src/eval.c runtime/doc/eval.txt runtime/macros/jfold.vim
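
For readers new to double-byte encodings, here is a minimal sketch of what a lead-byte test and a trail-byte test involve. It assumes Shift_JIS (CP932), whose lead bytes fall in 0x81-0x9F and 0xE0-0xFC. The names sjis_is_lead_byte and sjis_is_trail_byte are hypothetical and this is not Vim's IsLeadByte/IsTrailByte code. The point it illustrates: trail bytes (0x40-0x7E, 0x80-0xFC) overlap both ASCII and the lead-byte ranges, so a trail byte can only be recognised by scanning from the start of the string. That is presumably why the patch below calls IsTrailByte(p, p + offset) with both pointers.

```c
#include <stddef.h>

/* Assumption: Shift_JIS (CP932) lead-byte ranges. Hypothetical helper,
 * not the IsLeadByte() used by the patch. */
int sjis_is_lead_byte(unsigned char c)
{
    return (c >= 0x81 && c <= 0x9F) || (c >= 0xE0 && c <= 0xFC);
}

/*
 * Return 1 if s[offset] is the second byte of a double-byte character,
 * 0 otherwise.  The byte value alone is ambiguous, so walk the string
 * from its first byte, one whole character at a time.
 */
int sjis_is_trail_byte(const char *s, size_t offset)
{
    size_t i = 0;

    while (i < offset && s[i] != '\0')
    {
        if (sjis_is_lead_byte((unsigned char)s[i]) && s[i + 1] != '\0')
        {
            if (i + 1 == offset)
                return 1;       /* offset is the byte right after a lead byte */
            i += 2;             /* skip the whole double-byte character */
        }
        else
            i += 1;             /* single-byte character */
    }
    return 0;
}
```

With the three-byte string 0x83 0x81 'A', the byte at offset 1 has a value that is also a valid lead byte, yet sjis_is_trail_byte() reports it as a trail byte because the scan knows that offset 0 started a character.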
      --------

      *** ./src.orig/eval.c Thu Jan 13 05:14:54 2000
      --- ./src/eval.c Wed Feb 09 14:24:48 2000
      ***************
      *** 180,185 ****
      --- 180,187 ----
      static void f_hlID __ARGS((VAR argvars, VAR retvar));
      static void f_hostname __ARGS((VAR argvars, VAR retvar));
      static void f_isdirectory __ARGS((VAR argvars, VAR retvar));
      + static void f_isleadbyte __ARGS((VAR argvars, VAR retvar));
      + static void f_istrailbyte __ARGS((VAR argvars, VAR retvar));
      static void f_input __ARGS((VAR argvars, VAR retvar));
      static void f_last_buffer_nr __ARGS((VAR argvars, VAR retvar));
      static void f_libcall __ARGS((VAR argvars, VAR retvar));
      ***************
      *** 1870,1875 ****
      --- 1872,1879 ----
      {"hostname", 0, 0, f_hostname},
      {"input", 1, 1, f_input},
      {"isdirectory", 1, 1, f_isdirectory},
      + {"isleadbyte", 1, 2, f_isleadbyte},
      + {"istrailbyte", 1, 2, f_istrailbyte},
      {"last_buffer_nr", 0, 0, f_last_buffer_nr},/* obsolete */
      {"libcall", 3, 3, f_libcall},
      {"line", 1, 1, f_line},
      ***************
      *** 3335,3340 ****
      --- 3339,3407 ----
      VAR retvar;
      {
      retvar->var_val.var_number = mch_isdir(get_var_string(&argvars[0]));
      + }
      +
      + /*
      + * "isleadbyte()" function
      + */
      + static void
      + f_isleadbyte(argvars, retvar)
      + VAR argvars;
      + VAR retvar;
      + {
      + #ifdef MULTI_BYTE
      + char *p;
      + int len, offset = 0;
      +
      + if (!is_dbcs)
      + {
      + retvar->var_val.var_number = 0;
      + return;
      + }
      + p = get_var_string(&argvars[0]);
      + len = STRLEN(p) - 1;
      + if (argvars[1].var_type != VAR_UNKNOWN)
      + offset = (int)get_var_number(&argvars[1]);
      + if (offset > len)
      + offset = len;
      + if (offset < 0) /* not "else if": len is -1 for an empty string */
      + offset = 0;
      + retvar->var_val.var_number =
      + IsLeadByte(get_var_string(&argvars[0])[offset]);
      + #else
      + retvar->var_val.var_number = 0;
      + #endif
      + }
      +
      + /*
      + * "istrailbyte()" function
      + */
      + static void
      + f_istrailbyte(argvars, retvar)
      + VAR argvars;
      + VAR retvar;
      + {
      + #ifdef MULTI_BYTE
      + char *p;
      + int len, offset;
      +
      + if (!is_dbcs)
      + {
      + retvar->var_val.var_number = 0;
      + return;
      + }
      + p = get_var_string(&argvars[0]);
      + len = offset = STRLEN(p) - 1;
      + if (argvars[1].var_type != VAR_UNKNOWN)
      + offset = (int)get_var_number(&argvars[1]);
      + if (offset > len)
      + offset = len;
      + if (offset < 0) /* not "else if": len is -1 for an empty string */
      + offset = 0;
      + retvar->var_val.var_number = IsTrailByte(p, p + offset);
      + #else
      + retvar->var_val.var_number = 0;
      + #endif
      }

      /*
      *** runtime/doc.orig/eval.txt Sun Jan 16 22:12:54 2000
      --- runtime/doc/eval.txt Tue Feb 22 16:59:06 2000
      ***************
      *** 478,483 ****
      --- 478,485 ----
      hostname() String name of the machine vim is running on
      input( {prompt}) String get input from the user
      isdirectory( {directory}) Number TRUE if {directory} is a directory
      + isleadbyte( {str} [, {index}]) Number TRUE if byte {index} of {str} is a lead byte
      + istrailbyte( {str} [, {index}]) Number TRUE if byte {index} of {str} is a trail byte
      libcall( {lib}, {func}, {arg} String call {func} in library {lib}
      line( {expr}) Number line nr of cursor, last line or mark
      line2byte( {lnum}) Number byte count of line {lnum}
      ***************
      *** 986,991 ****
      --- 988,1015 ----
      the name {directory} exists. If {directory} doesn't exist, or
      isn't a directory, the result is FALSE. {directory} is any
      expression, which is used as a String.
      +
      + *isleadbyte()*
      + isleadbyte({str} [,{index}])
      + The result is a Number, which is TRUE when the byte at
      + {index} in {str} is the lead byte of a multibyte
      + character. If the byte is not a lead byte, the
      + result is FALSE. If 'fileencoding' is not a DBCS encoding, the
      + result is always FALSE. This test uses the IsLeadByte() function.
      + When {index} is omitted, the first byte of {str} is
      + tested.
      + {only with |+multi_byte|, otherwise always returns FALSE}
      +
      + *istrailbyte()*
      + istrailbyte({str} [,{index}])
      + The result is a Number, which is TRUE when the byte at
      + {index} in {str} is the trail byte of a multibyte
      + character. If the byte is not a trail byte, the
      + result is FALSE. If 'fileencoding' is not a DBCS encoding, the
      + result is always FALSE. This test uses the IsTrailByte()
      + function. When {index} is omitted, the last byte of
      + {str} is tested.
      + {only with |+multi_byte|, otherwise always returns FALSE}

      *libcall()*
      libcall({libname}, {funcname}, {argument})
      diff -crN runtime/macros.orig/jfold.vim runtime/macros/jfold.vim
      *** runtime/macros.orig/jfold.vim Thu Jan 01 09:00:00 1970
      --- runtime/macros/jfold.vim Thu Feb 10 09:29:06 2000
      ***************
      *** 0 ****
      --- 1,144 ----
      + " jfold.vim - Japanese folding script
      + "
      + " Maintainer: Taro Muraoka <koron@...>
      + " Last change: 09:29:05 10-Feb-2000.
      + "
      + " ':source jfold.vim' and in VISUAL MODE type 'x'... folding!!
      +
      + let $jf_no_top = ')]},.'
      + let $jf_no_top = $jf_no_top.')〕]}〉》」』】'
      + let $jf_no_top = $jf_no_top.'ぁぃぅぇぉっゃゅょァィゥェォッャュョー'
      + let $jf_no_top = $jf_no_top.'。、,.'
      + "let $jf_no_top = $jf_no_top.''
      + let $jf_no_end = '([{'
      + let $jf_no_end = $jf_no_end.'(〔[{〈《「『【'
      + "let $jf_no_end = $jf_no_end.''
      + let $jf_wordtop = "^[0-9a-zA-Z_(\\[{'\"]"
      + let $jf_wordend = "[0-9a-zA-Z_)\\]}'\",.?!]$"
      + let $jf_loop = 3
      +
      + "
      + " JapaneseFoldTabooRule(str, len)
      + " Return the adjusted length (in bytes) at which the line must be broken.
      + "
      + func! JapaneseFoldTabooRule(str, len)
      + let len = a:len
      + let flag = 0
      + let i = 0
      + while i < $jf_loop
      + " no end rule
      + let ch = MultibyteGetChar(a:str, len - 1)
      + if flag || $jf_no_end =~ ch
      + let flag = 0
      + let len = len - strlen(ch)
      + if len < 1
      + let len = 1
      + return len
      + endif
      + endif
      + " no top rule
      + let ch = MultibyteGetChar(a:str, len)
      + if $jf_no_top =~ ch
      + if len <= &textwidth
      + let len = len + strlen(ch)
      + else
      + let flag = 1
      + endif
      + endif
      + let i = i + 1
      + endw
      + return len
      + endfunc
      +
      + "
      + " MultibyteGetChar(str, offset)
      + " Get a (single or multibyte) character from str[offset]. Offset may point
      + " at either the lead or the trail byte of a multibyte character.
      + "
      + func! MultibyteGetChar(str, offset)
      + " does offset point trailbyte?
      + if a:offset > 0 && istrailbyte(a:str, a:offset)
      + return strpart(a:str, a:offset - 1, 2)
      + " or multibyte?
      + elseif isleadbyte(a:str, a:offset)
      + return strpart(a:str, a:offset, 2)
      + else
      + return strpart(a:str, a:offset, 1)
      + endif
      + endfunc
      +
      + "
      + " MultibyteFold(startline, endline)
      + " Fold the text between startline and endline using the configured
      + " folding rule. The default rule is JapaneseFoldTabooRule, so this
      + " works as Japanese folding.
      + "
      + func! MultibyteFold(startline, endline)
      + " check text width
      + if &textwidth <= 0
      + echohl ErrorMessage
      + echo 'Invalid textwidth (='.&textwidth.')'
      + echohl None
      + return
      + endif
      + " check fold range
      + if a:startline > a:endline
      + echohl ErrorMessage
      + echo 'Invalid range'
      + echohl None
      + endif
      + " get visual selected text as single line string.
      + let text = ''
      + let i = a:startline
      + while i <= a:endline
      + let str = getline(i)
      + " remove lead and trail white spaces
      + let str = substitute(str, '^\s\+', '', '')
      + let str = substitute(str, '\s\+$', '', '')
      + " if it seems word, add a white space
      + if str =~ $jf_wordtop && text =~ $jf_wordend
      + let str = " " . str
      + endif
      + let text = text . str
      + let i = i + 1
      + endw
      + " remove selected area and insert new folded text
      + execute 'normal :'.a:startline.','.a:endline."d\<CR>"
      + let len = 0
      + let i = a:startline - 1
      + while len < strlen(text)
      + if len >= &textwidth
      + " english word folding
      + let ch = MultibyteGetChar(text, len - 1)
      + if ch =~ '^\w$' && MultibyteGetChar(text, len) =~ '^\w$'
      + while MultibyteGetChar(text, len - 1) =~ '^\w$'
      + let len = len - 1
      + endw
      + endif
      + " this function call applies the extra folding rule
      + let len = JapaneseFoldTabooRule(text, len)
      + " insert new one line without any trail white spaces
      + call append(i,
      + \substitute(strpart(text, 0, len), '\s\+$', '', ''))
      + let i = i + 1
      + let text = strpart(text, len, strlen(text))
      + " do not lead any white spaces in new line
      + let text = substitute(text, '^\s\+', '', '')
      + let len = 0
      + endif
      + " 1 letter (not 1 byte) next
      + if isleadbyte(text, len)
      + let len = len + 2
      + else
      + let len = len + 1
      + endif
      + endw
      + if len > 0
      + " insert the remaining text
      + call append(i, text)
      + endif
      + endfunc
      +
      + " hot key is 'x' in VISUAL MODE
      + vnoremap x <ESC>:call MultibyteFold(line("'<"), line("'>"))<CR>
      + " vi:ts=8 sts=2 sw=2 tw=0
    • Bram Moolenaar
      Message 2 of 4 , Mar 20, 2000
        Taro Muraoka wrote:

        > I have added two new eval functions. They work as interfaces to
        > the IsLeadByte and IsTrailByte C functions, and make it easier for
        > multibyte users to write scripts that handle multibyte text.

        Although these functions are probably useful, I would like to make them more
        generic. In Vim version 6.0 more multi-byte encodings will be supported,
        which use more than two bytes for a character. UTF-8 uses a variable number
        of bytes for a character.

        What would be a more generic function? Perhaps one function that returns the
        index of a multi-byte character:
        0 if not on a multi-byte character
        1 if on the first byte of a multi-byte character
        2 if on the second byte of a multi-byte character
        etc.

        Would that be useful?
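
The numbering proposed above (0 for a single-byte character, otherwise the 1-based position of the byte inside its character) could be sketched for UTF-8 roughly as follows. This is only an illustration under assumptions: utf8_byte_index is a hypothetical name and not a function in Vim, the input is taken to be well-formed UTF-8, and the caller must pass an offset smaller than the string length.

```c
#include <stddef.h>

/*
 * Hypothetical sketch: return 0 if s[offset] is a single-byte (ASCII)
 * character, otherwise the 1-based position of that byte within its
 * multi-byte UTF-8 sequence.  Assumes well-formed UTF-8 and
 * offset < strlen(s); no validation is attempted.
 */
int utf8_byte_index(const char *s, size_t offset)
{
    const unsigned char *u = (const unsigned char *)s;
    size_t back = 0;

    if (u[offset] < 0x80)
        return 0;               /* ASCII: not on a multi-byte character */

    /* Step back over continuation bytes (10xxxxxx) to reach the lead byte. */
    while (back < offset && (u[offset - back] & 0xC0) == 0x80)
        back++;

    return (int)back + 1;       /* the lead byte itself is position 1 */
}
```

For the string 'a', 0xE6 0x97 0xA5, 'b' (the middle three bytes encode one character), offsets 0 through 4 give 0, 1, 2, 3, 0. A single return value like this covers both the lead-byte test (result is 1) and the trail-byte test (result is greater than 1), and is not limited to two-byte encodings.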

        --
        hundred-and-one symptoms of being an internet addict:
        189. You put your e-mail address in the upper left-hand corner of envelopes.

        /-/-- Bram Moolenaar --- Bram@... --- http://www.moolenaar.net --\-\
        \-\-- Vim: http://www.vim.org ---- ICCF Holland: http://www.vim.org/iccf --/-/
      • Taro Muraoka
        Message 3 of 4 , Mar 20, 2000
          Bram Moolenaar:

          > What would be a more generic function? Perhaps one function that returns the
          > index of a multi-byte character:
          > 0 if not on a multi-byte character
          > 1 if on the first byte of a multi-byte character
          > 2 if on the second byte of a multi-byte character
          > etc.
          >
          > Would that be useful?

          Yes, that seems useful.
          ----
          Taro Muraoka koron@...