
Re: how to match all Chinese chars?

  • Chris Jones
    Message 1 of 10 , May 30, 2012
      On Mon, May 28, 2012 at 05:55:27PM EDT, Tony Mechelynck wrote:

      [..]

      > However, there is also a limitation in Vim, namely, a collection can
      > only match (IIRC) at most 257 different individual characters at the
      > same point. 4E00..9FFF alone is already much more than that.

      The limit is that a range of characters (a-z, 0-9 etc...) that is part
      of a collection can only match at most 256 characters.

      Here's for instance a valid collection that matches 4096 characters:

      | /[一-仿伀-俿倀-僿儀-凿刀-勿匀-叿吀-哿唀-嗿嘀-囿圀-埿堀-壿夀-姿娀-嫿嬀-寿尀-峿崀-巿]

      Subranges are: 4e00-4eff ... 5d00-5dff - 256 characters each.

      Conversely, the following triggers the "E16: Invalid range" error:

      | /[一-企]

      Range is: 4e00-4f01, i.e. 258 characters.
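
The sizes of the two ranges quoted above can be checked outside Vim. The following is only a rough sketch in Python; `range_size` is a hypothetical helper name, and it says nothing about where exactly Vim draws the limit, it just counts code points.

```python
def range_size(vim_range: str) -> int:
    """Count the code points covered by a collection range written as 'X-Y'."""
    start, end = vim_range.split("-")
    return ord(end) - ord(start) + 1


# Print the end points (in hex) and the size of the two ranges from the message.
for r in ("一-仿", "一-企"):
    start, end = r.split("-")
    print(r, hex(ord(start)), hex(ord(end)), range_size(r))
```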

      I generated a similar collection for the entire 4e00-9fff block, split
      into 256-character sub-ranges, and apart from the regex causing Vim to
      slow down to a crawl on larger files, it appeared to match.
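
For anyone wanting to reproduce this, a collection of that shape could be generated along the following lines. This is a minimal sketch, not the script actually used for the message: the function name `cjk_collection` and its parameters are made up for illustration, and the 256-character chunk size is taken from the limit as described above.

```python
def cjk_collection(first: int = 0x4E00, last: int = 0x9FFF, chunk: int = 256) -> str:
    """Build a [...] collection covering first..last in sub-ranges of at most `chunk` characters."""
    parts = []
    for start in range(first, last + 1, chunk):
        end = min(start + chunk - 1, last)
        # A sub-range of one character needs no dash.
        parts.append(chr(start) if start == end else f"{chr(start)}-{chr(end)}")
    return "[" + "".join(parts) + "]"


pattern = cjk_collection()
# Number of sub-ranges in the generated collection.
print(pattern.count("-"))
# Paste the printed collection after "/" in Vim to try it as a search pattern.
print(pattern)
```

Calling `cjk_collection(0x4E00, 0x5DFF)` gives only the first 16 sub-ranges, which is easier to experiment with on large files. The sketch does nothing about the slowdown mentioned above.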

      All the same, there does not appear to be any simple solution other
      than this clunky workaround.

      Is anything in the works regarding Unicode regex support in a future
      release of Vim (8.x)?

      CJ

      --
      WE GET SIGNAL

      --
      You received this message from the "vim_use" maillist.
      Do not top-post! Type your reply below the text you are replying to.
      For more information, visit http://www.vim.org/maillist.php