2025 Re: "breakat" non-English chars when set linebreak and wrap

  • Bram Moolenaar
    Aug 27, 2005
      Yao G. Zhan wrote:

      > I have quite a few text files that mix English and non-English
      > chars such as Chinese. Usually they are documents with very long
      > lines, where every line is a paragraph in itself. So I use "set wrap".
      > For English text, I prefer "set linebreak" so that a word is not broken
      > at the end of a screen line. But VIM doesn't work as I expect when
      > breaking the line at the chars specified in "breakat", especially with
      > Chinese text, where each character is a word on its own. For example:
      >
      > set linebreak
      > set wrap
      >
      > Now I have this text in one long line (I'll use X to represent a single
      > Chinese char in case you can't display it.)
      >
      > English begins. English ends. Chinese begins.XXXXXXXXX.
      >
      > Then I resize the window a bit narrower. This line should wrap like:
      >
      > English begins. English ends. Chinese begins.XXXXX
      > XXXX.
      >
      > This is because each Chinese char is a word on its own. I expect VIM to
      > break at Chinese chars as well as "breakat". But actually VIM wraps it
      > like:
      >
      > English begins. English ends. Chinese begins.
      > XXXXXXXXX.
      >
      > although there is still enough space to display some Chinese chars
      > after the period "." in the first line.
      >
      > Is there any way I can make VIM work as I expect?

      I understand the problem. 'breakat' is a list of characters, thus it
      doesn't allow a regexp or character range. Adding all Chinese
      characters to it would make it much too long.

      Perhaps we could allow character ranges. But previously something like
      "[a-z]" would mean the characters "][az-". Perhaps doubling the square
      brackets isn't too bad: "[[a-z]]"? Otherwise a separate option could be
      used.
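
To make the doubled-bracket idea concrete, here is a minimal sketch in Python. It assumes a syntax in which "[[a-z]]" denotes the range a..z and everything else, single brackets included, stays a literal character. Nothing like this exists in Vim; the function names `parse_breakat` and `is_break_char` are hypothetical and only illustrate how the old and new meanings could coexist.

```python
def parse_breakat(spec):
    """Split a hypothetical 'breakat' value into literal chars and ranges.

    Assumed syntax (not implemented in Vim): "[[a-z]]" denotes the range
    a..z; everything else, including single brackets, stays literal.
    """
    literals = set()
    ranges = []
    i = 0
    while i < len(spec):
        # A range is exactly 7 characters long: "[[", lo, "-", hi, "]]".
        chunk = spec[i:i + 7]
        if (len(chunk) == 7 and chunk.startswith("[[")
                and chunk[3] == "-" and chunk.endswith("]]")):
            ranges.append((chunk[2], chunk[4]))
            i += 7
        else:
            literals.add(spec[i])
            i += 1
    return literals, ranges


def is_break_char(ch, literals, ranges):
    """Return True if ch is a literal break char or falls inside a range."""
    return ch in literals or any(lo <= ch <= hi for lo, hi in ranges)
```

With this sketch, `parse_breakat("[a-z]")` keeps the old meaning (five literal characters), while `parse_breakat(" .[[\u4e00-\u9fff]]")` yields two literals plus one range covering the CJK Unified Ideographs block.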

      Anyway, using a regexp here will certainly slow down processing.
      Currently a 256-entry lookup table is used to speed up processing. That
      won't work for multi-byte characters...
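
The limitation can be illustrated with a small Python sketch of a byte-indexed table. This is not Vim's actual code; `build_table` and `break_bytes` are hypothetical names, and UTF-8 is assumed as the text encoding. The only value taken from Vim is the documented default of 'breakat'.

```python
BREAKAT_DEFAULT = " \t!@*-+;:,./?"  # Vim's documented default for 'breakat'


def build_table(breakat):
    """Return a 256-entry table: table[b] is True if byte b is a break char.

    Encoding to latin-1 fails for any character above U+00FF, which mirrors
    the fact that such characters have no slot in a 256-entry table.
    """
    table = [False] * 256
    for b in breakat.encode("latin-1"):
        table[b] = True
    return table


def break_bytes(text, table):
    """Return byte offsets in the UTF-8 encoding of text where a break is allowed."""
    return [i for i, b in enumerate(text.encode("utf-8")) if table[b]]
```

Each lookup is a single array index, which is why the table is fast. A Chinese character, however, encodes to several bytes in UTF-8, all of them 0x80 or above, so no single table entry can stand for it and `break_bytes` never reports a break inside a run of such characters.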

      --
      Nobody will ever need more than 640 kB RAM.
      -- Bill Gates, 1983
      Windows 98 requires 16 MB RAM.
      -- Bill Gates, 1999
      Logical conclusion: Nobody will ever need Windows 98.

      /// Bram Moolenaar -- Bram@... -- http://www.Moolenaar.net \\\
      /// Sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
      \\\ Project leader for A-A-P -- http://www.A-A-P.org ///
      \\\ Buy LOTR 3 and help AIDS victims -- http://ICCF.nl/lotr.html ///