
Re: Large XML causes vim6a0 to crash when entering a " for a new attribute

  • Matthew Winn
    Message 1 of 5 , Aug 1 1:23 AM
      On Wed, Aug 01, 2001 at 08:34:33AM +0200, Johannes Zellner wrote:
      > On Mon, Jul 30, 2001 at 09:19:34AM +0100, Matthew Winn wrote:
      > > This should reduce some of the backtracking required by the expression
      > > (not fully tested because I don't have Vim6 on Windows; it seems to
      > > speed things up on Unix):
      > >
      > > \ #<[^ /!?>"']\(\_[^"'<>]\(\_[^"'<>]\+\)\@>\|"\_[^"]*"\|'\_[^']*'\)*/>#
      > > ^ ^^^^^^^^^^^^^^^^^^ ^
      >
      > I don't understand why this solves the problem. This introduces
      > another nested \+ .. *, doesn't it ? -- maybe I'll turn this
      > from a 'syn match' to a 'syn region' skipping the string
      > patterns. OTOH I always thought that 'match' is faster and
      > more robust than 'region'. Anyway, I'll try to find a solution ...
      > A sample file would be nice, I've also win2000 available for
      > testing.

      Although it introduces another nested expression it also uses \@> to
      prevent backtracking within that expression. The regular expression
      engine doesn't have to try as many different combinations of \+ and *:
      the \([...]\+\)\@> grabs as many characters as possible, but if the
      engine needs to backtrack it backtracks over the whole lot at once
      rather than reducing the matched text by one character and making
      another attempt.
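
A minimal sketch of the "all at once" behaviour described above, assuming a Vim recent enough to provide matchstr(). The sample string 'aab' and the variable names are made up for illustration and do not come from the original syntax file.

```vim
" Plain group: \+ can give back the final 'b', so the literal b still matches.
let plain = matchstr('aab', '\([ab]\+\)b')
" Atomic group: \@> keeps everything it grabbed, so no 'b' is left over and
" the match fails at every start position (matchstr() then returns '').
let atomic = matchstr('aab', '\([ab]\+\)\@>b')
echo [plain, atomic]
" prints ['aab', '']
```

The atomic version fails quickly instead of retrying with shorter and shorter runs, which is the saving the paragraph above describes.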

      There's also an optimisation in the way it skips over characters which
      aren't in strings. As originally written it did...
      \(\_[^"'<>]\| ... \)*
      ...so for each character it had to dive into and out of the grouping.
      It's more efficient to gobble as many characters as possible each time
      the group is entered.
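
The two styles can be compared side by side. This is only a sketch: the attribute text is hypothetical, the patterns are just the grouping part of the expression anchored with ^ and $, and the run-based variant uses the one-or-more form (unquantified atom followed by an atomic *).

```vim
" Hypothetical attribute text; both patterns must consume all of it.
let text = ' id="a<b" class=''x'''
" One character per pass through the group.
let per_char = '^\(\_[^"''<>]\|"\_[^"]*"\|''\_[^'']*''\)*$'
" A whole run of non-quote characters per pass, held atomically.
let per_run = '^\(\_[^"''<>]\(\_[^"''<>]*\)\@>\|"\_[^"]*"\|''\_[^'']*''\)*$'
echo [match(text, per_char), match(text, per_run)]
" prints [0, 0]
```

Both patterns accept the same text; the difference is only in how many times the group has to be entered to get through the characters outside the strings.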

      As I mentioned in another message, the expression I gave is wrong. It
      matches two or more characters in the set \_[^"'<>] at once; it should
      of course be one or more. Vim didn't like it when I tried removing
      the unquantified \_[^"'<>] at the start, so I put it back but forgot
      to change the \+ to a * :-(
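
The difference between the posted and the corrected quantifier can be checked on the run-matching fragment alone. This is a sketch with made-up test strings, anchored with ^ and $ so that the whole string has to match; it is not the full syntax pattern.

```vim
" As posted: an unquantified atom plus an atomic \+ needs two or more characters.
let as_posted = '^\_[^"''<>]\(\_[^"''<>]\+\)\@>$'
" Corrected: changing \+ to * accepts a single character as well.
let corrected = '^\_[^"''<>]\(\_[^"''<>]*\)\@>$'
echo [match('a', as_posted), match('ab', as_posted), match('a', corrected)]
" prints [-1, 0, 0]
```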

      --
      Matthew Winn (matthew@...)