22166 Re: A little help on look behinds
Oct 17, 2011
--- In firstname.lastname@example.org, "diodeom" <diomir@...> wrote:
> I'd guess you're running this pattern in the (Ctrl+R) dialog
> box instead of in a clip...
It was clear, however, why the expression fails at line #9. But the question was: Why doesn't it match line #10 and #11? Why doesn't the engine just skip the mismatch?
If we start anew from the beginning of line #10 then line #11 will be selected. But when starting from the beginning of the subject string the verb seems to nail the cursor to the beginning of line #8.
Moreover, it doesn't seem to be a matter of running it in the dialog box or in a clip. For example...
achieves only two matches as well: line #5 and #7 (I've omitted '\A' here because it doesn't change the result no matter if used in the dialog or a clip).
Well, I don't want to tax your patience too much with my slow-wittedness. Don't ask -- just be surprised! Obviously, that's the way the verb is designed to work. Once the engine has passed '(*COMMIT)' and the rest of the pattern then fails, the verb prevents any further attempt at all, whether at that position or at any later one. I hope I've learned the lesson...
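For anyone who wants to replay this outside NoteTab, here is a rough sketch in Python. It is only an emulation under a few assumptions of mine: Python's standard `re` module has no '(*COMMIT)', '\K' or '\R', so the pattern is split by hand at the point where the verb sits, `\r?\n` stands in for '\R', a capture group stands in for '\K', the '\A' is left out, and the sample text is assumed to have CRLF line endings. The function names are made up for illustration.

```python
import re

# Sample text from the quoted post, with the line numbers removed.
# Assumption: CRLF line endings, as in a Windows editor.
LINES = [
    "First line", "", "[valid line]", "", "]]] remove", "", "]]] remove", "",
    "[[[ valid line", "more valid lines", "]]] valid line.", "", "[[[ valid line",
]
TEXT = "\r\n".join(LINES) + "\r\n"

# Part of the pattern before the verb: lazy dot-all run, a line break,
# then the look-ahead for ]]] or [[[.
HEAD = re.compile(r"(?s:.+?)\r?\n(?=\]\]\]|\[\[\[)")
# Part of the pattern after the verb: a whole line starting with ]]].
TAIL = re.compile(r"\]\]\].*\r?\n")
# The same pattern with no commit point; the group stands in for \K.
NO_COMMIT = re.compile(HEAD.pattern + "(" + TAIL.pattern + ")")


def matches_with_commit(text, start=0):
    """Emulate the verb: once HEAD has matched, TAIL must match or we stop."""
    found = []
    pos = start
    while True:
        head = HEAD.search(text, pos)   # first way of reaching the verb
        if head is None:
            break                       # verb never reached: ordinary failure
        tail = TAIL.match(text, head.end())
        if tail is None:
            break                       # failed after the verb: give up entirely
        found.append(tail.group())
        pos = tail.end()                # like "Find Next" from the end of the match
    return found


def matches_without_commit(text):
    """Plain backtracking search: the engine may skip over the [[[ line."""
    return [m.group(1) for m in NO_COMMIT.finditer(text)]


if __name__ == "__main__":
    print(matches_with_commit(TEXT))
    print(matches_without_commit(TEXT))
    print(matches_with_commit(TEXT, TEXT.index("more valid")))
```

Under those assumptions the first call should return only the two "]]] remove" lines (#5 and #7), the second should also return line #11, and the third, which starts at line #10, should return line #11 alone. That agrees with what I saw in the dialog box and in the clip.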
PS Also thanks to Sheri for her latest reply!
> Flo wrote:
> > --- In email@example.com, Sheri <silvermoonwoman@> wrote:
> > >
> > > (*COMMIT) says the rest of the pattern must match from here without
> > > backtracking...I guess you could say it creates an anchor in
> > > the middle of the pattern.
> > Sheri,
> > I would be grateful for some more explanation of the verb '(*COMMIT)'.
> > I've tested your clip...
> > (?s)\A.+?\R\K(?=\]\]\]|\[\[\[)(*COMMIT)(?-s)\]\]\].*\R
> > against the following text which is quite similar to John's first sample. For our discussion, I've added line numbers (to be removed when testing):
> > 1 First line
> > 2
> > 3 [valid line]
> > 4
> > 5 ]]] remove
> > 6
> > 7 ]]] remove
> > 8
> > 9 [[[ valid line
> > 10 more valid lines
> > 11 ]]] valid line.
> > 12
> > 13 [[[ valid line
> > It's quite clear to me why the clip removes line #5 and #7 but not #9. But I still can't see why it doesn't remove line #11.
> > If we omit the '\K' we can see two matches:
> > - 1. from start of string to end of line #5
> > - 2. line #6 till end of line #7
> > Next, line #8 and #9 are not matched because line #9 doesn't start with ']]]'.
> > But WHY doesn't the clip jump over that mismatch and move on to select line #10 and #11? IMHO, line #10 should be matched by '(?s)\A.+?\R\K(?=\]\]\]|\[\[\[)' (with or without '\A'), and line #11 by the following '(?-s)\]\]\].*\R'. Why on earth is '(*COMMIT)' preventing this?
> > Thanks for any light you can shed on this!
> I'd guess you're running this pattern in the (Ctrl+R) dialog box instead of in a clip -- where it's meant to ***capture or fail*** only once (on the very first instance of either [[[ or ]]]).
> If you click "Find Next" after #5 and #7, notice that your beginning position for the next attempt is on or after line #7. After the first available alternative "[[[" is spotted by the look-ahead now on line #9, (*COMMIT) demands that at this very location either "]]]" should be found or else the whole pattern should abandon any further matching attempts. Obviously, "[[[" ain't the required "]]]" so the pattern fails by design.