24120 RE: RE: RE: RE: [Clip] REGEX Search Backward
- Nov 2, 2013
I tested on a regular document, and the following worked to capture the last URL.
^!Find "(?s)\A.+\Khttps?://[^\x20\r\n<>]+" AIORSW
Note the change in options. It consistently captures the last URL regardless of where in the document the cursor is positioned. If you have any HTML in your document, you should add a double quote to the negated character class so that a trailing double quote that is not part of the URL is not captured.
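To illustrate the point about the double quote outside NoteTab, here is a minimal sketch in Python. Two assumptions: Python's re module does not support \K, so a capture group stands in for it, and the sample text and example hostnames are made up for the demonstration.

```python
import re

# Hypothetical sample text: the last URL sits inside an HTML attribute.
text = 'Intro http://first.example/a\n<a href="https://last.example/page">link</a>\n'

# Python's re has no \K, so group 1 marks the part NoteTab would select.
without_quote = re.compile(r'(?s)\A.+(https?://[^\x20\r\n<>]+)', re.IGNORECASE)
# Same pattern with a double quote added to the negated character class.
with_quote = re.compile(r'(?s)\A.+(https?://[^\x20\r\n<>"]+)', re.IGNORECASE)

url_without_quote = without_quote.search(text).group(1)
url_with_quote = with_quote.search(text).group(1)

print(repr(url_without_quote))  # the closing quote of the attribute is included
print(repr(url_with_quote))     # the match stops before the closing quote
```

The only difference between the two patterns is the extra " inside the brackets; everything else matches the search shown above.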
Don't beat yourself up. You are the alpha smart guy here.
I tried your latest regex, and it worked, but only if I began the search from the start of the text. Is that what you intended?
I suppose it is reasonable that I cannot start the search from the end of the text, but I find it odd that I cannot start the search from anywhere before the last URL.
Greed and \K are not related; \K only defines where the reported match starts. Oh, wait! I didn't specify the \A, and that's why it didn't work. Duh.
^!Find "(?s)\A.+\Khttps?://[^\x20\r\n<>]+" IORS
A lesson in regex - it can be pretty tricky, but it works. It is SO easy to overlook little things that make things not work as expected. I have stared at lines for a long time and not seen what is wrong - I call them dumb loops, and it happens to me far too often.
That worked, but only if I started my search anywhere above the last instance of a URL in the text.
A search for (?s)\A did not find the top of the text like I thought you had indicated. Are these regex options that must be specified before search criteria? Oh, wait - I wrote it without the \A! Dumb.
The search ^!Find "https?://[^\x20\r\n<>]+" IORS finds the next instance of a URL.
How does \K 'reset start of match' (defined in NTP regex help file) equate to greed?
Because regex quantifiers are 'greedy' by default, a leading .+ will always run to the last instance that meets the criteria unless something restrains it. Also, you don't need commas between the options. I have never used that 'B' option, but it is not part of the regular expression; it is part of NoteTab's scripting language. The regular expression engine comes from PCRE and is regularly updated, while the NoteTab scripting language is independent of PCRE, apart from letting you USE regex in its scripts.
^!Find "(?s).+\Khttps?://[^\x20\r\n<>]+" IORS
should work. The leading .+ tells it to gather everything up to the term of interest, even if that means passing over that term multiple times, so it finds the last one. If you use an A option, it will only find the first one (or you could use .+? to achieve the same goal). The \K says to ignore everything up to that last http. This assumes you only want to highlight/select that last URL.
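The greedy versus lazy behaviour described above can be checked with a small Python sketch. This is an illustration under assumptions, not NoteTab code: Python's re module lacks \K, so a capture group plays its role, and the sample text with its three example URLs is invented.

```python
import re

# Hypothetical sample text containing three URLs.
text = "a http://one.example/x\nb http://two.example/y\nc http://three.example/z\n"

URL = r'https?://[^\x20\r\n<>]+'

# Greedy .+ swallows as much as it can, then backs off to the LAST URL.
greedy = re.search(r'(?s)\A.+(' + URL + ')', text)
# Lazy .+? takes as little as it can, so it stops at the FIRST URL.
lazy = re.search(r'(?s)\A.+?(' + URL + ')', text)

last_url = greedy.group(1)
first_url = lazy.group(1)

print(last_url)
print(first_url)

# The overall match still begins at offset 0; group 1 begins where \K
# would reset the start of the match in a PCRE-based search.
print(greedy.start(), greedy.start(1))
```

In both cases the whole match starts at the top of the text; only the position where the captured URL begins differs, which is the part \K exposes as the selection.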
I have a habit of always placing my options in alphabetical order and upper case, which enables me to always find a specific set of options, knowing that they will be in only one order, regardless of how many there are.
I do not understand how to use your suggestion. Perhaps you can provide an example. Here is what I attempted to do from the bottom of my outline topic text, last character:
^!Find "https?://[^\x20\r\n<>]+" R,I,O,S,B
; R: Specifies that the search criteria represents a regular expression.
; I: Ignores character case.
; O: Only searches in current outline topic.
; S: Silent search. NoteTab will not display any message box.
; B: Searches backwards.
How would you rewrite the Find command?
My objective is to find the last URL in the topic.
One way to do that is to start your search with (?s)\A (the \A anchors the search at the beginning of the document, and (?s) lets . match line breaks), followed by the item you are looking for and then the last item that you are looking 'backwards' from. This is still a downward search, only you have anchored it to start at the beginning of the doc and run down to your final term.
I am new to REGEX searches.
I attempted a REGEX search backwards in an outline topic for any URL, but got a 'could not find, start at the beginning' standard message. The search works great forward. The outline topic has many URLs in it.
Is this an issue with REGEX? Any ideas?