24129 RE: [Clip] REGEX Search Backward

  • John Shotsky
    Nov 3, 2013

      Additionally, many URLs are enclosed in angle brackets. To start the capture at the beginning of the URL in every case, without capturing the angle brackets when they are present, another negative class should be added after the .+ term so that none of these characters gets swallowed by its greediness.

      ^!Find "(?s).+[^\r\n</\"][<"]*\K(https?://|www\.)[^\x20"\r\n<>]+" IORSW

      So now the part matched by .+ and the negative class can't end with <, " or /. If < or " are present, they are skipped but not captured. Now the whole URL is captured whether it starts with http or with www.
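
For anyone who wants to try the idea outside NoteTab, here is a rough Python sketch. It is an adaptation, not the clip itself: Python's re module has no \K, so the URL is taken from a capture group instead. The skip class here holds only < and ". The helper name last_url and the sample strings are made up for illustration.

```python
import re

# Adaptation for Python's re module (assumption: no \K support there,
# so the URL is returned from capture group 1 instead).
#   .+            greedy, so the last URL in the text wins
#   [^\r\n</"]    the character before the URL may not be <, / or "
#   [<"]*         an opening < or " is skipped but stays outside the group
LAST_URL = re.compile(r'(?s).+[^\r\n</"][<"]*((?:https?://|www\.)[^\x20"\r\n<>]+)')


def last_url(text):
    """Return the last URL found in text, or None (hypothetical helper)."""
    match = LAST_URL.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    samples = (
        'see http://www.logicalchess.com/ for details',
        'see <http://www.logicalchess.com/> for details',
        'see "www.logicalchess.com/" for details',
    )
    for sample in samples:
        print(last_url(sample))
```

Like the clip, this needs at least two characters ahead of the URL, and a URL sitting at the very start of a line is not found because the negative class also excludes \r and \n.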

      Regards,
      John
      RecipeTools Web Site: http://recipetools.gotdns.com/
      John's Mags Yahoo Group:  http://groups.yahoo.com/group/johnsmags/

      From: ntb-clips@yahoogroups.com [mailto:ntb-clips@yahoogroups.com] On Behalf Of Axel Berger
      Sent: Saturday, November 02, 2013 23:58
      To: ntb-clips@yahoogroups.com
      Subject: Re: [Clip] REGEX Search Backward

      nullclip@... wrote:

      > The regex finds and highlights only www.logicalchess.com/ instead of
      > the full http://www.logicalchess.com/.

      Yes, John already mentioned that problem himself. If the start can be
      either http or www and the term before is greedy, then you'll capture as
      little as possible. To solve this you have to look at what always comes
      directly before your string. It may be an equals (=) or a quote, if the
      URL is always placed in quotes. Assuming the latter I get:

      ^!Find "(?s).+"\K(https?://|www\.)[^\x20"\r\n<>]+" IORSW

      As you never specified what comes outside your search string, I had to
      guess here.
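
The effect is easy to reproduce outside NoteTab. The Python sketch below is only an illustration under a few assumptions: the unanchored pattern is a guess at the kind of regex that showed the problem, the sample HTML line is invented, and a capture group stands in for \K, which Python's re module does not have.

```python
import re

# URL part shared by both patterns; group 1 plays the role of \K.
URL = r'((?:https?://|www\.)[^\x20"\r\n<>]+)'

# Greedy prefix with nothing pinning the start: .+ also swallows "http://".
UNANCHORED = re.compile(r'(?s).+' + URL)
# Same prefix, but the URL must directly follow a double quote.
AFTER_QUOTE = re.compile(r'(?s).+"' + URL)


def find_last(pattern, text):
    """Return group 1 of the match, or None (hypothetical helper)."""
    match = pattern.search(text)
    return match.group(1) if match else None


# Invented sample line with the URL in quotes.
sample = '<a href="http://www.logicalchess.com/">Logical Chess</a>'

if __name__ == "__main__":
    print(find_last(UNANCHORED, sample))   # www.logicalchess.com/
    print(find_last(AFTER_QUOTE, sample))  # http://www.logicalchess.com/
```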

      Axel
