Re: [Clip] regex help

  • Sheri
    Message 1 of 6 , Apr 6, 2007
      Don - HtmlFixIt.com wrote:
      > I need to find a regex for the following:
      > ^P^P^T ^T^P^T ^T^P^TPDBB186 ^T ^T ^T^P^T^P^T ^P^P
      >
      > The PDBB186 could be any number of numbers/letters/spaces between 5 and
      > 20 all contained on one line.
      >
      >
      Hi Don,

      There are two ways to go about it when you want to match a string of
      characters literally. One is to escape all the metacharacters in the
      string (in your case, you would have to escape all the caret symbols by
      inserting a backslash \ in front of each one) or surround the literal
      portions with \Q and \E.

      So to match the entire line above literally, either of these would do:

      \^P\^P\^T \^T\^P\^T \^T\^P\^TPDBB186 \^T \^T \^T\^P\^T\^P\^T \^P\^P
      or
      \Q^P^P^T ^T^P^T ^T^P^TPDBB186 ^T ^T ^T^P^T^P^T ^P^P\E

      The part in the middle, you say, can be any combination of letters,
      numbers, and spaces between 5 and 20 characters long.

      That could be specified as [0-9a-zA-Z ]{5,20}

      In the first option, it would just replace the PDBB186:
      \^P\^P\^T \^T\^P\^T \^T\^P\^T[0-9a-zA-Z ]{5,20} \^T \^T \^T\^P\^T\^P\^T \^P\^P

      In the second option you would have to use two sets of \Q and \E (one
      for the part that precedes the numbers, letters and spaces, and another
      for the part that comes after).

      \Q^P^P^T ^T^P^T ^T^P^T\E[0-9a-zA-Z ]{5,20}\Q ^T ^T ^T^P^T^P^T ^P^P\E

      So, either of the above should work.
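
For readers trying the same idea outside NoteTab, here is a minimal sketch in Python. It is an assumption that Python is a useful comparison at all: its `re` module does not support \Q and \E, so `re.escape` stands in for them. The names `prefix`, `suffix` and `sample` are made up for the illustration.

```python
import re

# Literal text on either side of the variable part (the carets are literal here).
prefix = "^P^P^T ^T^P^T ^T^P^T"
suffix = " ^T ^T ^T^P^T^P^T ^P^P"

# re.escape backslash-escapes metacharacters such as ^,
# which is the job \Q...\E does in the patterns above.
pattern = re.compile(re.escape(prefix) + r"[0-9a-zA-Z ]{5,20}" + re.escape(suffix))

sample = "^P^P^T ^T^P^T ^T^P^TPDBB186 ^T ^T ^T^P^T^P^T ^P^P"
match = pattern.fullmatch(sample)
print(match is not None)
```

Only the two literal pieces are escaped; the character class in the middle stays a real regular expression.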

      There is a clip in my Clipcode Syntax package that will automatically
      escape a literal string for you.

      You can call it like this:

      H="regesc highlighted text"
      ^!Set %myvarname%="stuff4pattern"
      ^!Set %stuff4pattern%="^$GetSelection$"
      ^!FarClip "^$GetAppPath$cliphelp.clh:GetRegEscape"
      ^!Toolbar New Document
      ^!InsertCode ^%stuff4pattern%
      ;end of clip

      If I have misunderstood what you are doing, and the ^P is supposed to
      represent a line break and ^T is supposed to represent a tab, you will
      need to replace ^T with \t and ^P with \r\n

      Or, in NoteTab 5.2, you could try \R (which is a new metacharacter)
      instead of \r\n.

      Regards,
      Sheri
    • Don - HtmlFixIt.com
      Message 2 of 6 , Apr 6, 2007
        yes this is what I meant by the ^P^T
        I suspect you have given me enough to figure it out. It was partially
        \r\n that was tossing me. I was trying one or the other and getting
        nowhere. I didn't know you had a regex clip set. Give me the
        advertisement for that! Where do I find it and so forth.

        > If I have misunderstood what you are doing, and the ^P is supposed to
        > represent a line break and ^T is supposed to represent a tab, you will
        > need to replace ^T with \t and ^P with \r\n
        >
        > Or, In NoteTab 5.2, you could try \R (which is a new metacharacter)
        > instead of \r\n.
      • Don - HtmlFixIt.com
        Message 3 of 6 , Apr 7, 2007
          Sheri wrote:
          > Don - HtmlFixIt.com wrote:
          >> I need to find a regex for the following:
          >> ^P^P^T ^T^P^T ^T^P^TPDBB186 ^T ^T ^T^P^T^P^T ^P^P
          >>
          >> The PDBB186 could be any number of numbers/letters/spaces between 5 and
          >> 30 all contained on one line.
          >>

          Here is what did the job for me Sheri:
          \r\n\r\n\t \t\r\n\t \t\r\n\t[0-9a-zA-Z ]{5,30}\t \t \t\r\n\t\r\n\t \r\n\r\n
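
A quick way to check a pattern like this outside NoteTab is sketched below in Python. Two assumptions are made: the wrapped pattern is meant as a single line with one space at the wrap, and the sample text is built by translating the ^P/^T notation from the original question into \r\n and \t. The variable names are illustrative only.

```python
import re

# The visible notation from the question: ^P = line break, ^T = tab.
visible = "^P^P^T ^T^P^T ^T^P^TPDBB186 ^T ^T ^T^P^T^P^T ^P^P"
sample = visible.replace("^P", "\r\n").replace("^T", "\t")

# The search pattern, written on one line (a space is assumed at the wrap).
pattern = re.compile(
    r"\r\n\r\n\t \t\r\n\t \t\r\n\t[0-9a-zA-Z ]{5,30}\t \t \t\r\n\t\r\n\t \r\n\r\n"
)

match = pattern.fullmatch(sample)
print(match is not None)
```

The space between the code and the following tab is absorbed by the character class, since the class itself allows spaces.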


          I am now onto my next problem, I want to find this:
          \$ \t30\.74 \t\r\n\t\r\n\t

          Where 30 could be any number of digits and 74 could be any two digits.
          I think this finds it:
          \$ \t[0-9]{0,7}\.[0-9]{2} \t\r\n\t\r\n\t

          However, I then want to replace it like this:
          $ \t####.## \r\n\r\n\t

          My problem is that I cannot get the replace part of it working. Does
          each part of that become a $1 $2 and so forth?
          The examples in the help file don't seem to work:
          Changes all h2 tags to h3:

          Find: <h2>(.*)</h2>

          Replace with: <h3>$1</h3>



          Places each encountered word on a single line (Replace All can take
          quite long on big files!):

          Find: \w*(['$#a-z0-9]+)\w*

          Replace with: $1\r\n



          Converts all encountered e-mail addresses to HTML Mailto links:

          Find: [a-z_.-0-9]+@[a-z_.-0-9]+

          Replace with: <a href="mailto:$0">$0</a>


          Because you now escape things like the < and the > sign I think??
        • Sheri
          Message 4 of 6 , Apr 7, 2007
            Hi Don,

            Escaping metacharacters is only relevant to the search side of a replace
            command; the replacement string is *not* a regular expression.

            $0 in the replacement string refers to the entire found string. There is
            always a $0.

            $1, $2, etc., are captured substrings. Captured substrings are parts of
            the regular expression that are in parentheses. Pairs of parentheses are
            counted from left to right, and unless you have nested parentheses or a
            long, complicated pattern with many parentheses, it's easy to count them.
            (Learn to name substrings to avoid the need for counting or else use
            ^$GetReSubstrings$ to help you identify a substring by number in a
            complex pattern).

            So if you want to create a replacement string that references part of
            the found string, put parentheses around that part of your pattern.

            For example, with this:

            (\$ \t[0-9]{0,7}\.[0-9]{2}) \t\r\n\t\r\n\t

            The dollar sign, space, tab, digits, decimal point and two digits will be in $1.
            Then your replace string could be:

            $1\r\n\r\n\t

            If that's what you need.
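
The same capture-and-replace idea can be sketched in Python for comparison. Note the assumption: Python writes the back-reference as \1 or \g<name>, where NoteTab uses $1, and the sample `text` and the group name `amount` are invented for the illustration.

```python
import re

# Same search pattern as above; the parentheses capture the amount as group 1.
pattern = re.compile(r"(\$ \t[0-9]{0,7}\.[0-9]{2}) \t\r\n\t\r\n\t")

# \1 refers to the first captured substring; \r, \n and \t in the
# replacement template are turned into the real control characters.
replacement = r"\1\r\n\r\n\t"

text = "$ \t30.74 \t\r\n\t\r\n\tnext"
result = pattern.sub(replacement, text)
print(repr(result))

# Naming the group avoids counting parentheses in longer patterns.
named = re.compile(r"(?P<amount>\$ \t[0-9]{0,7}\.[0-9]{2}) \t\r\n\t\r\n\t")
named_result = named.sub(r"\g<amount>\r\n\r\n\t", text)
```

Everything outside the parentheses (the trailing space, tab and line breaks) is matched but dropped, because only the captured part is put back.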

            >
            > My problem is that I cannot get the replace part of it working. Does
            > each part of that become a $1 $2 and so forth?
            > The examples in the help file don't seem to work:
            > Changes all h2 tags to h3:
            >
            > Find: <h2>(.*)</h2>
            >
            > Replace with: <h3>$1</h3>
            >
            >
            It works fine for me, what exactly is your clip command? Here's mine:

            ^!Replace "<h2>(.*)</h2>" >> "<h3>$1</h3>" RAWS

            > Because you now escape things like the < and the > sign I think??
            Not normally critical in the case of angle brackets (though angle
            brackets are special characters in some regex contexts, they are not
            actually metacharacters).
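
As a cross-check outside NoteTab, the help-file example behaves the same way in a Python sketch, with the angle brackets left unescaped. This assumes Python's \1 in place of NoteTab's $1; the `html` sample is made up.

```python
import re

html = "<h1>Top</h1>\n<h2>First section</h2>\n<h2>Second section</h2>"

# No escaping is needed for < and > on the search side,
# and the replacement is plain text plus the \1 back-reference.
converted = re.sub(r"<h2>(.*)</h2>", r"<h3>\1</h3>", html)
print(converted)
```

Because the dot does not match a line break by default, each heading is matched on its own line.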

            Regards,
            Sheri
          • Adrien Verlee
            Message 5 of 6 , Apr 10, 2007
              Hello,

              Case:
              I need to find [0] or [*] or [a] or [56] or [551] or [1005] or [666a] or
              [1a] ... etc.
              The numbers, or numbers and characters, are always between the square
              brackets. The numbers and characters, or a combination of them, are
              variable; the brackets are always there. And if the find command
              matches, the square brackets must also be selected.

              Until now I have used: \[.] or \[..], etc., or: \[\d]
              But I cannot find a way to match all the variations at once. If, for
              example, there is one [5a] in a range of [0] - [1000], I must find it
              manually.

              The help file about Regular Expressions was a bit too complicated for
              me to fully understand.
              --
              adrien
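
One possible approach to the bracket question, offered as a sketch rather than as the list's answer: match an opening bracket, then one or more characters that are not a closing bracket or a line break, then the closing bracket. The pattern below is an assumption about what is wanted (it is shown in Python, not as a NoteTab clip), and the `sample` string is taken from the cases listed in the question.

```python
import re

# \[          a literal opening bracket
# [^\]\r\n]+  one or more characters that are not ] or a line break
# \]          a literal closing bracket
pattern = re.compile(r"\[[^\]\r\n]+\]")

sample = "[0] [*] [a] [56] [551] [1005] [666a] [1a]"
found = pattern.findall(sample)
print(found)
```

Each match includes the square brackets themselves, and an empty pair such as [] is skipped because at least one character is required between them.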