regex help

  • Don - HtmlFixIt.com
    Message 1 of 6 , Apr 6, 2007
      I need to find a regex for the following:
      ^P^P^T ^T^P^T ^T^P^TPDBB186 ^T ^T ^T^P^T^P^T ^P^P

      The PDBB186 could be any number of numbers/letters/spaces between 5 and
      20 all contained on one line.

      I am having trouble matching the ^P, for starters.

      I replaced ^T with \t.
      I replaced ^P with [[:space:]] and it seemed to work once.
      I am not sure how to match it all. I hate regex, much as I try to understand it.

      Here is another example with spaces in the number:
      ^P^P^T ^T^P^T ^T^P^T260 960 818 3 ^T ^T ^T^P^T^P^T ^P^P
    • Sheri
      Message 2 of 6 , Apr 6, 2007
        Don - HtmlFixIt.com wrote:
        > I need to find a regex for the following:
        > ^P^P^T ^T^P^T ^T^P^TPDBB186 ^T ^T ^T^P^T^P^T ^P^P
        >
        > The PDBB186 could be any number of numbers/letters/spaces between 5 and
        > 20 all contained on one line.
        >
        >
        Hi Don,

        There are two ways to go about it when you want to match a string of
        characters literally. One is to escape all the metacharacters in the
        string (in your case, you would have to escape all the caret symbols by
        inserting a backslash \ in front of each one); the other is to surround
        the literal portions with \Q and \E.

        So to match the entire line above literally, either of these would do:

        \^P\^P\^T \^T\^P\^T \^T\^P\^TPDBB186 \^T \^T \^T\^P\^T\^P\^T \^P\^P
        or
        \Q^P^P^T ^T^P^T ^T^P^TPDBB186 ^T ^T ^T^P^T^P^T ^P^P\E

        The part in the middle, you say, can be any combination of letters,
        numbers, and spaces between 5 and 20 characters long.

        That could be specified as [0-9a-zA-Z ]{5,20}

        In the first option, the character class would simply replace the PDBB186
        (the pattern is a single line; the mail client wrapped it):
        \^P\^P\^T \^T\^P\^T \^T\^P\^T[0-9a-zA-Z ]{5,20} \^T \^T \^T\^P\^T\^P\^T \^P\^P

        In the second option you would have to use two sets of \Q and \E (one
        for the part that precedes the numbers, letters and spaces, and another
        for the part that comes after).

        \Q^P^P^T ^T^P^T ^T^P^T\E[0-9a-zA-Z ]{5,20}\Q ^T ^T ^T^P^T^P^T ^P^P\E

        So, either of the above should work.
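For readers following along outside NoteTab, here is a minimal sketch of the same idea in Python. It assumes the ^P and ^T markers are literal text, as in the patterns above. Python's `re` module has no \Q...\E, so `re.escape` is used as the stand-in; the names `PREFIX`, `SUFFIX` and `samples` are made up for this illustration.

```python
import re

# Literal marker text before and after the variable part, as in the post.
PREFIX = "^P^P^T ^T^P^T ^T^P^T"
SUFFIX = " ^T ^T ^T^P^T^P^T ^P^P"

# re.escape backslash-escapes the carets (and, harmlessly, the spaces),
# which plays the role that \Q...\E plays in PCRE-style engines.
pattern = re.compile(
    re.escape(PREFIX) + r"[0-9a-zA-Z ]{5,20}" + re.escape(SUFFIX)
)

samples = [
    "^P^P^T ^T^P^T ^T^P^TPDBB186 ^T ^T ^T^P^T^P^T ^P^P",
    "^P^P^T ^T^P^T ^T^P^T260 960 818 3 ^T ^T ^T^P^T^P^T ^P^P",
]
# True for each sample that the pattern matches from start to end.
matches = [pattern.fullmatch(s) is not None for s in samples]
```

Both sample lines from the thread should match in full, while a middle part shorter than five characters should not.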

        There is a clip in my Clipcode Syntax package that will automatically
        escape a literal string for you.

        You can call it like this:

        H="regesc highlighted text"
        ^!Set %myvarname%="stuff4pattern"
        ^!Set %stuff4pattern%="^$GetSelection$"
        ^!FarClip "^$GetAppPath$cliphelp.clh:GetRegEscape"
        ^!Toolbar New Document
        ^!InsertCode ^%stuff4pattern%
        ;end of clip

        If I have misunderstood what you are doing, and the ^P is supposed to
        represent a line break and ^T is supposed to represent a tab, you will
        need to replace ^T with \t and ^P with \r\n

        Or, In NoteTab 5.2, you could try \R (which is a new metacharacter)
        instead of \r\n.
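As a rough illustration of why a "any line break" token is handy: Python's `re` module does not offer \R, so the sketch below spells out an alternation covering only the three common line endings. This is an assumption-laden stand-in, not a full equivalent of \R, and the name `NEWLINE` is hypothetical.

```python
import re

# Stand-in for \R limited to the three common line endings; \r\n is listed
# first so a Windows line break is consumed as one unit.
NEWLINE = r"(?:\r\n|\n|\r)"

# A tab, one line break of any of the three kinds, then another tab.
pattern = re.compile(r"\t" + NEWLINE + r"\t")

counts = [
    len(pattern.findall(text))
    for text in ("a\t\r\n\tb", "a\t\n\tb", "a\t\r\tb")
]
```

Each of the three test strings contains exactly one tab, line break, tab sequence, whichever line-ending convention it uses.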

        Regards,
        Sheri
      • Don - HtmlFixIt.com
        Message 3 of 6 , Apr 6, 2007
          Yes, this is what I meant by the ^P and ^T.
          I suspect you have given me enough to figure it out. It was partially
          the \r\n that was throwing me. I was trying one or the other and getting
          nowhere. I didn't know you had a regex clip set. Give me the
          advertisement for that! Where do I find it, and so forth?

          > If I have misunderstood what you are doing, and the ^P is supposed to
          > represent a line break and ^T is supposed to represent a tab, you will
          > need to replace ^T with \t and ^P with \r\n
          >
          > Or, In NoteTab 5.2, you could try \R (which is a new metacharacter)
          > instead of \r\n.
        • Don - HtmlFixIt.com
          Message 4 of 6 , Apr 7, 2007
            Sheri wrote:
            > Don - HtmlFixIt.com wrote:
            >> I need to find a regex for the following:
            >> ^P^P^T ^T^P^T ^T^P^TPDBB186 ^T ^T ^T^P^T^P^T ^P^P
            >>
            >> The PDBB186 could be any number of numbers/letters/spaces between 5 and
            >> 30 all contained on one line.
            >>

            Here is what did the job for me Sheri:
            \r\n\r\n\t \t\r\n\t \t\r\n\t[0-9a-zA-Z ]{5,30}\t \t \t\r\n\t\r\n\t
            \r\n\r\n
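A quick way to sanity-check a pattern like this is to build the sample text from the ^P/^T notation and run the pattern against it. The Python sketch below does that. It assumes the wrapped pattern above is joined with a single space (matching the " ^P^P" at the end of the original layout) and that ^P means \r\n; the helper `to_real` is invented for this example.

```python
import re

# ^P -> \r\n and ^T -> \t, applied to the layout from the first post.
pattern = re.compile(
    r"\r\n\r\n\t \t\r\n\t \t\r\n\t"
    r"[0-9a-zA-Z ]{5,30}"
    r"\t \t \t\r\n\t\r\n\t \r\n\r\n"
)

def to_real(text):
    """Turn the ^P / ^T notation into real CRLF and tab characters."""
    return text.replace("^P", "\r\n").replace("^T", "\t")

sample = to_real("^P^P^T ^T^P^T ^T^P^T260 960 818 3 ^T ^T ^T^P^T^P^T ^P^P")
found = pattern.search(sample)
```

Note that the space after the code ("260 960 818 3 ") is absorbed by the character class, since the class itself allows spaces.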


            I am now onto my next problem, I want to find this:
            \$ \t30\.74 \t\r\n\t\r\n\t

            Where 30 could be any number of digits and 74 could be any two digits.
            I think this finds it:
            \$ \t[0-9]{0,7}\.[0-9]{2} \t\r\n\t\r\n\t

            However, I then want to replace it like this:
            $ \t####.## \r\n\r\n\t

            My problem is that I cannot get the replace part of it working. Does
            each part of that become a $1 $2 and so forth?
            The examples in the help file don't seem to work:
            Changes all h2 tags to h3:

            Find: <h2>(.*)</h2>

            Replace with: <h3>$1</h3>



            Places each encountered word on a single line (Replace All can take
            quite long on big files!):

            Find: \w*(['$#a-z0-9]+)\w*

            Replace with: $1\r\n



            Converts all encountered e-mail addresses to HTML Mailto links:

            Find: [a-z_.-0-9]+@[a-z_.-0-9]+

            Replace with: <a href="mailto:$0">$0</a>


            Because you now escape things like the < and the > sign I think??
          • Sheri
            Message 5 of 6 , Apr 7, 2007
              Hi Don,

              Escaping metacharacters is only relevant to the search side of a replace
              command; the replacement string is *not* a regular expression.

              $0 in the replacement string refers to the entire found string. There is
              always a $0.

              $1, $2, etc., are captured substrings. Captured substrings are parts of
              the regular expression that are in parentheses. Pairs of parentheses are
              counted from left to right, and unless you have nested parentheses or a
              long, complicated pattern with many parentheses, it's easy to count them.
              (Learn to name substrings to avoid the need for counting or else use
              ^$GetReSubstrings$ to help you identify a substring by number in a
              complex pattern).
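The counting rule can be seen in a small Python sketch. The sample text "total 30.74 USD" and the group name `unit` are made up for illustration; Python writes named groups as (?P<name>...) and exposes them through `group()` rather than $1, so this shows the numbering idea only, not NoteTab's syntax.

```python
import re

# Groups are numbered by the position of their opening parenthesis:
# group 1 is the outer pair, groups 2 and 3 are nested inside it,
# and the named group "unit" is also group 4.
m = re.search(r"((\d+)\.(\d{2})) (?P<unit>[A-Z]{3})", "total 30.74 USD")

whole = m.group(0)      # like $0: the entire found string
amount = m.group(1)     # outer parentheses
dollars = m.group(2)    # first nested pair
cents = m.group(3)      # second nested pair
unit = m.group("unit")  # named access, no counting needed
```

Naming the group means the reference keeps working even if more parentheses are later added in front of it.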

              So if you want to create a replacement string that references part of
              the found string, put parentheses around that part of your pattern.

              For example, with this:

              (\$ \t[0-9]{0,7}\.[0-9]{2}) \t\r\n\t\r\n\t

              The dollar sign, numbers, decimal point and 2 numbers will be in $1.
              Then your replace string could be:

              $1\r\n\r\n\t

              If that's what you need.
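The same find and replace can be tried in Python as a rough cross-check. This sketch assumes Python's `re.sub`, where the captured substring is written \g<1> instead of $1; the sample text and the trailing word "next" are invented for the example.

```python
import re

# Same search pattern as above: the amount is captured in group 1,
# the trailing space, tabs and line breaks are matched but not captured.
find = re.compile(r"(\$ \t[0-9]{0,7}\.[0-9]{2}) \t\r\n\t\r\n\t")

text = "$ \t30.74 \t\r\n\t\r\n\tnext"

# \g<1> inserts the captured amount; \r, \n and \t in the template are
# turned into real carriage return, line feed and tab characters.
result = find.sub(r"\g<1>\r\n\r\n\t", text)
```

Everything outside the parentheses is dropped unless the replacement string puts it back, which is why the space and tab after the amount disappear here.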

              >
              > My problem is that I cannot get the replace part of it working. Does
              > each part of that become a $1 $2 and so forth?
              > The examples in the help file don't seem to work:
              > Changes all h2 tags to h3:
              >
              > Find: <h2>(.*)</h2>
              >
              > Replace with: <h3>$1</h3>
              >
              >
              It works fine for me, what exactly is your clip command? Here's mine:

              ^!Replace "<h2>(.*)</h2>" >> "<h3>$1</h3>" RAWS

              > Because you now escape things like the < and the > sign I think??
              Not normally critical in the case of angle brackets (though angle
              brackets are special characters in some regex contexts, they are not
              actually metacharacters).
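For comparison, the help-file example behaves the same way in Python's `re` module, with the angle brackets left unescaped. The sample HTML is invented, and Python writes the captured substring as \1 rather than $1. The second half of the sketch shows one thing worth watching for: a greedy .* spans two headings that share a line, and a lazy .*? avoids that.

```python
import re

html = "<h2>First</h2>\n<h2>Second</h2>"
# < and > are matched literally without escaping; by default . does not
# cross line breaks, so each heading is matched on its own line.
converted = re.sub(r"<h2>(.*)</h2>", r"<h3>\1</h3>", html)

same_line = "<h2>A</h2><h2>B</h2>"
# Greedy .* runs to the last </h2> on the line.
greedy = re.sub(r"<h2>(.*)</h2>", r"<h3>\1</h3>", same_line)
# Lazy .*? stops at the first </h2>, so each heading is handled separately.
lazy = re.sub(r"<h2>(.*?)</h2>", r"<h3>\1</h3>", same_line)
```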

              Regards,
              Sheri
            • Adrien Verlee
              Message 6 of 6 , Apr 10, 2007
                Hello,

                Case:
                [0] or [*] or [a] or [56] or [551] or [1005] or [666a] or [1a] ...
                etc. are what I need to find.
                The numbers, or numbers and characters, are always between the square
                brackets. The numbers and characters, or a combination of them, are
                variable; the brackets are always there. And when the find command
                matches, the square brackets must also be selected.

                Until now I have used: \[.] or \[..], etc., or: \[\d]
                But I cannot find a way to match all the variants at once. If, for
                example, there is one [5a] in a range of [0] - [1000], I must find it
                manually.

                The help file about Regular Expressions was a bit too complicated for
                me to fully understand.
                --
                adrien
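One possible approach to the bracket question, sketched in Python rather than NoteTab (the thread itself does not give an answer, so treat this as an untested-in-NoteTab suggestion): match an opening bracket, then one or more characters that are not a closing bracket or a line break, then the closing bracket. The second pattern, which uses a negative lookahead, is an extra assumption about what "finding the odd [5a]" might mean.

```python
import re

# \[ and \] are literal brackets; [^\]\r\n]+ is one or more characters
# that are neither a closing bracket nor a line break.
any_item = re.compile(r"\[[^\]\r\n]+\]")

text = "[0] [*] [a] [56] [551] [1005] [666a] [1a]"
found = any_item.findall(text)

# Variant: skip items that contain digits only, so that an odd entry such
# as [5a] stands out among [0] ... [1000].
not_numeric = re.compile(r"\[(?!\d+\])[^\]\r\n]+\]")
odd = not_numeric.findall("[0] [5a] [1000]")
```

In both patterns the brackets are part of the match, so they would be included in the selection.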