Changing Case with Regex

  • John Shotsky
    Message 1 of 10, Jun 8, 2008
      I am processing fairly large (~1M) text files, in which certain words
      need to be capitalized, or upper cased. I would like to use ^!Replace
      statements to change them.

      As an example text:

      Title: a dog's tale

      I would like to replace as follows:
      ^!Replace "^(Title: )(.+)\R" >> "$1uppercase$2\n" ARTSW

      The result would be

      Title: A DOG'S TALE

      I have been able to do it with a short subroutine, using the Toolbar
      Uppercase command, but that adds a lot of time to the processing of
      large files.

      Can anyone help?

      Does anyone who reads this forum know if the \L, \l, \U, \u commands
      are going to be added any time soon? (They would make the above
      function easy.)
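
For comparison only, here is a minimal sketch of the requested transformation outside NoteTab, in Python. It assumes the file can be read into a string; the function name `uppercase_title_text` is made up for this illustration. A replacement callback stands in for the `$1\U$2` style of token that the question asks about.

```python
import re

def uppercase_title_text(text: str) -> str:
    """Upper-case whatever follows 'Title: ' on each matching line."""
    # Group 1 is the literal prefix, group 2 is the rest of the line.
    pattern = re.compile(r"^(Title: )(.+)$", re.MULTILINE)
    # The callback keeps group 1 as-is and upper-cases group 2 only.
    return pattern.sub(lambda m: m.group(1) + m.group(2).upper(), text)

sample = "Title: a dog's tale\nAuthor: someone\n"
print(uppercase_title_text(sample))
```

Lines that do not start with `Title: ` pass through unchanged, and the prefix itself keeps its original case.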

      John
    • Sheri
      Message 2 of 10, Jun 8, 2008
        NoteTab uses a third-party Delphi component in its regex processing; the
        Delphi component provides the replacement tokens. In February this year,
        I corresponded with the component's author and suggested some
        enhancements, including case modifiers. His disappointing response was
        that he preferred not to implement non-standard regex replacements.

        You can make a NoteTab clip that uses ^!Find instead of ^!Replace
        commands, loops through the document, and uses the ^$StrUpper$ function
        to transform the highlighted text. For example

        ^!Find "(?<=Title: ).+" RS

        should highlight the right thing.

        ^!InsertText ^$StrUpper(^$GetSelection$)$

        should transform it to upper case.
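
The look-behind idea (match only the text after the prefix, then transform the match) is not specific to NoteTab. As a rough sketch in Python, with the names `TITLE_TEXT` and `upper_after_title` invented for the example:

```python
import re

# Look-behind: the match starts after "Title: ", so the prefix is
# never part of the matched ("highlighted") text.
TITLE_TEXT = re.compile(r"(?<=Title: ).+")

def upper_after_title(text: str) -> str:
    # m.group(0) is the matched text only; it is replaced by its upper-case form.
    return TITLE_TEXT.sub(lambda m: m.group(0).upper(), text)
```

As in the ^!Find pattern above, the expression is not anchored to the start of a line, so it would also match `Title: ` appearing in the middle of a line.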

        If you would care to install PowerPro, I could help you do that with
        a fast regex.pcrereplace function that includes case modifiers, e.g.,
        $U2. It also supports a count index token $#. Its other replacement
        tokens are pretty much identical to NoteTab's. PowerPro is not an
        editor. It has Windows scripting capability. There is a ready made GUI
        script available for it that allows you to display a file, process regex
        replacements, and save the result.

        The main program is here: http://powerpro.webeddie.com/

        Regards,
        Sheri
      • Don - HtmlFixIt.com
        Message 3 of 10, Jun 8, 2008
          I was thinking if you just use the function that assigns regex to
          variables, then you could do it that way, but I can never find that
          function in help. It is a fairly new function ... I think we were just
          discussing it last week and I know I used it in the last week in a clip,
          but cannot find it once again ...

          Don
        • Sheri
          Message 4 of 10, Jun 8, 2008
            ^$GetReSubStrings$ ? It's used after a ^!Find command.
          • Don - HtmlFixIt.com
            Message 5 of 10, Jun 8, 2008
              Right that should do it. Regex find, use getresubstring (how will I
              ever remember this one? -- and why don't I find it if I look for
              variable and regex in help?) then uppercase the right variable and put
              them back in.
            • John Shotsky
              Message 6 of 10, Jun 8, 2008
                Here's the loop I'm using - can this be made faster?

                I only locate those which include a lower case character in the
                find. I use the toolbar function, but I don't know if there's a
                time penalty for this type of loop. There are about 45K lines in
                my typical document, and about 1000 expected 'finds'.

                :Loop
                ^!Find "^Title: .+[a-z]" ARSTW
                ^!IfError GoTo End
                ^!SELECT LINE
                ^!Toolbar "upper case"
                ^!Goto Loop
                :End

                John
              • Sheri
                Message 7 of 10, Jun 8, 2008
                  There is a time penalty for loops but it should be at least twice as fast this way:

                  ^!SetScreenUpdate Off
                  ^!Set %start%=^$GetDate(hh:mm:ss)$
                  ^!StatusShow Working
                  ^!Jump Doc_Start
                  :Loop
                  ^!Find "(?-i)(?<=^Title: ).*[a-z].*$" RS
                  ^!IfError GoTo Finish
                  ^$StrUpper(^$GetSelection$)$
                  ^!Goto Loop
                  :Finish
                  ^!StatusShow
                  ^!Set %endtime%=^$GetDate(hh:mm:ss)$
                  ^!Info ^%start%^%nl%^%endtime%
                  ;end of clip
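
The same idea (case-sensitive match, skip titles that are already upper case, time the run) can be sketched outside NoteTab. The Python below is an illustration under assumptions, not a translation of the clip: the names `NEEDS_UPPER` and `uppercase_titles` are invented, a capture group replaces the look-behind, and the sample data is synthetic.

```python
import re
import time
from typing import Tuple

# Python patterns are case-sensitive by default, comparable to (?-i).
# Only lines whose title text still contains a lower-case letter match.
NEEDS_UPPER = re.compile(r"^(Title: )(.*[a-z].*)$", re.MULTILINE)

def uppercase_titles(text: str) -> Tuple[str, int]:
    """Return the converted text and the number of lines changed."""
    # subn performs all replacements in one pass and reports the count.
    return NEEDS_UPPER.subn(lambda m: m.group(1) + m.group(2).upper(), text)

if __name__ == "__main__":
    # Synthetic sample: one title to convert, one already converted, one body line.
    sample = "Title: a dog's tale\nTitle: ALREADY UPPER\nsome body text\n" * 1000
    start = time.perf_counter()
    converted, changed = uppercase_titles(sample)
    elapsed = time.perf_counter() - start
    print(f"{changed} lines changed in {elapsed:.4f} s")
```

Because the prefix is captured separately and written back unchanged, the result keeps `Title:` in mixed case rather than producing `TITLE:`.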

                  Btw, you previously said you wanted
                  Title: A DOG'S TALE
                  but your loop produces
                  TITLE: A DOG'S TALE
                • John Shotsky
                  Message 8 of 10, Jun 8, 2008
                    Thanks! And you're right, it was uppercasing the Title also. I like your version much better.

                    John
                  • Sheri
                    Message 9 of 10, Jun 8, 2008
                      Just noticed, that ^!IfError shouldn't have the word GoTo in it...

                      It should be:

                      ^!IfError Finish
                    • Alec Burgess
                      Message 10 of 10, Jun 10, 2008
                        On Sun, Jun 8, 2008 at 10:15 PM, Don - HtmlFixIt.com <don@...> wrote:
                        >> ^$GetReSubStrings$ ? Its used after a ^!Find command.
                        >
                        > Right that should do it. Regex find, use getresubstring (how will I
                        > ever remember this one? -- and why don't I find it if I look for
                        > variable and regex in help?) then uppercase the right variable and put
                        > them back in

                        In Help-favorites for Clips I've put the Find-Replace topic because I ALWAYS
                        forget the names of the new regex-associated functions too! (Also the pages
                        that define what all the funny symbols & and * mean, where and when, and what
                        the pre-defined variables and goto targets are.) I ALWAYS have to look for
                        them to use in a script :-(

                        --
                        Regards ... Alec
                        --