Re: [NTS] Changing CR & LF - clip addendum

  • loro
    Message 1 of 16, Jun 1, 2008
      I wrote:
      >If you instead use the disk search (F3) I think it'll work. I don't
      >know regex, so I won't go there, but I know it works when I use the
      >NTP tokens for LF and CRLF, ^L and ^P. No reason it shouldn't work
      >with regexp.

      But you want to do it with a clip, right? I sort of missed that part. :-)

      You could use ^$GetFileText()$ to grab the text from the disk file,
      ^$StrReplace()$ to replace all ^L with ^P and ^!TextToFile to save
      the result. If you use the same file name as the original file it
      will be replaced, so the result will be the same as if you'd used
      ^!Replace. Make the clip reload the document in Notetab (I'm assuming
      the document has been open and focused from start). From there you
      should be able to go on with your clip as planned.
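
For readers who want to try the same read, replace, and write-back idea outside NoteTab, here is a minimal Python sketch. It is not clip code and does not call ^$GetFileText()$, ^$StrReplace()$ or ^!TextToFile; the names `to_crlf` and `convert_file_in_place` are invented for illustration. Unlike a plain LF-to-CRLF replacement, it first reduces every terminator to LF, on the assumption that a file which already contains CRLF should not end up with doubled CRs.

```python
import os
import tempfile


def to_crlf(data: bytes) -> bytes:
    """Return data with every line ending (CRLF, lone CR, lone LF) as CRLF."""
    # Reduce everything to LF first so existing CRLF pairs are not doubled.
    unified = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")


def convert_file_in_place(path: str) -> None:
    """Read the file as raw bytes, convert, and overwrite the same file."""
    with open(path, "rb") as fh:
        original = fh.read()
    with open(path, "wb") as fh:
        fh.write(to_crlf(original))


# Demo on a throwaway file with mixed Unix and Windows endings.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(demo_path, "wb") as fh:
    fh.write(b"one\ntwo\r\nthree\n")
convert_file_in_place(demo_path)
with open(demo_path, "rb") as fh:
    result = fh.read()
os.remove(demo_path)
print(result)
```

Working on bytes in binary mode matters here: opening the file in text mode would let Python translate line endings on its own and hide the very thing being converted.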

      Lotta
    • John Shotsky
      Message 2 of 16, Jun 1, 2008
        I use Regex expressions quite a bit, and line terminations are an issue I deal with a lot.

        NoteTab considers \R to be any combination of line terminators.
        \n is just the line feed.

        So, I convert as follows:
        ^!Replace "\R" >> "\n" ARSTW

        This converts any kind of terminator to a line feed. I do this because I perform a lot of operations with clips on text,
        so \n is my preferred 'holding' format.

        When I exit the clips, I convert back to standard Windows text format:

        ^!Replace "\n" >> "^%NL%" ARSTW
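
The round trip described above (normalize to a bare line feed on entry, process, restore CRLF on exit) can be sketched in Python as an illustration. This is not NoteTab code: Python's `re` module has no `\R`, so the sketch spells it out as `\r\n|\r|\n`, and the function names and the uppercase "processing" step are made up for the example.

```python
import re

# Stand-in for \R: CRLF is listed first so it is matched as one terminator.
ANY_EOL = re.compile(r"\r\n|\r|\n")


def to_holding_format(text: str) -> str:
    """Convert every terminator to a bare LF (the 'holding' format)."""
    return ANY_EOL.sub("\n", text)


def to_windows_format(text: str) -> str:
    """Convert LF back to CRLF; assumes the text holds only bare LFs."""
    return text.replace("\n", "\r\n")


mixed = "alpha\r\nbeta\rgamma\n\ndelta"
held = to_holding_format(mixed)

# Placeholder for whatever the clips do while the text is in holding format.
processed = "\n".join(line.upper() for line in held.split("\n"))

final = to_windows_format(processed)
print(repr(final))
```

Because each terminator is replaced one for one, blank lines survive the round trip.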

        John

        From: ntb-scripts@yahoogroups.com [mailto:ntb-scripts@yahoogroups.com] On Behalf Of loro
        Sent: Sunday, June 01, 2008 1:36 PM
        To: ntb-scripts@yahoogroups.com
        Subject: Re: [NTS] Changing CR & LF

        Art Kocsis wrote:
        >I am trying to make a small clip efficient by using RegEx but am getting
        >nowhere.
        >
        >What I am trying to do is convert any text file (Unix or Windows) to a
        >single spaced
        >windows file, i.e., one using CRLF as a line terminator.
        >
        >My plan was to change all LFs (0Ah), to CRs (0Dh), and then collapse all
        >single and multiple CRs
        >to a single NL (0D0Ah).
        >
        > ^!Replace /l+ >> /r TIWR
        > ^!Replace /r+ >> /n TIWR
        >or
        > ^!Replace /r+ >> /l R
        > ^!Replace /l+ >> /n R
        >
        >However, this does not work as NoteTab (Std or Pro, v4.95), seems to
        >ignore any changes to the LF char. In fact, all it does is complain with
        >"/l+" Not Found.

        I think that Notetab always uses CR/LF when it has a document open.
        If you instead use the disk search (F3) I think it'll work. I don't
        know regex, so I won't go there, but I know it works when I use the
        NTP tokens for LF and CRLF, ^L and ^P. No reason it shouldn't work with regexp.

        Lotta



      • Don - HtmlFixIt.com
        Message 3 of 16, Jun 1, 2008
          Regex in 4.x just isn't up to snuff. Go get 5.x version of notetab.
        • loro
          Message 4 of 16, Jun 1, 2008
            Don wrote:
            >Regex in 4.x just isn't up to snuff. Go get 5.x version of notetab.

            Can Notetab really distinguish between different line endings in an
            open doc, even with regex? If so, what are those mysterious extra
            bytes in the file size on the status bar when you save as Unix or Mac?

            Come to think of it, the easiest way to convert all line endings to
            CR/LF must be to resave the document as DOS/Windows, either manually
            or let the clip temporarily change the ini value. If you have the
            Save As Format set to something else than Windows to start with, that
            is. If you use Windows just resave.

            Lotta
          • John Shotsky
            Message 5 of 16, Jun 1, 2008
              The answer to your first question is yes, definitely. Each line terminator type has its own symbol. \R includes all of
              them. It's all documented in the help system.

              I'm reminded that I've loaded Unicode files without realizing it in the past. When I realized that, I would copy the
              contents, then paste into a new document, and save. The new file is about 50% the size of the original.

              John

            • Art Kocsis
              Message 6 of 16, Jun 3, 2008
                Warning - long post, question at end.

                Well, thanks to you guys and the help files, some progress has been made.
                Probably most of this is quite well known to you all, but I need to document
                what I have learned, and maybe it will do someone else some good.

                I don't know about you but oftentimes with a problem such as this I get very
                frustrated. Seemingly duplicate tests yield different results. It typically
                means something else is going on that is unknown. That is what happened here.

                Lotta, you gave me a clue but I glossed over it until I found it explicitly
                in the RegEx help file:

                "all types of documents are temporarily converted to Windows texts
                for display purposes" [i.e., CRLF for line terminators]

                So all my test cases - using CR (0Dh), LF (0Ah) or CRLF (0D0Ah) as line
                terminators - were identical inside NoteTab. This is true for NoteTab 4.95
                as well as 5.61.

                Test (EOL=0A).txt
                --------------------------------------------------------------
                00000000 24242424 24240A24 24242424 240A2424 $$$$$$?$$$$$$?$$
                00000010 24242424 0A252525 $$$$?%%%

                Test (EOL=0A) Cut & Paste to UE.txt
                --------------------------------------------------------------
                00000000 24242424 24240D0A 24242424 24240D0A $$$$$$??$$$$$$??
                00000010 24242424 24240D0A 252525 $$$$$$??%%%

                So I could load this Unix file, look at it, play with it, add new
                lines and then save it. When I looked at the saved file, it still
                had the LF terminators. However, when I cut and pasted the text
                from NoteTab into a hex editor, I could compare the disk contents
                to the NoteTab contents and see the CRLF terminator replacement.
                The same holds true for a Mac file. So that little mystery is
                resolved.

                Just to be pedantic and unambiguous a Unix file uses a LF (0Ah)
                for its line terminator, a Mac file uses a CR (0Dh) and a Windows
                file uses both CRLF (0D0Ah) in that order.
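
As an illustration of the three conventions, the following Python sketch counts the terminators in raw bytes. The two inputs are the hex dumps shown above (the file on disk and the text pasted out of NoteTab); the function name and the label strings are invented for the example.

```python
import re
from collections import Counter

# Labels for the three terminator byte sequences described above.
EOL_NAMES = {b"\r\n": "Windows (CRLF)", b"\n": "Unix (LF)", b"\r": "Mac (CR)"}


def count_line_endings(data: bytes) -> Counter:
    """Count each terminator type; CRLF is tried first so it counts once."""
    return Counter(EOL_NAMES[m.group()] for m in re.finditer(rb"\r\n|\r|\n", data))


# Bytes from the "Test (EOL=0A).txt" dump (file as stored on disk).
disk_bytes = bytes.fromhex("24242424 24240A24 24242424 240A2424 24242424 0A252525")

# Bytes from the "Cut & Paste to UE" dump (text as held inside NoteTab).
pasted_bytes = bytes.fromhex("24242424 24240D0A 24242424 24240D0A 24242424 24240D0A 252525")

print(count_line_endings(disk_bytes))
print(count_line_endings(pasted_bytes))
```

The disk copy contains three LFs and the pasted copy three CRLFs, which matches the observation that the conversion happens only in the working image.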

                I also discovered that a right click on a tab brings up a pull-down
                menu with a "Save Format" sub-menu where one can specify
                the save format as Windows, Unix, Mac, EBCDIC or Original, as
                well as the ANSI or ASCII character set. That would have been
                nice to notice years ago instead of just now ;(. This sets
                the save mode for that tab only, however. Is there a global default
                setting somewhere? The default now seems to be "Original".

                So back to the RegEx problem. I have to say the RegEx help file
                alone was worth the $10 for the upgrade. [That doesn't mean that
                I know RegEx, but it is by far a LOT clearer and more comprehensive
                than anything else that I found.] I was also struck by how complex
                RegEx and its implementation are. My head is still reeling and I may
                be more confused now than when I started.

                According to the RegEx help file, the NoteTab implementation operates
                in non-utf8 mode. [Someday I will take the time to find out just what
                utf8 is!] And NoteTab defaults to BSR_ANYCRLF mode:

                "NoteTab version 5.61 uses PCRE's newline option of ANYCRLF by default.
                Earlier versions defaulted to CRLF. ... NoteTab also uses by default PCRE's
                BSR_ANYCRLF option, which allows \R (i.e., backslash-R) to match linebreak
                characters related to Windows, Unix and Mac text files."

                As I understand it, the BSR_ANYCRLF mode (BSR is an abbreviation for
                "backslash R") means that \R will match CR, LF, or CRLF line endings but
                not any Unicode ending. I guess this is good but it seems redundant (except
                for disk files), since all line endings are converted to CRLF while editing
                in windows. What I don't understand is the discussion about specifying a
                newline convention via (*CR), (*LF), (*CRLF), (*ANYCRLF) or (*ANY). Is this
                redundant with BSR_ANYCRLF, more fine tuned, or ??? Are these something to
                set globally or in front of every pattern?
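
To make the difference concrete, here is a small Python sketch that emulates what `\R` matches under each setting. It is an approximation, not PCRE itself: Python's `re` has no `\R`, so both behaviours are written as explicit alternations, and the constant names are invented. In PCRE's own documentation the `(*CR)`, `(*LF)`, `(*CRLF)`, `(*ANYCRLF)` and `(*ANY)` prefixes are written at the start of an individual pattern and govern what `^`, `$` and `.` treat as a newline, while the BSR setting governs `\R` only; whether NoteTab passes those prefixes through unchanged is not confirmed in this thread.

```python
import re

# \R under BSR_ANYCRLF: only CR, LF, or the CRLF pair.
R_ANYCRLF = re.compile(r"\r\n|\r|\n")

# \R under BSR_UNICODE: additionally VT, FF, NEL, LINE SEPARATOR
# and PARAGRAPH SEPARATOR.
R_UNICODE = re.compile(r"\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]")

sample = "a\r\nb\nc\rd\u2028e\x85f"

anycrlf_pieces = R_ANYCRLF.split(sample)
unicode_pieces = R_UNICODE.split(sample)

print(anycrlf_pieces)
print(unicode_pieces)
```

Under the ANYCRLF emulation the last piece still contains U+2028 and U+0085, because those are not treated as line breaks; under the Unicode emulation every break splits the text.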

                So it would appear that I could implement my clip to remove all double line
                endings (i.e., to create a single spaced document), by simply replacing every
                run of one or more CRLFs with a single CRLF and then setting the format mode
                to Windows/DOS (or at least saving it as Windows/DOS). Something like:
                CRLF+ >> CRLF

                The RegEx help doesn't say anything about how to use \R so I may have to
                take back some of my praise for the help file. The regular NoteTab help
                helps some.

                However, John's technique does seem to work - at least resulting in single
                spacing.

                ^!Replace "\R+" >> "\n" ARSTW
                ^!Replace "\n" >> "^%NL%" ARSTW

                This seems inefficient, so ...

                ^!Replace "\R+" >> "^%NL%" ARSTW

                also works [once I get the right syntax<g>].

                This seems almost ridiculous - hours and hours of hair pulling and
                frustration for a single line of code!!! Yet this part works much faster
                than the non-RegEx clip.

                So now I am left with the problem of setting the save format mode to Windows.
                I would prefer not actually saving the file, just setting the mode for a
                later save. Lotta, you mentioned something about changing an INI value in a
                clip. What value? Can I make the Windows mode a default or force all saves
                to be Windows mode?

                Thanks again for all your help (and patience for reading all of this!).

                Art
              • John Shotsky
                Message 7 of 16, Jun 3, 2008
                  Art,

                  Nice essay.

                  Just for clarification, my usage doesn't use those two clips in sequence - one is used at the beginning, the other is
                  used at the end. I do all the processing within my clip libraries (which now run to thousands of lines) using \n
                  internally. That is, it is an easy form in which to work within the clips, but they are all changed back at exit.

                  In Options, you can change your 'Save As' format, but it will not really affect the document you have opened - it
                  applies to new documents. It is counterintuitive, but if you open a Unicode document, then save it, you will see that
                  it's still in Unicode regardless of the format you have specified for output. You can see this by looking at the file
                  size. If you copy the contents of that document and paste into a new document, then save, you'll see that the file size
                  has reduced by 50%, as it should. This really threw me, because my clips are not designed to run on Unicode documents,
                  so the clip library would just fail miserably. I was getting Unicode from OCR because of wider support for symbols. So
                  now, if I receive Unicode documents, I immediately convert them by copy/paste/save, and go from there.

                  You can

                • Art Kocsis
                  Message 8 of 16, Jun 3, 2008
                    Hi John,

                    Thanks for the tip. I wonder how many other settings in the options I have
                    not seen all these years. Actually, setting "Options | Documents |
                    Format Save As:" to "windows" DOES work for existing documents (unless you
                    have overridden the setting via a right click). It even works for docs that
                    are already loaded. Try it. Load some docs of various types, change the
                    "save as" mode and then right click on the tabs to verify the "save as"
                    mode. Anyway, what used to take forever (30 seconds or more for large files)
                    is now down to less than a second. RegEx rocks!!! [But I do wish it had
                    easier syntax.]

                    Yes, I knew you did your clip processing between the two replace
                    statements. However, my first attempt at a one liner didn't work so I tried
                    your sequence. Then comparing them I found I had missed a caret before the
                    %NL%. RegEx is nice but it is extremely picky about syntax.

                    Regarding the Unicode, I don't run into Unicode files very often but did
                    discover that copy and save trick. I noticed that "Options | General |
                    Protect Unicode Files" had been checked by default. Would unchecking it
                    eliminate the need to copy and save? I don't have a known Unicode file to
                    check it out.

                    Thanks again, Art

                  • buralex@gmail.com
                    Message 9 of 16, Jun 4, 2008
                      Art Kocsis <artkns@...> said on Jun 03, 2008 23:16 -0400 (in
                      part):
                      > RegEx rocks!!! [But I do wish it had easier syntax.]
                      Use it enough and you'll start trying to enter Regex in the Google toolbar.
                      <tip> it doesn't work :-) </tip>

                      More seriously - if you want to get a better handle on Regex syntax
                      RegexBuddy is the way to go. And it has a forum as helpful as the
                      Notetab group of mail lists. (Fortunately not as active)

                      I use RegexBuddy for almost every non-trivial regex I attempt in Notetab
                      then just paste the "correct" expression back into a ^!Find or ^!Replace
                      statement.

                      Regards ... Alec -- buralex-gmail
                    • Art Kocsis
                      Message 10 of 16, Jun 12, 2008
                        Even though it's kind of embarrassing to display a kludgy clip, I thought I
                        would share this in the hopes that it would inspire other RegEx beginners
                        to learn and use RegEx.

                        Years ago I got tired of all the empty lines in HTML pages that I was editing
                        (largely due to WYSIWYG editors such as FrontPage), so I decided to write a
                        clip to get rid of them. I don't know if this was my first clip or not but
                        it was early. One of my big problems was handling the various line
                        terminators - CR, LF, CRLF - that appeared in the code. I did not learn until
                        last month that they were all converted to CRLF in the working image. Even
                        so, the ^P token did not work consistently so I came up with this scheme. It
                        worked but was quite slow. Finally losing patience with its slowness, I
                        decided to redo the clip using RegEx. As you know, with your help, I was
                        successful. Below, for your amusement/education/motivation are the before &
                        after clips.

                        Three lessons can be learned:

                        1) Even kludges can be made to work and are useful. Keep trying.
                        2) RegEx is quite esoteric yet is conquerable and is extremely efficient.
                        3) We need better documentation, especially a User Guide.

                        Enjoy, Art

                        Note: NoteTab has EXTREMELY picky syntax. When it says "space delimited"
                        that means a SINGLE space - two or more spaces => "syntax error"!

                        ;^!Replace "SearchText" >> "ReplaceText" [Options TCIBGWHRSA]
                        ; W: Whole. Search entire document (not just from the cursor position).
                        ; S: Silent. NoteTab will not display any message box.
                        ; A: All. Replace matched occurrences, not just first one.
                        ;
                        ; "ALT+M" invokes the Modify menu
                        ; "L" invokes the Lines submenu
                        ; "T" trims the selected text


                        ;######### Old, Non-RegEx Clip. Could take 30 sec or more on a 40K file
                        ^!StatusShow Running Single Space
                        ^!Toolbar Select All
                        ^!Keyboard ALT+M L T
                        ^!Replace "^L" >> "^C" WSA
                        ^!Replace "^C^C^C^C^C^C" >> "^C" WSA
                        ^!Replace "^C^C^C^C^C" >> "^C" WSA
                        ^!Replace "^C^C^C^C" >> "^C" WSA
                        ^!Replace "^C^C^C" >> "^C" WSA
                        ^!Replace "^C^C" >> "^C" WSA
                        ^!Replace "^C" >> "^P" WSA
                        ^!Jump Doc_Start


                        ;######### New, RegEx Clip. Takes a fraction of a second on even very large files
                        ^!StatusShow Running Single Space
                        ^!Toolbar Select All
                        ^!Keyboard ALT+M L T
                        ^!Replace "\R+" >> "^%NL%" ARSTW
                        ^!Jump Doc_Start
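
The two clips can be compared side by side with a rough Python emulation. This is a sketch under assumptions, not a statement of how NoteTab executes either clip: it assumes ^L, ^C and ^P stand for LF, CR and CRLF as the thread suggests, that the buffer holds CRLF throughout, and that a replace-all works left to right without rescanning its own output (which is how Python's `str.replace` behaves). The trim step and the function names are left out or invented.

```python
import re


def old_clip(text: str) -> str:
    """Emulate the fixed-string cascade of the non-RegEx clip."""
    text = text.replace("\n", "\r")        # ^L >> ^C: each CRLF becomes CR CR
    for run in (6, 5, 4, 3, 2):            # longest run first, as in the clip
        text = text.replace("\r" * run, "\r")
    return text.replace("\r", "\r\n")      # ^C >> ^P


def new_clip(text: str) -> str:
    """Emulate "\\R+" >> "^%NL%": any run of terminators becomes one CRLF."""
    return re.sub(r"(?:\r\n|\r|\n)+", "\r\n", text)


# Both approaches should agree for documents with 0 to 39 blank lines in a row.
for blank_lines in range(40):
    doc = "top" + "\r\n" * (blank_lines + 1) + "bottom"
    assert old_clip(doc) == new_clip(doc) == "top\r\nbottom"

print("cascade and regex agree on the tested inputs")
```

The cascade needs six full passes over the text and depends on runs being short enough for five fixed-length replacements to reduce them to one, whereas the single pattern handles a run of any length in one pass. The sketch only checks runs of up to 40 line endings.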
                      • John Shotsky
                        Message 11 of 16, Jun 12, 2008
                          Art,

                          ^!Replace "\R+" >> "^%NL%" ARSTW

                          will change your paragraph spacing, because any paragraph separation lines will disappear.
                          If you want to retain a blank line between paragraphs, it should be:

                          ^!Replace "\R" >> "^%NL%" ARSTW
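
The effect of the `+` can be seen in a short Python sketch. As an assumption for illustration only, `\R` is written out as `(?:\r\n|\r|\n)` because Python's `re` does not support `\R`; the function names and sample text are made up.

```python
import re

EOL = r"(?:\r\n|\r|\n)"  # stand-in for \R


def replace_each(text: str) -> str:
    """Like "\\R" >> CRLF: every terminator is converted one for one."""
    return re.sub(EOL, "\r\n", text)


def replace_runs(text: str) -> str:
    """Like "\\R+" >> CRLF: a whole run of terminators becomes one CRLF."""
    return re.sub(EOL + "+", "\r\n", text)


doc = "Para one, line 1\nPara one, line 2\n\nPara two\r\n"

kept = replace_each(doc)
collapsed = replace_runs(doc)

print(repr(kept))
print(repr(collapsed))
```

With the single-terminator pattern the blank line between the paragraphs is kept; with the `+` form it is removed along with every other empty line.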

                          John

                          From: ntb-scripts@yahoogroups.com [mailto:ntb-scripts@yahoogroups.com] On Behalf Of Art Kocsis
                          Sent: Thursday, June 12, 2008 3:55 PM
                          To: NoteTab-Scripts
                          Subject: Re: [NTS] Changing CR & LF

                          Even though it's kind of embarrassing to display a kludgy clip, I thought I
                          would share this in the hopes that it would inspire other RegEx beginners
                          to learn and use RegEx.

                          Years ago, tired of all the empty lines in the HTML pages I was editing
                          (largely due to WYSIWYG editors such as FrontPage), I decided to write a
                          clip to get rid of them. I don't know if this was my first clip or not, but
                          it was early. One of my big problems was handling the various line terminators -
                          CR, LF, CRLF - that appeared in the code. I did not learn until last month
                          that they were all converted to CRLF in the working image. Even so, the ^P
                          token did not work consistently so I came up with this scheme. It worked
                          but was quite slow. Finally losing patience with its slowness, I decided to
                          redo the clip using RegEx. As you know, with your help, I was successful.
                          Below, for your amusement/education/motivation are the before & after clips.

                          Three lessons can be learned:

                          1) Even kludges can be made to work and are useful. Keep trying.
                          2) RegEx is quite esoteric yet is conquerable and is extremely efficient.
                          3) We need better documentation, especially a User Guide.

                          Enjoy, Art

                          Note: NoteTab has EXTREMELY picky syntax. When it says "space delimited"
                          that means a SINGLE space - two or more spaces => "syntax error"!

                          ;^!Replace "SearchText" >> "ReplaceText" [Options TCIBGWHRSA]
                          ; W: Whole. Search entire document (not just from the cursor position).
                          ; S: Silent. NoteTab will not display any message box.
                          ; A: All. Replace matched occurrences, not just first one.
                          ;
                          ; "ALT+M" invokes the Modify menu
                          ; "L" invokes the Lines submenu
                          ; "T" trims the selected text

                          ;######### Old, Non-RegEx Clip. Could take 30 sec or more on a 40K file
                          ^!StatusShow Running Single Space
                          ^!Toolbar Select All
                          ^!Keyboard ALT+M L T
                          ^!Replace "^L" >> "^C" WSA
                          ^!Replace "^C^C^C^C^C^C" >> "^C" WSA
                          ^!Replace "^C^C^C^C^C" >> "^C" WSA
                          ^!Replace "^C^C^C^C" >> "^C" WSA
                          ^!Replace "^C^C^C" >> "^C" WSA
                          ^!Replace "^C^C" >> "^C" WSA
                          ^!Replace "^C" >> "^P" WSA
                          ^!Jump Doc_Start

                          ;######### New, RegEx Clip. Takes a fraction of a second on even very large files
                          ^!StatusShow Running Single Space
                          ^!Toolbar Select All
                          ^!Keyboard ALT+M L T
                          ^!Replace "\R+" >> "^%NL%" ARSTW
                          ^!Jump Doc_Start



                        • Alec Burgess
                          Message 12 of 16 , Jun 12, 2008
                            On Thu, Jun 12, 2008 at 6:54 PM, Art Kocsis <artkns@...> wrote:

                            > ;######### New, RegEx Clip. Takes a fraction of a second on even very large files
                            > ^!StatusShow Running Single Space
                            > ^!Toolbar Select All
                            > ^!Keyboard ALT+M L T
                            > ^!Replace "\R+" >> "^%NL%" ARSTW
                            > ^!Jump Doc_Start
                            >

                            Just a couple of comments Art:
                            As always, there are many ways to skin the cat in NoteTab clips. I usually
                            prefer them in this order:

                            - Native command (looking for a native command in Clip Help assists in
                            learning others I may not have even realized exist)
                            - Menu Command
                            - Toolbar Command (actually I never use these - I think ALL Toolbar
                            commands are available from menu.)
                            - Keyboard Command - avoid like the plague: hard to figure out what they
                            do, and may require Waits to work correctly (however, they ARE necessary when
                            trying to drive an external window from within NoteTab)

                            so:

                            > ^!Toolbar Select All
                            > ^!Keyboard ALT+M L T
                            >
                            would be:

                            > ^!select ALL
                            > ; or optionally ^!Menu Edit/"Select All";
                            > ^!Menu Modify/Lines/Trim Blanks


                            What say the rest of the regular contributors?
                            --
                            Regards ... Alec
                            --


                          • Art Kocsis
                            Message 13 of 16 , Jun 12, 2008
                              Thank you for your interest John, but my clip does exactly what I want
                              to do: change an entire document to single spacing, i.e., change every
                              instance of one or more consecutive newlines to a single newline. Your
                              clip is, in essence, a null clip as it just replaces each instance of a
                              newline with a newline. Compare the two expressions operating on a
                              multiply spaced document.
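The comparison suggested above can be tried in a small Python sketch. This is an illustration under assumptions, not NoteTab itself: Python's `re` has no `\R`, so the hypothetical `LINE_BREAK` pattern below stands in for it, `"\r\n"` stands in for `^%NL%`, and the function names are made up for the example.

```python
import re

# Assumed stand-in for \R: one CRLF, CR, or LF terminator.
LINE_BREAK = r"(?:\r\n|\r|\n)"


def collapse_breaks(text: str) -> str:
    """Like replacing "\\R+": each run of terminators becomes one CRLF."""
    return re.sub(LINE_BREAK + "+", "\r\n", text)


def normalize_breaks(text: str) -> str:
    """Like replacing "\\R": each terminator becomes CRLF, blank lines stay."""
    return re.sub(LINE_BREAK, "\r\n", text)


# A multiply spaced sample with mixed terminators.
sample = "one\r\n\r\n\r\ntwo\nthree"

if __name__ == "__main__":
    print(repr(collapse_breaks(sample)))   # 'one\r\ntwo\r\nthree'
    print(repr(normalize_breaks(sample)))  # 'one\r\n\r\n\r\ntwo\r\nthree'
```

On this sample the `+` version removes the blank lines, while the version without `+` keeps every blank line and only makes the terminators uniform.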

                              Namaste', Art


                              At 6/12/2008 04:02 PM, you wrote:
                              >Art,
                              >
                              >^!Replace "\R+" >> "^%NL%" ARSTW
                              >
                              >Will change your paragraph spacing, because any paragraph separation lines
                              >will disappear.
                              >If you want to retain a blank line between paragraphs, it should be:
                              >
                              >^!Replace "\R" >> "^%NL%" ARSTW
                              >
                              >John
                              >
                              >From: ntb-scripts@yahoogroups.com On
                              >Behalf Of Art Kocsis
                              >Sent: Thursday, June 12, 2008 3:55 PM
                              >To: NoteTab-Scripts
                              >Subject: Re: [NTS] Changing CR & LF
                              >
                              >Even though it's kind of embarrassing to display a kludgy clip, I thought I
                              >would share this in the hopes that it would inspire other RegEx beginners
                              >to learn and use RegEx.
                              >
                              >Years ago, tired of all the empty lines in the HTML pages I was editing
                              >(largely due to WYSIWYG editors such as FrontPage), I decided to write a
                              >clip to get rid of them. I don't know if this was my first clip or not, but
                              >it was early. One of my big problems was handling the various line terminators -
                              >CR, LF, CRLF - that appeared in the code. I did not learn until last month
                              >that they were all converted to CRLF in the working image. Even so, the ^P
                              >token did not work consistently so I came up with this scheme. It worked
                              >but was quite slow. Finally losing patience with its slowness, I decided to
                              >redo the clip using RegEx. As you know, with your help, I was successful.
                              >Below, for your amusement/education/motivation are the before & after clips.
                              >
                              >Three lessons can be learned:
                              >
                              >1) Even kludges can be made to work and are useful. Keep trying.
                              >2) RegEx is quite esoteric yet is conquerable and is extremely efficient.
                              >3) We need better documentation, especially a User Guide.
                              >
                              >Enjoy, Art
                              >
                              >Note: NoteTab has EXTREMELY picky syntax. When it says "space delimited"
                              >that means a SINGLE space - two or more spaces => "syntax error"!
                              <snip>
                              >;######### New, RegEx Clip. Takes a fraction of a second on even very large files
                              >^!StatusShow Running Single Space
                              >^!Toolbar Select All
                              >^!Keyboard ALT+M L T
                              >^!Replace "\R+" >> "^%NL%" ARSTW
                              >^!Jump Doc_Start
                            • Art Kocsis
                              Message 14 of 16 , Jun 12, 2008
                                Hi Alec,

                                This says it all: "learning others I may not have even realized exist"

                                I had originally coded the Trim as:

                                ^!Keyboard ALT+M O &100 L O &100 T

                                But it didn't work so I did the

                                ^!Keyboard ALT+M L T

                                which did work, but is esoteric.

                                After learning about ^!Toolbar, I tried the following, as it would be self-documenting:

                                ^!Toolbar Trim Blanks

                                but that didn't work because, as you know, it's not on a toolbar! So
                                I gave up and went back to what worked.

                                So thanks for the tip. I didn't know about the ^!Select ALL or ^!Menu
                                until now. It is much better and is what I had wanted to do all along.

                                Getting back to the Clip Help file: it is huge, and I find I spend a great
                                deal of time searching it, even for stuff I have seen before, let alone
                                commands or techniques of which I am not aware. There are not
                                enough internal links, and the organization is frequently not the way
                                that I think. To find all these tidbits I would need to read the file end
                                to end a few times, which would take forever. In the meantime, I am
                                collecting the various command names that I use or that look promising
                                into a single sorted file. Maybe that will help. Thankfully there is this list,
                                where I reap the benefit of other eyes reading the help file, like

                                "trying to drive an external window from within Notetab" ?? Was ist??

                                Thanks again, Art


                                At 6/12/2008 08:39 PM, you wrote:
                                >On Thu, Jun 12, 2008 at 6:54 PM, Art Kocsis
                                ><artkns@...> wrote:
                                >
                                > > ;##### New, RegEx Clip. Takes a fraction of a second on even very large files
                                > >
                                > > ^!StatusShow Running Single Space
                                > > ^!Toolbar Select All
                                > > ^!Keyboard ALT+M L T
                                > > ^!Replace "\R+" >> "^%NL%" ARSTW
                                > > ^!Jump Doc_Start
                                > >
                                >Just a couple of comments Art:
                                >As always, there are many ways to skin the cat in NoteTab clips. I usually
                                >prefer them in this order:
                                >
                                >- Native command (looking for a native command in Clip Help assists in
                                >learning others I may not have even realized exist)
                                >- Menu Command
                                >- Toolbar Command (actually I never use these - I think ALL Toolbar
                                >commands are available from menu.)
                                >- Keyboard Command - avoid like the plague: hard to figure out what they
                                >do, and may require Waits to work correctly (however, they ARE necessary when
                                >trying to drive an external window from within NoteTab)
                                >
                                >so:
                                >
                                > > ^!Toolbar Select All
                                > > ^!Keyboard ALT+M L T
                                > >
                                >would be:
                                >
                                > > ^!select ALL
                                > > ; or optionally ^!Menu Edit/"Select All";
                                > > ^!Menu Modify/Lines/Trim Blanks
                                >
                                >what say the rest of the regular contributors?
                                >--
                                >Regards ... Alec