Re: [NTS] Sorting Numbers and No-Sort: Remove Dups/Trips...

  • Sheri
    Message 1 of 26, Nov 9, 2002
      Hi Alan,

      I struggled and finally got the following working -- uses a pm module
      called Sort::Fields. Sample is first sorted as I think Jody was
      wanting it (chapter and verse treated as numeric), then the
      duplicates are removed. I think I've found a Perl way of removing
      duplicate lines from an unsorted input, but haven't tried to
      implement it.

      Sheri

      Sample Input:
      ChapterOne 13:15
      ChapterOne 13:3
      ChapterOne 14:3
      ChapterOne 15:11
      ChapterOne 15:13
      ChapterOne 15:18
      ChapterOne 15:5-6
      ChapterOne 15:6
      ChapterOne 15:8
      ChapterOne 16:23
      ChapterOne 16:27
      ChapterOne 16:4
      ChapterOne 17:28
      ChapterOne 18:7
      ChapterOne 120:2
      ChapterTwo 126:2
      ChapterOne 140:3
      ChapterOne 141:3
      ChapterOne 14:3
      ChapterTwo 126:2
      ChapterOne 15:5-6

      Sample Output:
      ChapterOne 13:3
      ChapterOne 13:15
      ChapterOne 14:3
      ChapterOne 15:5-6
      ChapterOne 15:6
      ChapterOne 15:8
      ChapterOne 15:11
      ChapterOne 15:13
      ChapterOne 15:18
      ChapterOne 16:4
      ChapterOne 16:23
      ChapterOne 16:27
      ChapterOne 17:28
      ChapterOne 18:7
      ChapterOne 120:2
      ChapterOne 140:3
      ChapterOne 141:3
      ChapterTwo 126:2

      H="_Perl Field Sorter and Dupe Remover"
      # Read the whole input into an array (memory intensive for large files)
      @lines = <>;
      use Sort::Fields;
      # fieldsort's first parameter is the pattern for field delimiters;
      # fields 2 and 3 (chapter and verse) are compared numerically
      @lines = fieldsort '[\s\:\-]', [1, '2n', '3n'], @lines;
      # Keep only unique entries; this relies on the array being sorted
      $prev = 'nonesuch';
      @lines = grep($_ ne $prev && (($prev) = $_), @lines);
      # Finally, print the array
      print @lines;

      H="Run Perl Sorter and Dupe Remover"
      ^!Jump Doc_Start
      ^!RunPerl Perl Field Sorter and Dupe Remover
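
For readers who do not have Perl installed, here is a minimal sketch of the same two steps in Python: sort with chapter and verse compared as numbers, then drop duplicates that have become adjacent. It is not Sheri's code. The names sort_key and sort_and_dedupe are made up for illustration, and it assumes every line looks like "Book chapter:verse". The full line is added as a last tie-breaker so that identical lines are guaranteed to end up next to each other.

```python
import re

def sort_key(line):
    # Split on whitespace, colon, or hyphen, like the '[\s\:\-]' pattern.
    fields = [f for f in re.split(r"[\s:\-]", line.strip()) if f]
    # Field 1 as text, fields 2 and 3 as numbers; the line itself breaks ties.
    return (fields[0], int(fields[1]), int(fields[2]), line)

def sort_and_dedupe(lines):
    ordered = sorted(lines, key=sort_key)
    result = []
    prev = None
    for line in ordered:
        # In sorted input, a duplicate always directly follows its twin.
        if line != prev:
            result.append(line)
        prev = line
    return result
```

Called on a list of lines such as ["ChapterOne 13:15", "ChapterOne 13:3"], it returns them with 13:3 ahead of 13:15, which a plain text sort would not do.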
    • Sheri
      Message 2 of 26, Nov 9, 2002
        Here's the Perl non-sorted removal of duplicate lines in an array:

        H="_Perl Remove Dupes without Sorting"
        @in = <>;
        undef %saw;
        @out = grep(!$saw{$_}++, @in);
        print @out;

        H="Run Perl Remove Dupes without Sorting"
        ^!Jump Doc_Start
        ^!RunPerl Perl Remove Dupes without Sorting

        Same input as previous message gets this output:

        ChapterOne 13:15
        ChapterOne 13:3
        ChapterOne 14:3
        ChapterOne 15:11
        ChapterOne 15:13
        ChapterOne 15:18
        ChapterOne 15:5-6
        ChapterOne 15:6
        ChapterOne 15:8
        ChapterOne 16:23
        ChapterOne 16:27
        ChapterOne 16:4
        ChapterOne 17:28
        ChapterOne 18:7
        ChapterOne 120:2
        ChapterTwo 126:2
        ChapterOne 140:3
        ChapterOne 141:3

        Keep in mind, I found these solutions, but I don't particularly know
        why they work :D
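
On why the no-sort version works: in the grep, $saw{$_}++ returns the count from before the increment, so it is 0 (false) the first time a line is seen and the ! turns that into true; every later copy gets a non-zero count and is filtered out. A rough Python sketch of the same idea follows. The function name remove_dupes_keep_order is hypothetical, and a set stands in for the %saw hash.

```python
def remove_dupes_keep_order(lines):
    seen = set()   # plays the role of %saw
    out = []
    for line in lines:
        # Keep a line only the first time it turns up.
        if line not in seen:
            seen.add(line)
            out.append(line)
    return out
```

Because nothing is ever sorted, the surviving lines stay in their original order, which matches the output shown above.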
      • Jody
        Message 3 of 26, Nov 19, 2002
          Hi Harv,

          >> H="FixedRegExp4Jody"
          >> ^!Replace "{\W+}\s{\d+:\d+}" >> "\2\s\1" AIRSW
          >> ^!Replace "{\d+}:{\d+}" >> "00\1:00\2\s" AIRSW
          >> ^!Replace "\d*{\d\d\d}:\d*{\d\d\d}\s" >> "\1:\2" AIRSW
          >> ^!Select All
          >> ^!InsertText ^$StrSort("^$GetText$";False;True;True)$
          >> ^!Replace "0*{[1-9]\d*}:0*{[1-9]\d*}" >> "\1:\2" AIRSW
          >> ^!Replace "{\d+:\d+}\s{\W+}" >> "\2\s\1" AIRSW

          >With either clip I wind up with a totally sorted, 1,600 line
          >pasteboard file.
          >
          >Jody will be able to figure out what to do but it may be
          >confusing to newer users.

          For my Clips, the user doesn't need to know anything, well, they
          need to know how to execute a Clip. <g> Once I get it all
          combined I'll post it to the list. I still need to add a blank
          line in between the sets of books. I think I can handle that, at
          least 1 out of 3 times. <g>

          So, in short, words and/or phrases are searched for via
          Ransack. Since there will be duplicates found when searching for
          THIS or THIS (apple + orange), they need to be removed and the
          lines sorted like the Clip above does, in a real numbering
          method. Then the Clip to put it in biblical order is run, and
          lastly the blank lines are added in between groups of books.
          That will all be done in one Clip once I get it all sorted out.

          Thanks much!
          -jody
        • Jody
          Message 4 of 26, Nov 19, 2002
            Hi Sheri, Hugo...

            >> If you change one "ChapterOne" in your sample input to
            >> "ChapterTwo", you get the sorting mixed up - I guess because
            >> you followed Jody's wording too much. You sort on the numbers
            >> only... Where I sort on the beginning of the lines... I don't
            >> know what Jody really needed.

            Thanks for doing this for me.

            What I did not tell you, and what should make more sense of it
            now, is that it was fine if the *books* got out of order.
            Actually, they already are out of order at this point in the
            process; they are just in the Windows sorting. I already had the
            Clip made with the array to put the books in Biblical order, so
            that is why I mainly needed the chap:vs sorted. It's just that
            the chapter:verse numbers were in Windows sorting instead of a
            real numerical order.

            I haven't looked at the Perl scripts yet, but hopefully will.
            Then folks can have an option to download the huge Perl engine or
            use NoteTab's code, or should I say Sheri's. ;) If another one
            was made that worked, thanks. Sheri, I have to figure out why
            your sorting clip to remove duplicates bring it up in an outline
            does not change the text in the document. It deletes the doubles
            but sorts the lines. The outline output does not. It is the one
            you were showing the line number thing.

            Anyhow, the following works; I just need to merge it and make
            some adjustments as needed.

            Thanks again!

            >haha, seems I have too many lines in my clip. Comment out the first
            >and last lines and it sorts by Title as well as Chapter:Verse.
            >
            >FixedRegExp4Jody
            >;^!Replace "{\W+}\s{\d+:\d+}" >> "\2\s\1" AIRSW
            >^!Replace "{\d+}:{\d+}" >> "00\1:00\2\s" AIRSW
            >^!Replace "\d*{\d\d\d}:\d*{\d\d\d}\s" >> "\1:\2" AIRSW
            >^!Select All
            >^!InsertText ^$StrSort("^$GetText$";False;True;True)$
            >^!Replace "0*{[1-9]\d*}:0*{[1-9]\d*}" >> "\1:\2" AIRSW
            >;^!Replace "{\d+:\d+}\s{\W+}" >> "\2\s\1" AIRSW
            >;end of clip


            Happy Script'n!
            Jody Adair

            UnSubscribe, Options
            mailto:ntb-Scripts-UnSubscribe@yahoogroups.com
            http://groups.yahoo.com/group/ntb-scripts

            The NoteTabbers Assistant Page
            http://www.notetab.net
          • Sheri
            Message 5 of 26, Nov 19, 2002
              --- In ntb-scripts@y..., Jody <av1611@e...> wrote:
              > Sheri, I have to figure out why
              > your sorting clip to remove duplicates bring it up in an outline
              > does not change the text in the document. It deletes the doubles
              > but sorts the lines. The outline output does not. It is the one
              > you were showing the line number thing.

              Hi Jody,

              Welcome back :)

              Are you describing something it should do but isn't, or something
              it does that you don't understand?

              This was a while ago, so bear with me. I do recall that that clip
              gave two options for how the output would look. If you want one of
              each you have to run it twice and save the two variations.

              Regards,
              Sheri
            • Jody
              Message 6 of 26, Nov 20, 2002
                Hi Sheri,

                >> Sheri, I have to figure out why your sorting clip to remove
                >> duplicates bring it up in an outline does not change the text
                >> in the document. It deletes the doubles but sorts the lines.
                >> The outline output does not. It is the one you were showing the
                >> line number thing.
                >
                >Welcome back :)

                Thx!

                >Are you describing something it should do but isn't, or
                >something it does that you don't understand?

                I peeked at it again. One problem was you did not turn word wrap
                off in the copy you made. I have yet to figure out why you made
                temp.otl, but the text used as headers is correct, so I just need
                to, oh, look at it some more. <g> I think I know what you were
                doing off the top of my head, pre-guessing "you."

                Thanks for your help. I'm pretty sure I can take it from here...

                What I do not understand is when the lines are numbered appending
                the number to the end of the file how that can remove dups w/o
                sorting. Each line ends differently. Oh, I think it is coming
                back now, hold on, yes... StrSize out to the end of the line w/o
                the number. :) Claes G., Michael G., or Wayne VW showed me how
                to do that a number of years ago. I know I have the Clip
                somewhere amongst my zillions of them. <g>

                If I remember correctly, I think your short RegExp Clip does it
                all correctly, but I will certainly learn again the NoSort,
                RemDups.

                Happy Script'n!
                Jody Adair

              • Sheri
                Message 7 of 26, Nov 20, 2002
                  --- In ntb-scripts@y..., Jody <av1611@e...> wrote:

                  > I peeked at it again. One problem was you did not turn word wrap
                  > off in the copy you made. I have yet to figure out why you made
                  > temp.otl, but the text used as headers is correct, so I just need
                  > to, oh, look at it some more. <g> I think I know what you were
                  > doing off the top of my head, pre-guessing "you."
                  >

                  The sample I used had such short lines, word-wrap never occurred
                  to me :)

                  > What I do not understand is when the lines are numbered appending
                  > the number to the end of the file how that can remove dups w/o
                  > sorting. Each line ends differently. Oh, I think it is coming
                  > back now, hold on, yes... StrSize out to the end of the line w/o
                  > the number. :) Claes G., Michael G., or Wayne VW showed me how
                  > to do that a number of years ago. I know I have the Clip
                  > somewhere amongst my zillions of them. <g>

                  temp.otl sounds like something I should have deleted at the end of
                  the clip. It's coming back to me now. I saved the line numbers in an
                  outline, sorted the original file to remove duplicates, then
                  retrieved the line numbers from the outline so I could re-sort back
                  to the order it had prior to the duplicate-removing sort. I doubt I
                  used any StrSize type tests.

                  Regards,
                  Sheri
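
The number, sort, dedupe, re-sort approach described above can be sketched as follows in Python. This is only an illustration of the idea, not the clip itself: the clip kept its line numbers in an outline file (temp.otl), while this sketch keeps them in memory, and the function name dedupe_via_line_numbers is hypothetical.

```python
def dedupe_via_line_numbers(lines):
    # Step 1: remember each line's original position.
    numbered = [(text, number) for number, text in enumerate(lines, start=1)]
    # Step 2: sort by text so duplicates sit next to each other
    # (the earliest copy of each line comes first within its group).
    numbered.sort()
    # Step 3: keep only the first line of every run of equal text.
    kept = []
    prev_text = None
    for text, number in numbered:
        if text != prev_text:
            kept.append((number, text))
        prev_text = text
    # Step 4: sort by the saved line number to restore the original order.
    kept.sort()
    return [text for number, text in kept]
```

The result is the same as a direct no-sort duplicate removal; the detour through sorting only matters when the tool at hand can remove duplicates only from sorted text.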
                • Jody
                  Message 8 of 26, Nov 20, 2002
                    Hi Sheri,

                    >temp.otl sounds like something I should have deleted at the end
                    >of the clip. It's coming back to me now. I saved the line
                    >numbers in an outline, sorted the original file to remove
                    >duplicates, then retrieved the line numbers from the outline so I
                    >could re-sort back to the order it had prior to duplicates-
                    >removing sort.

                    Oh, I got it. I understand the OTL now and why you made the
                    full line the headers. <g> That is an interesting method. I
                    haven't seen it before.

                    >I doubt I used any StrSize type tests.

                    I had only glanced before. You did use it, but not for sorting;
                    instead for StrFill of the zeros.

                    ^!Select Line
                    ^!Set %Row%=^$GetRow$
                    ^!InsertSelect ^$StrFill(0;^$Calc(6-^$StrSize(^%Row%)$)$)$^%Row%#^$GetSelection$
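
As a rough illustration of what that InsertSelect line builds, here is a small Python sketch: the row number padded with leading zeros to six characters, a # separator, then the line's text. The function name tag_with_row and the width parameter are made up for the example. Padding is what lets a plain text sort put row 9 ahead of row 10.

```python
def tag_with_row(row, text, width=6):
    # Prefix the text with its row number, zero-padded to a fixed width.
    digits = str(row)
    padding = "0" * (width - len(digits))  # empty if the number is already wide enough
    return padding + digits + "#" + text
```

With a width of six, row numbers sort correctly as text up to 999999 lines.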

                    Thanks for your time... ;)

                    See ya in the funnies!
                    Jody

                    ...he that is of a merry heart hath a continual feast...
                    http://www.clean-funnies.com

                    If you haven't laughed at yourself today,
                    you missed a good joke!
                  • Sheri
                    Message 9 of 26, Nov 20, 2002
                      --- In ntb-scripts@y..., Jody <av1611@e...> wrote:

                      > ^!Select Line
                      > ^!Set %Row%=^$GetRow$
                      > ^!InsertSelect ^$StrFill(0;^$Calc(6-^$StrSize(^%Row%)$)$)$^%Row%#^$GetSelection$
                      >

                      Hahaha

                      I meant to ask you if I was allowing for enough line numbers lol.

                      Regards,
                      Sheri
                    • Jody
                      Message 10 of 26, Nov 22, 2002
                        Hi Sheri,

                        I found a minor bug in your RegExp sorting script. I'll post it
                        a bit later once I get the rest of it going, but it might be
                        a while till I can get to it. I've been getting about 100-150
                        support messages a day since the newsletters went out.

                        Found it quick enough. ;) I think the line above the
                        commented one needed to be split so that it stripped the
                        zeros on the left side of the colon first and then on the
                        right side. The way it was, it was not stripping the zeros
                        in front of numbers on the right side: :001...009. As in,
                        say, Psalms 150:002, Psalms 150:006 - should be 150:1[-6].
                        Thanks again, Sheri. ;) I'll have to find some time later
                        to crank up the Perl scripts offered.

                        FixedRegExp4Jody
                        ;^!Replace "{\W+}\s{\d+:\d+}" >> "\2\s\1" AIRSW
                        ^!Replace "{\d+}:{\d+}" >> "00\1:00\2\s" AIRSW
                        ^!Replace "\d*{\d\d\d}:\d*{\d\d\d}\s" >> "\1:\2" AIRSW
                        ^!Select All
                        ^!InsertText ^$StrSort("^$GetText$";False;True;True)$
                        ^!Replace "0*{[1-9]\d*}:0*{[1-9]\d*}" >> "\1:\2" AIRSW
                        ;^!Replace "{\d+:\d+}\s{\W+}" >> "\2\s\1" AIRSW

                        Reorder Chap:Verse
                        ^!Replace "{\w+}\s{\d+:\d+}" >> "\2\s\1" AIRSW
                        ^!Replace "{\d+}:{\d+}" >> "00\1:00\2\s" AIRSW
                        ^!Replace "\d*{\d\d\d}:\d*{\d\d\d}\s" >> "\1:\2" AIRSW
                        ^!Select All
                        ^$StrSort("^$GetText$";0;1;1)$
                        ^!Replace ":0+" >> ":" AIRSW
                        ^!Replace "0+{[1-9]\d*}:" >> "\1:" AIRSW
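
The pad, sort, strip idea behind these clips can be sketched in Python as below. This is an approximation for illustration, not a translation of the clip: it leaves the book name in front (as when the first and last lines are commented out), it pads with number formatting instead of the clip's add-zeros-then-trim replaces, and the function names pad, strip_zeros and numeric_sort are hypothetical. The zeros are stripped in two separate steps, one for each side of the colon.

```python
import re

def pad(line):
    # "13:3" -> "013:003": zero-pad chapter and verse to three digits
    # so that a plain text sort orders them numerically.
    return re.sub(
        r"(\d+):(\d+)",
        lambda m: f"{int(m.group(1)):03d}:{int(m.group(2)):03d}",
        line,
    )

def strip_zeros(line):
    # Right side of the colon: drop leading zeros, keep at least one digit.
    line = re.sub(r":0+(?=\d)", ":", line)
    # Left side of the colon: drop leading zeros in front of the chapter.
    return re.sub(r"\b0+([1-9]\d*):", r"\1:", line)

def numeric_sort(lines):
    return [strip_zeros(padded) for padded in sorted(pad(line) for line in lines)]
```

A range such as 15:5-6 survives the round trip, since only the digits directly around the colon are padded and stripped.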

                        Happy Script'n!
                        Jody Adair
