
sorting lists with a clip

  • Mike Breiding - Morgantown WV
    Message 1 of 9 , Aug 5, 2007
      Not sure if this is clipable or not.

      I have two email lists.
      I would like to compare the two and end up with a report of addresses which
      are different in each list.

      In other words:
      List 1 has these emails not in list 2
      List 2 has these emails not in list 1

      Do-able?

      Thanks,
      -Mike

      ~~~ PLEASE NOTE ~~~

      My new Email address is: mike@...

      The spammers found me at mike@...
      I have received over 3000 junk mails since January 2007 and the volume is
      growing daily.




    • buralex@gmail.com
      Message 2 of 9 , Aug 7, 2007
        Mike Breiding - Morgantown WV <notetab@...> said on Aug
        05, 2007 16:58 -0400 (in part):
        > Not sure if this is clipable or not.
        ANYTHING is clipable - you mean will someone do it for me :-D
        >
        > I have two email lists.
        > I would like to compare the two and end up with a report of addresses
        > which are different in each list.
        >
        > In other words:
        > List 1 has these emails not in list 2
        > List 2 has these emails not in list 1
        Load list 1 into "NoName01" and list 2 into "NoName02"
        --- List 1 (sample) ---
        line 9 - twice
        line 9 - twice
        line 8

        line 7
        line 6
        line 5.1 only
        line 4


        line 3.1 only
        line 2
        line 10
        line 1
        --- List 2 (sample) ---

        line 1
        line 10
        line 2
        line 3.2 only
        line 4
        line 5.2 only
        line 6
        line 7
        line 8 - 3 of them (scattered)
        line 9
        line 8 - 3 of them (scattered)
        line 8 - 3 of them (scattered)
        ------ gives this output ----
        **** @@@@...@@@@ ****
        line 3.1 only
        line 5.1 only
        line 8
        **** @@@@...@@@@ ****
        line 3.2 only
        line 5.2 only
        line 8 - 3 of them (scattered)
        line 9
        -------------------------------

        Here's the clip - as always watch out for long lines:

        H=Find unmatched lines
        ;Alec Burgess 2007-08-07 04:47:16 (Aug-Tue)
        ; in two files "NoName01.txt" and "NoName02.txt"

        ^!clearvariables

        ^!set %file_1%=NoName01.txt
        ^!set %file_2%=NoName02.txt
        ^!set %file_results%=results.txt
        ^!set %uniqueSuffix%="@@@@"

        ; ready results file
        ^!DestroyDoc ^%file_results%
        ^!Menu File/New
        ^!RenameDoc ^%file_results%

        ; append uniqueSuffix that will not appear
        ; anywhere else in either file
        ; sort the file removing duplicates and
        ; append to results.txt

        ; for i=1 to 2 step 1
        ^!set i=1
        :for_loop
        ^!set %currFile%=^%file_^%i%%
        ^!open ^%currFile%
        ^!set %currSuffix%=^%uniqueSuffix%^%currFile%^%uniqueSuffix%

        ; we need these later
        ^!set %stamp^%i%%=^%currSuffix%

        ;just in case we're running it again while debugging
        ^!replace "^%currSuffix%$" >> "" rwsai

        ; append identifier and sort
        ^!replace "$" >> "^%currSuffix%" rwsai
        ^!menu modify/lines/sort/ascending

        ;remove all but one occurrence of duplicates in curr file
        ^!replace "(.*$\r\n)(\1)+" >> "$1" rwsai

        ^!select ALL
        ^!set %currContents%=^$getselection$
        ^!SetDocIndex ^$GetDocIndex(^%file_results%)$
        ^!jump TEXT_END
        ^%currContents%

        ^!inc i
        ^!if ^%i%<=2 for_loop
        :end_for_loop

        ; should already be focused but just-in-case :-)
        ^!SetDocIndex ^$GetDocIndex(^%file_results%)$
        ^!menu modify/lines/sort/ascending

        ;leave only unique lines
        ^!replace "(.*^%uniqueSuffix%)(.*$)(\r\n)(\1)(.*)\r\n" >> "" rwsai

        ;move suffix to prefix and sort them apart
        ^!replace "(.*?)(^%uniqueSuffix%.*)" >> "$2$1" rwsai
        ^!menu modify/lines/sort/ascending

        ; stamp each section with header
        :Stamp_1
        ^!jump TEXT_START
        ^!replace "(^%stamp1%)" >> "**** $1 ****\r\n$1" rsi
        ^!iferror Next else Stamp_2
        **** ^%stamp1% ****
        :Stamp_2
        ^!replace "(^%stamp2%)" >> "**** $1 ****\r\n$1" rsi
        ^!iferror Next else Stamp_boundary
        ^!jump TEXT_END
        **** ^%stamp2% ****

        :Stamp_boundary
        ;cleanup leading stamps
        ^!replace "(^(^%stamp1%|^%stamp2%))" >> "" rwsai

        Regards ... Alec -- buralex-gmail
        --



      • Flo
        Message 3 of 9 , Aug 7, 2007
          --- In ntb-clips@yahoogroups.com, buralex@... wrote:
          > Load list 1 into "NoName01" and list 2 into "NoName02"
          > --- List 1 (sample) ---...

          Alec,

          When testing your clip with the sample data the result I get is a
          little bit different...

          **** @@@@...@@@@ ****
          line 3.1 only
          line 5.1 only
          line 8
          line 9 - twice <---- not in your output
          **** @@@@...@@@@ ****
          line 3.2 only
          line 5.2 only
          line 8 - 3 of them (scattered)
          line 9
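As a cross-check outside NoteTab, here is a minimal Python sketch of the same comparison. It assumes each list is just a sequence of lines, and that blank lines and repeated lines should be ignored. The function name `only_in_each` is made up for this illustration and does not come from the clip. Run on Alec's sample data, it puts "line 9 - twice" in the first group, which agrees with the result shown just above.

```python
def only_in_each(list1, list2):
    """Return (lines only in list1, lines only in list2), each sorted."""
    # Sets drop repeated lines; the filter drops blank lines.
    set1 = {line for line in list1 if line.strip()}
    set2 = {line for line in list2 if line.strip()}
    return sorted(set1 - set2), sorted(set2 - set1)


# Alec's sample lists, copied from his post (blank lines included).
list1 = [
    "line 9 - twice", "line 9 - twice", "line 8", "", "line 7", "line 6",
    "line 5.1 only", "line 4", "", "", "line 3.1 only", "line 2",
    "line 10", "line 1",
]
list2 = [
    "", "line 1", "line 10", "line 2", "line 3.2 only", "line 4",
    "line 5.2 only", "line 6", "line 7", "line 8 - 3 of them (scattered)",
    "line 9", "line 8 - 3 of them (scattered)",
    "line 8 - 3 of them (scattered)",
]

only1, only2 = only_in_each(list1, list2)
print("Only in list 1:", *only1, sep="\n  ")
print("Only in list 2:", *only2, sep="\n  ")
```

Because the comparison is on whole lines, "line 8" and "line 8 - 3 of them (scattered)" count as different entries, just as they do in the clip.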

          Anyway, I think this is an interesting approach and it's a good
          lesson in clip programming. The problem is that we don't get a list
          of complete e-mails but a list of alphabetically sorted lines.

          Let's test it with the following files (representing "e-mails")...

          (NoName01.txt:)

          Young Roger came tapping at Dolly's window,
          Thumpaty, thumpaty, thump!
          He asked for admittance; she answered him "No!"
          Frumpaty, frumpaty, frump!

          (NoName02.txt:)

          He asked for admittance; she answered him "No!"
          Frumpaty, frumpaty, frump!
          "No, no, Roger, no! as you came you may go!"
          Stumpaty, stumpaty, stump!

          When running your clip on these two files the result is...

          **** @@@@...@@@@ ****
          Thumpaty, thumpaty, thump!
          Young Roger came tapping at Dolly's window,
          **** @@@@...@@@@ ****
          "No, no, Roger, no! as you came you may go!"
          Stumpaty, stumpaty, stump!

          That's correct! Lines 3 and 4 of NoName01.txt don't show up since they
          are contained in NoName02.txt as well. But the first two lines come
          out in reversed order because the output is sorted. So when comparing
          two lists of some hundred e-mails we'll get a rather chaotic output :-(.
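The reordering comes from sorting, not from the comparison itself. A minimal Python sketch (outside NoteTab, with the hypothetical helper name `difference_in_order`) shows that the unmatched lines can be reported in their original order if the second file is only used as a lookup set:

```python
def difference_in_order(lines, other):
    """Lines of `lines` that are absent from `other`, in original order."""
    seen_elsewhere = set(other)
    return [line for line in lines if line not in seen_elsewhere]


noname01 = [
    "Young Roger came tapping at Dolly's window,",
    "Thumpaty, thumpaty, thump!",
    'He asked for admittance; she answered him "No!"',
    "Frumpaty, frumpaty, frump!",
]
noname02 = [
    'He asked for admittance; she answered him "No!"',
    "Frumpaty, frumpaty, frump!",
    '"No, no, Roger, no! as you came you may go!"',
    "Stumpaty, stumpaty, stump!",
]

for line in difference_in_order(noname01, noname02):
    print(line)
```

This still compares single lines, so it does not solve the harder problem of comparing complete multi-line e-mails.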

          By the way...

          I tried the following solution. It's a little bit shorter but it
          achieves the same result (with my sample text). Start with
          NoName01.txt opened and select NoName02.txt when prompted...


          ^!Set %SecondFile%=^?[(T=O;F="Textfiles (*.txt)|*.txt")Choose Second File:]
          ^!Replace "$" >> "_1" AWRS
          ^!Jump Doc_End
          ^!InsertText ^P^$GetFileText("^%SecondFile%")$
          ^!Select All
          ^$StrSort("^$GetSelection$";0;1;0)$
          ^!Replace "^([^\r\n]+)\r\n\1_1\r\n" >> "" AWRS
          ^!SetListDelimiter ^%NL%
          ^!Set %List1%=^$GetDocMatchAll("^[^\r\n]+_1$")$
          ^!Replace "^([^\r\n]+)?_1(\r\n|\z)" >> "" AWRS
          ^!Jump 1
          ^!InsertText Lines in File 2 but not in File 1^P
          ^!InsertText ---------------------------------^P
          ^!Toolbar New Document
          ^!InsertText Lines in File 1 but not in File 2^P
          ^!InsertText ---------------------------------^P
          ^!InsertText ^%List1%
          ^!Replace "_1$" >> "" AWRS


          I think a solution that provides two lists with complete e-mails
          should be based on selected elements like address, subject, date, and
          time. Two or more e-mails are regarded as duplicates if they are
          identical in these fields. E-Mails that have no duplicates are
          separated and exported into two different files etc...

          Certainly, this could be programmed with a NT5-clip. But I wonder if
          it's worth the effort. I have done that job quite often by importing
          e-mails into a text database that compares the e-mails on basis of
          field contents.

          Regards,
          Flo
           
        • buralex@gmail.com
          Message 4 of 9 , Aug 7, 2007
            "Flo" <flo.gehrke@...> said on Aug 07, 2007 8:52 -0400 (in part):
            > --- In ntb-clips@yahoogroups.com, buralex@... wrote:
            >
            >> Load list 1 into "NoName01" and list 2 into "NoName02"
            >> --- List 1 (sample) ---...
            >>
            >
            >
            Flo: Mike said in his original post:
            > I have two email lists.
            > I would like to compare the two and end up with a report of addresses which
            > are different in each list.
            I think each list is of email-addresses ONLY not email-headers+bodies. I
            thought about changing the examples to make it clear they were
            "supposed" to be email addresses but forgot to do so before sending. It
            was late ;-)

            I guess we have to wait for Mike to clarify.

            You are (I think) solving a different (and more complex) problem. For
            yours I would probably use CompareIt! (not freeware, but it has a no-nags
            interface that only restricts editing in place; for that, use WinMerge).
            It's also the only compare program I've seen that will actually locate
            blocks of text that have been relocated, which I believe would solve the
            problem you describe (i.e. comparing two Thunderbird email folder files
            which each contain many individual, not quite identical emails).

            I have a monthly job that involves comparing two lists (in this case of
            URLs). Up to now I've normally done it by using WinMerge to compare the
            two lists, then using its Tool-Patch option to generate a diff-list and
            a quickie clip to throw away all but the NEW in List_A. (The others I
            don't care about.) I'll use this list at EOM to see if it works as
            quickly as I expect.
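The diff-then-filter workflow described above can be approximated without WinMerge. The following is only a sketch under assumptions: it uses Python's standard `difflib` module rather than WinMerge's patch output, the URLs are invented placeholders, and `new_in_second` is a hypothetical helper name. It keeps the entries that the diff marks as added in the second list and discards everything else.

```python
import difflib


def new_in_second(old_lines, new_lines):
    """Return the lines that a line diff reports as added in new_lines."""
    diff = difflib.ndiff(old_lines, new_lines)
    # ndiff prefixes added lines with "+ "; strip that two-character marker.
    return [entry[2:] for entry in diff if entry.startswith("+ ")]


# Placeholder data, not taken from the thread.
last_month = ["https://example.com/a", "https://example.com/b"]
this_month = ["https://example.com/b", "https://example.com/c"]

print(new_in_second(last_month, this_month))
```

Note that a diff is order-sensitive: an entry that merely moved may be reported as removed in one place and added in another, so sorting both lists first is advisable.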

            Regards ... Alec -- buralex-gmail
            --



          • WV-Mike
            Message 5 of 9 , Aug 9, 2007
              At 05:02 AM 8/7/2007 , you wrote:

              >Mike Breiding - Morgantown WV
              ><notetab@...> said on Aug
              >05, 2007 16:58 -0400 (in part):
              > > Not sure if this is clipable or not.
              >ANYTHING is clipable - you mean will someone do it for me :-D

              Yep - you got it!
              I have to leave the heavy thinking up to you all.

              I appreciate everyone's help with this.

              I think I am going to try WinMerge as Alec suggested and see if that does
              the trick.

              Thanks again,
              -mike



              ----------
              Check it out:
              www.EpicRoadTrips.us

              ~~~



              ~~~





            • Don - HtmlFixIt.com
              Message 6 of 9 , Aug 9, 2007
                WV-Mike wrote:
                > At 05:02 AM 8/7/2007 , you wrote:
                >
                >> Mike Breiding - Morgantown WV
                >> <notetab@...> said on Aug
                >> 05, 2007 16:58 -0400 (in part):
                >>> Not sure if this is clipable or not.
                >> ANYTHING is clipable - you mean will someone do it for me :-D
                >
                > Yep - you got it!
                > I have to leave the heavy thinking up to you all.
                >
                > I appreciate everyone's help with this.
                >
                > I think I am going to try WinMerge as Alec suggested and see if that does
                > the trick.
                >
                > Thanks again,
                > -mike
                >
                Mike, you took it like a sport. I might have been offended even with
                the happy face in there. I think it was a fun poke in reality, but
                email is so hard to decipher in nuance sometimes.

                When one first starts clipping, it is sometimes difficult to figure
                out what direction to take. For those of us who have experience (and I
                bet there are few of us who have not benefited from the work and
                contributions of another), the daunting sometimes appears easy, and yet
                I still ask for input regularly. I cannot tell you how many times my long
                complicated clip has been boiled down to three lines with collective
                reason. REGEX still confuses me and yet I now use it thanks to list
                members "do[ing] it for me" and then explaining (more than once) what
                they just did. So

                *for anyone lurking out there,*

                don't be afraid to ask for someone to do it for you or make
                suggestions and explain later what they did. Take the time to then
                understand even a little how they did it if you can and begin the fun
                process of learning clips.

                So my question of the day that relates, as it is string sorting ...

                From help:
                "^$StrSort("Str";CaseSensitive;Ascending;RemoveDuplicates)$ (added in v4.52)
                Returns the specified text "Str" sorted according to the defined
                criteria. NoteTab 5 supports two new values for the CaseSensitive
                parameter: ANSI to enforce a case sensitive dictionary-type sorting
                order and False_ANSI to ignore character case during sorting. The ANSI
                option produces a sorting order that matches the result from the
                Modify/Lines/Sort menu command. Note, however, that sorting is much
                slower with the ANSI option.
                For example, the following instruction will sort the lines contained in
                the Clipboard, in ascending order, ignoring character case, and removing
                duplicates:
                ^$StrSort("^$GetClipboard$";False;True;True)$"

                Jody (what a great notetab man) would often use variables in those
                true/false slots. He might ask yes/no, so it appears in addition to
                true/false you can also use yes/no. Others use 1/0 for those switches.
                I was not aware of yes/no (or had forgotten) until looking at
                noteblock library yesterday.

                I find nothing in help that describes the possible true/false
                indicators. Are there others? Are there compatibility issues with any
                of the three I mentioned here? Do they all actually work in current
                notetab?

                There is my question for the day ... and yes I want someone else to
                answer it ... :-D

                Thank you all,

                Don
              • Flo
                Message 7 of 9 , Aug 9, 2007
                  Alec,

                  Thanks for those hints! If we are dealing with two lists of e-mail
                  addresses only I would propose the following clip that combines some
                  elements of our two clips. It works with any kind of data in a list
                  format as well (like word lists for example). I've tested it with
                  15,000 entries. The job is done within a second and without any
                  fault -- so far ;-)

                  When tested with...

                  list1.txt

                  anthony@...
                  bertha@...
                  carla@...

                  list2.txt

                  bertha@...
                  carla@...
                  donald@...

                  ...the output will be:

                  Contained in list1.txt but not in list2.txt
                  anthony@...

                  Contained in list2.txt but not in list1.txt
                  donald@...

                  Regards
                  Flo

                  (Watch line breaks!)

                  H=Comparing two lists
                  ^!Set %File1%=^?{(T=O;F="Textfiles (*.txt)|*.txt")Choose first file:==^%File1%}; %File2%=^?{(T=O;F="Textfiles (*.txt)|*.txt")Choose second file:==^%File2%}
                  ^!SetScreenUpdate Off
                  ^!Toolbar New Document
                  ^!InsertText ^$GetFileText("^%File1%")$
                  ; Mark all addresses from File1 with a trailing #1
                  ^!Replace "(.)$" >> "$1#1" AWRS
                  ^!Jump Doc_End
                  ^!InsertText ^P^$GetFileText("^%File2%")$
                  ; Remove empty lines at any position
                  ^!Replace "\r\n(?=\r\n)|\A\r\n|\r\n\z" >> "" AWRS
                  ^!Select All
                  ^$StrSort("^$GetSelection$";0;1;0)$
                  ; Reduce two or more duplicates to one
                  ^!Replace "(^[^\r\n]+\r\n)(\1)+" >> "$1" AWRS
                  ; Remove addresses existing in both files
                  ^!Replace "^([^\r\n]+)\r\n\1#1(\r\n|\z)" >> "" AWRS
                  ; Select addresses that exist in File1 only
                  ^!SetListDelimiter ^%NL%
                  ^!Set %List1%=^$GetDocMatchAll("^[^\r\n]+#1$")$
                  ; Remove addresses that exist in File1 only
                  ^!Replace "^([^\r\n]+)#1(\r\n|\z)" >> "" AWRS
                  ^!Jump 1
                  ^!InsertText Contained in ^$GetFileName(^%File2%)$ but not in ^$GetFileName(^%File1%)$^P
                  ^!Toolbar New Document
                  ^!InsertText Contained in ^$GetFileName(^%File1%)$ but not in ^$GetFileName(^%File2%)$^P
                  ^!InsertText ^%List1%
                  ^!Replace "#1$" >> "" AWRS
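The clip above tags every line from the first file, sorts the combined text, collapses duplicates, and then cancels out lines that appear once tagged and once untagged. The same idea can be sketched in Python as a sanity check. This is an illustration only, not a translation of the clip: the name `compare_by_tagging` is hypothetical, and the tag is stored as a separate tuple element instead of a `#1` suffix so that equal lines always end up adjacent after sorting.

```python
from itertools import groupby


def compare_by_tagging(lines1, lines2):
    """Return (entries only in lines1, entries only in lines2)."""
    # Tag each non-empty line with the list it came from (1 or 2).
    tagged = [(line, 1) for line in lines1 if line]
    tagged += [(line, 2) for line in lines2 if line]
    # set() collapses duplicates within one list; sorting groups equal lines.
    tagged = sorted(set(tagged))
    only1, only2 = [], []
    for line, group in groupby(tagged, key=lambda pair: pair[0]):
        tags = [tag for _, tag in group]
        if tags == [1]:
            only1.append(line)
        elif tags == [2]:
            only2.append(line)
        # tags == [1, 2] means the line is in both lists, so it is dropped.
    return only1, only2


# The test data from the post, with the addresses truncated as in the archive.
list1 = ["anthony@...", "bertha@...", "carla@..."]
list2 = ["bertha@...", "carla@...", "donald@..."]

only1, only2 = compare_by_tagging(list1, list2)
print("Contained in list1.txt but not in list2.txt:", only1)
print("Contained in list2.txt but not in list1.txt:", only2)
```

Keeping the tag out of the text avoids any dependence on where the sort places a suffix character relative to letters and digits.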
                • Flo
                  Message 8 of 9 , Aug 10, 2007
                    --- In ntb-clips@yahoogroups.com, "Don - HtmlFixIt.com" <don@...>
                    wrote:
                    > So my question of the day that relates, as it is string sorting ...

                    Don,

                    I couldn't find it in the help file. But testing shows
                    that "yes/1/true" and "no/0/false" have the same effect in NT5.

                    By the way: I think one has to thoroughly watch the effect of the
                    different Case Parameters.

                    For example, with...

                    ^!Select All
                    ^$StrSort("^$GetSelection$";0;1;0)$

                    ...we get:

                    cat
                    dog
                    doggy
                    dog_1
                    mouse

                    With ^$StrSort("^$GetSelection$";1;1;0)$ the result is...

                    cat
                    dog
                    dog_1
                    doggy
                    mouse

                    Obviously, this difference has nothing to do with the case.

                    Unless I'm very much mistaken, "case sensitive yes" also produces a
                    (hidden) lexical sorting. That's why "dog" is followed by "dog_1" in
                    the second result, I suppose. So it's not only the ANSI parameter
                    that, according to the help file, "enforce[s] a case sensitive
                    dictionary-type sorting order".
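Both orderings can be reproduced outside NoteTab, which may help when deciding where a marker such as `_1` will land. The Python sketch below does not show how NoteTab compares strings internally; it only demonstrates, as an assumption worth testing, that a comparison which folds case by uppercasing puts "doggy" before "dog_1", while a plain character-code comparison puts "dog_1" first. The underscore sorts after uppercase letters but before lowercase ones.

```python
words = ["mouse", "dog_1", "doggy", "cat", "dog"]

# Plain character-code comparison: "_" (95) sorts before "g" (103).
by_code_point = sorted(words)

# Case folded by uppercasing: "G" (71) sorts before "_" (95).
by_upper = sorted(words, key=str.upper)

print(by_code_point)
print(by_upper)
```

For a clip that relies on a tagged line directly following its untagged twin, only the first ordering keeps "dog" and "dog_1" adjacent.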

                    There are also differences between ^$StrSort$ and Modify| Lines |
                    Sort (or Toolbar Sort Ascending). The Menu Command is controlled by
                    the setting in Options | Tools I suppose. When choosing "Case
                    Sensitive Sorting" (yes) in the Options, for example, the
                    Menu/Toolbar Command produces...

                    anthony
                    Anthony
                    bertha
                    Bertha

                    With ^$StrSort("^$GetSelection$";1;1;0)$ we get...

                    Anthony
                    Bertha
                    anthony
                    bertha
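The two orderings above correspond to two common comparison styles, sketched here in Python purely for illustration. That NoteTab's menu command and `^$StrSort$` work exactly this way is an assumption, not something the help text confirms. A character-code sort places all uppercase letters before lowercase ones, while a dictionary-type sort groups words regardless of case and then, in this sketch, places the lowercase variant first.

```python
names = ["Bertha", "anthony", "bertha", "Anthony"]

# Character-code order: every uppercase letter sorts before any lowercase one.
code_point_order = sorted(names)

# Dictionary-type order: compare case-insensitively first, then break ties
# so that the lowercase spelling comes before the capitalised one.
dictionary_order = sorted(names, key=lambda s: (s.lower(), s.swapcase()))

print(code_point_order)
print(dictionary_order)
```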

                    To sum it up, I would strongly recommend testing any sorting very
                    carefully before using it in a clip.

                    Regards,
                    Flo
                  • Don - HtmlFixIt.com
                    Message 9 of 9 , Aug 10, 2007
                      Flo wrote:
                      > --- In ntb-clips@yahoogroups.com, "Don - HtmlFixIt.com" <don@...>
                      > wrote:
                      >> So my question of the day that relates, as it is string sorting ...
                      >
                      > Don,
                      >
                      > I couldn't find it in the help file. But testing shows
                      > that "yes/1/true" and "no/0/false" have the same effect in NT5.
                      >
                      > By the way: I think one has to thoroughly watch the effect of the
                      > different Case Parameters.
                      >
                      Flo,

                      You and Sherri take the intelligence cake around here! Your answers are
                      always very helpful. I noticed you used 0/1 in your example clip on
                      sorting yesterday. I find the sorting things you did (and the "off"
                      results of the various combinations) to be most interesting. This
                      should really be documented. I am hopeful that perhaps Eric will see
                      your post and give that issue some thought as it doesn't seem to me that
                      we should be getting different results under each of those
                      circumstances. Some of them obviously are giving different directions,
                      but it appears if you give the same instruction to the menu vs strsort,
                      you get differing results. Most interesting and I thank you immensely
                      for your time.

                      Don