
Re: help with a little project ...

  • joy8388608
    Message 1 of 9 , Jul 21, 2013
      Here is some code to help/get you started...

      As was said, you really should sort the data first by converting dates to yyyy-mm-dd then sorting.
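For anyone checking the result outside NoteTab, here is a minimal Python sketch of the date conversion and sort. It is not clip code. The function name `to_iso` is made up for illustration, the third sample line (the 2012 date) is invented to show why the conversion matters, and two-digit years are assumed to belong to the 2000s.

```python
from datetime import datetime

def to_iso(line: str) -> str:
    """Rewrite 'ID mm-dd-yy hh:mm:ss' as 'ID yyyy-mm-dd hh:mm:ss'."""
    item, date, time = line.split()
    # %y maps 00-68 to 2000-2068, so "13" becomes 2013 (assumption: no older data)
    iso_date = datetime.strptime(date, "%m-%d-%y").strftime("%Y-%m-%d")
    return f"{item} {iso_date} {time}"

lines = [
    "058001e58573 07-13-13 09:39:54",
    "058001e58517 07-13-13 10:05:22",
    "058001e58517 12-31-12 23:59:59",  # invented line, not from Don's data
]

# With ISO dates, a plain text sort orders by ID, then date, then time.
sorted_lines = sorted(to_iso(line) for line in lines)
for line in sorted_lines:
    print(line)
```

Sorting the original mm-dd-yy text would put 07-13-13 before 12-31-12 even though the December date is earlier, which is why the conversion comes first.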

      This code will select each block of duplicates and tell you how many there are. It will not select (report on) any lines where the first 12 chars are unique. That is, only groups of 2 or more lines with the same first 12 chars are found. Not sure whether unique lines can occur in your data.

      To find the elapsed time, you would have to isolate the time in the first selected line and the time in the last selected line, then subtract the first from the last. Easiest would be to convert both to seconds and then subtract.
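The seconds arithmetic can be sketched in Python (again, not clip code). The helper names `to_seconds` and `elapsed_seconds` are hypothetical, the sketch assumes both lines fall on the same day, and the pair of lines in the example is chosen only to show the arithmetic.

```python
def to_seconds(hms: str) -> int:
    """Convert 'hh:mm:ss' to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def elapsed_seconds(first_line: str, last_line: str) -> int:
    """Seconds between the times in two 'ID date hh:mm:ss' lines (same day assumed)."""
    first = to_seconds(first_line.split()[2])
    last = to_seconds(last_line.split()[2])
    return last - first

# Illustrative pair only: 10:05:22 to 10:17:39 is 12 min 17 s.
diff = elapsed_seconds(
    "058001e58517 07-13-13 10:05:22",
    "058001e58517 07-13-13 10:17:39",
)
print(diff)
```

If a group can span midnight, the date would have to be folded into the calculation as well.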

      Hope this quick help helps.

      Joy

      ^!Jump TEXT_START
      :LT
      ; Find and select next group of 2 or more lines that have the same
      ; chars in cols 1-12
      ^!Find "^(.{12}).*\R(\1.*\R)+" R
      ; Stop when no matches found
      ^!IfError END
      ; Tell how many lines selected
      ^!Prompt ^$StrCount("^%NL%";"^$GetSelection$";I)$
      ^!Goto LT
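To cross-check what the clip reports, the same selection rule (runs of 2 or more consecutive lines sharing columns 1-12) can be sketched in Python. This is an independent illustration, not a translation of the clip; `duplicate_blocks` is a made-up name, and the sample is the first five lines of Don's data.

```python
from itertools import groupby

def duplicate_blocks(lines):
    """Yield (key, count) for each run of 2+ consecutive lines sharing cols 1-12."""
    for key, run in groupby(lines, key=lambda line: line[:12]):
        count = sum(1 for _ in run)
        if count >= 2:  # single lines are skipped, as in the clip
            yield key, count

sample = [
    "058001e58517 07-13-13 10:05:22",
    "058001e58517 07-13-13 10:05:22",
    "058001e58573 07-13-13 09:39:54",
    "058001e58573 07-13-13 09:39:54",
    "058001e5861a 07-13-13 09:53:40",
]

blocks = list(duplicate_blocks(sample))
for key, count in blocks:
    print(key, count)
```

Like the clip, this only groups adjacent lines, so the data needs to be sorted first.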



      --- In ntb-clips@yahoogroups.com, Don <don@...> wrote:
      >
      > I have the following type of data:
      > 058001e58517 07-13-13 10:05:22
      > 058001e58517 07-13-13 10:05:22
      > 058001e58573 07-13-13 09:39:54
      > 058001e58573 07-13-13 09:39:54
      > 058001e5861a 07-13-13 09:53:40

      -------- snip -----------

      > 058001e5a812 07-13-13 10:17:39
      > 058001e5a812 07-13-13 10:17:39
      > 058001e5a812 07-13-13 10:17:39
      > 058001e5a812 07-13-13 10:17:39
      > 058001e5a831 07-13-13 10:42:48
      >
      > I want to know how many times each number on the left appears. In a
      > perfect world I would also calculate the amount of time from the first view of that item to the last (times on the right). I have a block on how to start even ...
      >
    • Don
      Message 2 of 9 , Jul 21, 2013
        Very much so. Easy to redo the date as suggested since I make this with
        a regex from raw data. Never even occurred to me to format that way.

        It is possible to have a "single" appearance of the initial code so I'll
        have a look at that.

        On 7/21/2013 12:41 PM, joy8388608 wrote:
        > Here is some code to help/get you started...
        >
      • flo.gehrke
        Message 3 of 9 , Jul 22, 2013
          --- In ntb-clips@yahoogroups.com, Don <don@...> wrote:
          >
          > It is possible to have a "single" appearance of the initial code
          > so I'll have a look at that.

          This clip inserts the frequency at the beginning of each line. Also single occurrences are marked up:

          ^!Jump Doc_Start
          :Loop
          ^!Find "^(\w+).+(\R\1.+)*" RS
          ^!IfError End
          ^!Set %Freq%=^$StrCount("^\d+";"^$GetSelection$";R)$
          ^!Replace "^" >> "^%Freq%:\x20" HRAS
          ^!Goto Loop
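For comparison outside NoteTab, a rough Python sketch that produces the same kind of output (the count, a colon, and a space in front of every line, single occurrences included). It is not equivalent to the clip in one respect: it counts each leading code across the whole list, so it does not depend on the lines being sorted. `prefix_frequencies` is a hypothetical name.

```python
from collections import Counter

def prefix_frequencies(lines):
    """Return each line prefixed with 'N: ', where N is how often its first word occurs."""
    counts = Counter(line.split()[0] for line in lines)
    return [f"{counts[line.split()[0]]}: {line}" for line in lines]

sample = [
    "058001e58517 07-13-13 10:05:22",
    "058001e58517 07-13-13 10:05:22",
    "058001e5861a 07-13-13 09:53:40",
]

marked = prefix_frequencies(sample)
for line in marked:
    print(line)
```

Because no regular expression with a back reference is involved, the size of a group of duplicates does not matter here.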

          Regards,
          Flo
        • flo.gehrke
          Message 4 of 9 , Jul 24, 2013
            --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
            >
            > This clip inserts the frequency at the beginning of each line. Also single occurrences are marked up:
            >
            > ^!Jump Doc_Start
            > :Loop
            > ^!Find "^(\w+).+(\R\1.+)*" RS
            > ^!IfError End
            > ^!Set %Freq%=^$StrCount("^\d+";"^$GetSelection$";R)$
            > ^!Replace "^" >> "^%Freq%:\x20" HRAS
            > ^!Goto Loop

            Joy informed me that the RegEx in this clip will fail in case the list has got 333 or more lines.

            Sorry, I didn't expect Don's list to be that long.

            For me, it works when using a Possessive Quantifier: '^(\w+).+(\R\1.+)*+' or '^(\w+).+(\R\1.++)*'.

            The problem is probably caused by PCRE hitting its recursion limit. For more details, please see what I posted to the Clip Group on 6/20/2012 in message #22824.

            Regards,
            Flo
          • flo.gehrke
            Message 5 of 9 , Jul 24, 2013
              --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
              >
              > Joy informed me that the RegEx in this clip will fail in case the
              > list has got 333 or more lines.

              No, sorry -- it's not the size of the whole list. It pertains to the size of the groups (i.e. duplicate 12-character-strings at start of line) that are matched with back reference...

              Flo
            • Don
              Message 6 of 9 , Jul 24, 2013
                On 7/24/2013 5:26 PM, flo.gehrke wrote:
                > --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
                >>
                >> This clip inserts the frequency at the beginning of each line. Also single occurrences are marked up:
                >>
                >> ^!Jump Doc_Start
                >> :Loop
                >> ^!Find "^(\w+).+(\R\1.+)*" RS
                >> ^!IfError End
                >> ^!Set %Freq%=^$StrCount("^\d+";"^$GetSelection$";R)$
                >> ^!Replace "^" >> "^%Freq%:\x20" HRAS
                >> ^!Goto Loop
                >
                > Joy informed me that the RegEx in this clip will fail in case the list has got 333 or more lines.
                >
                > Sorry, I didn't expect Don's list to be that long.
                >
                > For me, it works when using a Possessive Quantifier: '^(\w+).+(\R\1.+)*+' or '^(\w+).+(\R\1.++)*'.
                >
                > Probably, that problem is caused by a PCRE Error Recursion Limit. For more details, please see what I posted to the Clip Group on 6/20/2012 with message #22824.
                >
                > Regards,
                > Flo

                Truly fascinating. My file will often have thousands of lines actually
                :-) so good you knew the solution. I would have been scratching my head
                for a long time.