
RE: [Clip] What command for Count Occurences?

  • Martyn Folkes
    Message 1 of 4 , Mar 1 4:14 AM
      This should do what you want (it is 3 lines):

      ^!Set %string%=^?{String to count}; %filename%=^?{(T=O)Filename}
      ^!Set %count%=^$StrCount("^%string%";"^$GetFileText("^%filename%")$";N;N)$
      ^!Prompt There are ^%count% occurrences of your search string.
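
For readers without NoteTab, the same count can be sketched in Python. This is a rough equivalent, not part of the original clip: the function name `count_occurrences` and the `latin-1` default encoding are assumptions, the match is case-sensitive, and the sketch assumes the search string never spans a line break (which holds for a header line).

```python
def count_occurrences(path, needle, encoding="latin-1"):
    """Count non-overlapping occurrences of `needle` in the file at `path`.

    The file is read one line at a time, so a 100 MB file is never held
    in memory all at once. Assumes `needle` contains no line break.
    """
    if not needle:
        raise ValueError("needle must not be empty")
    total = 0
    # newline="" keeps the original line endings untouched.
    with open(path, "r", encoding=encoding, newline="") as handle:
        for line in handle:
            total += line.count(needle)
    return total
```

Calling `count_occurrences("spool.txt", "HEADER")` would return the number of matches; both the file name and the search string here are placeholders.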

      If you only want to split the file into 2, it may be easiest to open the file
      and cut and paste half of it into a new file.

      Martyn


      > -----Original Message-----
      > From: bobbit_singapore [mailto:dick.gascoigne@...]
      > Sent: 01 March 2002 07:25
      > To: ntb-clips@yahoogroups.com
      > Subject: [Clip] What command for Count Occurences?
      >
      >
      > Using NoteTab Pro 4.86d:
      >
      > I have fairly large files (30 - 100 MB; 500K - 1,500K lines), which
      > contain multiple occurrences of a string. I would like to know the
      > number of occurrences, using a clip. The Replace command does not
      > have a "Count Occurrences" option. What command(s) should I use?
      >
      > BTW:
      > The String I'm searching for represents the header line which starts a
      > new document, of which there are several thousand in a file. My end
      > objective is to divide the file into 'N' files, each with an
      > approximately equal number of documents, i.e., if the count is 8,200
      > documents, and 'N' = 2, I want to wind up with two files of 4,100
      > documents each. Is there a best way in NTB to do this, or should I
      > be looking at another tool?
      >
      > Best Regards,
      >
      > Dick Gascoigne
      > dick.gascoigne@...
      > --
      > Appic (S) Pte Ltd
      > 74A Amoy Street; Singapore 069893
      > Tel: (+65) 225-9908 Fax: (+65) 225-9092
      > Email: dick.gascoigne@...
      > Web: www.appic.com
      >
    • Dick Gascoigne
      Message 2 of 4 , Mar 3 8:21 AM
        Thanks to all who responded -- all the ideas were good!

        Results feedback follows:

        The purpose is to divide a huge spool file of telephone bills into multiple
        files to be printed in parallel on multiple printers. Each bill starts with
        a standard header line.

        One of the controlling factors is that the file has to be split on a
        document boundary, but the files don't have to be of exactly equal number of
        documents.

        Method:

        I first made a clip combining two techniques: the Martyn-Tyrell method for
        counting the number of documents, and the "Hugo technique" for splitting
        the file. The latter uses Find to count up to some number of documents,
        Selects those lines, Appends them to a file, and Cuts them out, then
        repeats count/Select/Append/Cut until the file is all split.

        For a file of 8,200 documents, and 500,000 lines, the whole process took 50
        min on a 700MHz, 128K box. Not fast enough for production use. Most of the
        time seemed to be the repetitive Finds, and particularly the Cut (deleting
        some 150,000 lines at a whack).

        Solution: I load the file, Get the total linecount (not document count),
        and divide by the number of files I want to split it into, yielding
        something I call SegmentSize.

        Then I jump to line number SegmentSize, and search forward for the end of
        the document I landed in, getting EndLine. Then Select from StartLine
        (initially 1) to EndLine, and Append the Selection to create a file.

        Then I reset StartLine to be EndLine, Jump forward SegmentSize lines,
        Search, Select, Append, ... repeating until done.

        The Search only ever has to go to the next document start, and I never Cut.
        Time now: 56 seconds to split the 8,200 bill file into four files!!!
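
The jump-and-search approach described above can be sketched outside NoteTab as follows. This is a Python illustration under stated assumptions, not the clip itself: the names `split_on_boundaries`, `write_parts`, and `is_header` are hypothetical, the whole file is assumed to fit in memory as a list of lines, and each segment is taken to end just before the next header line so that no line is written twice.

```python
def split_on_boundaries(lines, is_header, parts):
    """Return (start, end) index pairs, end exclusive, one per output file.

    Each cut is placed at the first header line found at or after a jump
    of SegmentSize lines, so every file starts on a document boundary.
    """
    total = len(lines)
    segment_size = max(1, total // parts)  # total line count / number of files
    ranges = []
    start = 0
    while start < total and len(ranges) < parts - 1:
        # Jump forward SegmentSize lines from the current start.
        end = min(start + segment_size, total)
        # Search forward only as far as the next document start.
        while end < total and not is_header(lines[end]):
            end += 1
        ranges.append((start, end))
        start = end
    if start < total:
        # The last file takes whatever remains.
        ranges.append((start, total))
    return ranges


def write_parts(lines, ranges, prefix, encoding="latin-1"):
    """Write each range to prefix1.txt, prefix2.txt, ... and return the paths."""
    paths = []
    for number, (start, end) in enumerate(ranges, start=1):
        path = f"{prefix}{number}.txt"
        with open(path, "w", encoding=encoding, newline="") as handle:
            handle.writelines(lines[start:end])
        paths.append(path)
    return paths
```

Nothing is ever deleted from the source list, which mirrors the point that avoiding the Cut is where the time was saved. The resulting files are only approximately equal in line count, and may be fewer than `parts` if a single document is very long.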

        Beautiful! Thank you all.

        I'll be pleased to post the clip or send it privately if anyone would find
        it useful.

        (And yes, the weather is better here than in Europe -- always 24 - 34 deg.
        C, but 80% humidity)
        (But the skiing sucks!)

        Best Regards,

        Dick Gascoigne
        --
        Appic (S) Pte Ltd
        74A Amoy Street; Singapore 069893
        Tel: (+65) 6225-9908 Fax: (+65) 6225-9092
        Email: dick.gascoigne@...
        Web: www.appic.com

        > > I have fairly large files (30 - 100 MB; 500K - 1,500K lines), which
        > > contain multiple occurrences of a string. I would like to know the
        > > number of occurrences, using a clip. The Replace command does not
        > > have a "Count Occurrences" option. What command(s) should I use?