Re: New Guy Wants Help :)

  • John Zeman
    Message 1 of 2, Apr 10, 2005
      --- In notetab@yahoogroups.com, "Ralph Ausmann" <ralph.ausmann@v...> wrote:
      >
      > I've been doing this manually using my NoteTab Pro but have over 100 MB of
      > text to do and I'm not getting paid by the hour for it. So, am hoping there
      > is an easier way. I'm trying to cleanup our old mail list text files for
      > better use by a keyword retrieval program. It's pretty messy text that was
      > generated by thousands of list subscribers (non-moderated)
      > -------------
      >
      > What I want to do is delete variable lengths of text between two regularly
      > occurring text strings. The text to delete may be from about one screen to
      > 3 or 4 screens full of text. As follows:
      >
      > DELETE ALL TEXT BEGINNING AND ENDING WITH:
      >
      > Starts with -> "End of AMC-LIST Digest"
      >
      > >---to---> (variable miscellaneous text)
      >
      > Ends with -> "----------------------------|\"
      >
      >
      > This line and delimiter must be replaced to the
      > text file -> "----------------------------|\"
      >
      > I need the correct search/replace expression for this task.
      >
      > __________________________________________________________________
      >
      > Also, I want to delete the following (example) lines:
      >
      > TO DELETE THE FOLLOWING LINES:
      >
      > (always starts with "id " and the rest is variable)
      > "id AA20273; Tue, 18 Apr 95 11:24:44 -0400"
      >
      > and
      >
      > (always starts with "Message-Id: " and the rest is variable)
      > "Message-Id: <H000029a0012205f@MHS>"
      >
      > or
      >
      > (always starts with "Message-Id: " and the rest is variable)
      > "Message-Id: <Pine.SUN.3.91.950418183650.3140A-100000@u...>"
      >
      > _______________________________________________________________________


      For your first task, if there is only one occurrence of


      "End of AMC-LIST Digest"
      >---To---> (variable miscellaneous text)
      Ends with -> "----------------------------|\"


      per file, then you could do it with the standard toolbar Replace tool
      by pasting the code below into it and ticking the regular
      expression option:

      Replace
      End of AMC-LIST Digest\a*----------------------------|\

      With
      (Nothing - leave blank)

      However, if there is more than one occurrence of the above text per
      file, this will not work, because the regexp is greedy: it will replace
      everything from the very first

      End of AMC-LIST Digest

      to the very last

      ----------------------------|\

      In that situation I would use a clip, which is beyond the scope of
      this basic forum. You might want to sign up for the clips list, if
      you're not already a member, and ask the question there.
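
      For anyone who would rather handle the multiple-occurrence case outside
      NoteTab, here is a minimal Python sketch of the same idea. It is not a
      NoteTab clip, and the function name is made up for illustration. It
      assumes the two marker strings appear exactly as quoted above. The
      non-greedy `.*?` stops at the nearest end marker instead of the last
      one, `re.escape` makes the `|` and `\` in the delimiter literal, and
      the delimiter line is written back because the original request asked
      for it to stay in the file.

```python
import re

# Marker strings as quoted in the question (assumed to appear verbatim).
START = "End of AMC-LIST Digest"
END = "----------------------------|\\"  # dashes, then a pipe and one backslash

# re.escape treats the pipe and backslash as literal characters.
# ".*?" is non-greedy; re.DOTALL lets the dot cross line breaks.
PATTERN = re.compile(re.escape(START) + r".*?" + re.escape(END), re.DOTALL)


def strip_digest_trailers(text: str) -> str:
    """Delete every START..END span, leaving the END delimiter in place."""
    # A function is used as the replacement so the backslash in END
    # is inserted as-is rather than parsed as an escape sequence.
    return PATTERN.sub(lambda match: END, text)


if __name__ == "__main__":
    sample = (
        "keep A\n" + START + "\njunk 1\n" + END + "\n"
        "keep B\n" + START + "\njunk 2\n" + END + "\n"
        "keep C\n"
    )
    print(strip_digest_trailers(sample))
```

      With 100 MB of archives it would probably be worth running this one
      file at a time and checking a few results by eye before trusting it.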

      Likewise for your second task, assuming your examples are on lines by
      themselves, the following RegExp will find all lines that begin with
      id

      ^id.*

      while this RegExp will find all lines that begin with Message-Id

      ^Message-Id:.*

      And again all this could be done with or without regular expressions
      via clips.
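
      As a rough illustration of the second task outside NoteTab, the Python
      sketch below removes whole lines that begin with "id " or
      "Message-Id: ". The function name is hypothetical, and it assumes, as
      the question states, that these prefixes always sit at the very start
      of a line. Unlike the bare ^id.* pattern, it requires the space after
      "id", so a line that merely begins with a word such as "idea" is left
      alone.

```python
import re

# Match a whole line starting with "id " or "Message-Id: ", including
# its trailing newline so no blank line is left behind.
HEADER_LINE = re.compile(r"^(?:id |Message-Id: ).*\n?", re.MULTILINE)


def drop_header_lines(text: str) -> str:
    """Remove every line that begins with 'id ' or 'Message-Id: '."""
    return HEADER_LINE.sub("", text)


if __name__ == "__main__":
    sample = (
        "Subject: hi\n"
        "id AA20273; Tue, 18 Apr 95 11:24:44 -0400\n"
        "Message-Id: <H000029a0012205f@MHS>\n"
        "idea: keep this line\n"
        "body text\n"
    )
    print(drop_header_lines(sample))
```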

      John