Removing/Not removing duplicate lines

  • John Fitzsimons
    Message 1 of 4 , Jul 1, 2003
      I was wanting to remove duplicate lines in a text file the other day
      and couldn't find anything in a menu to enable that. There was nothing
      in the help file under "duplicate". :-(

      Rather than spend ages trying to work out how to get a script to do it
      I looked at EditPad. It has a neat search and replace method that does
      this i.e.


      To remove duplicate lines :

      The "Search Text" box should look like this:

      ^(.*)(
      \1)+$

      "Replace Text" box key in \1

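For anyone who wants to try the first pattern outside an editor, here is a minimal Python sketch. It assumes the line break typed into the Search Text box can be written as \n and that ^ and $ act as line anchors (re.MULTILINE in Python). The function name and sample text are made up for illustration. The pattern only collapses duplicates that sit next to each other, so an unsorted file would have to be sorted first.

```python
import re

# The search pattern from the post, with the literal line break written as \n.
# re.MULTILINE makes ^ and $ match at the start and end of every line.
ADJACENT_DUPLICATES = re.compile(r"^(.*)(\n\1)+$", re.MULTILINE)


def remove_adjacent_duplicates(text: str) -> str:
    """Collapse each run of identical consecutive lines into one line."""
    # \1 in the replacement is the first copy of the repeated line.
    return ADJACENT_DUPLICATES.sub(r"\1", text)


sample = "apple\napple\nbanana\ncherry\ncherry\ncherry\napple"
print(remove_adjacent_duplicates(sample))
# prints apple, banana, cherry, apple (one per line); the last "apple"
# survives because it is not adjacent to the first two
```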


      To remove non duplicate lines :

      The "Search Text" box should look like this:

      ^(.+)
      (?!\1$)

      "Replace Text" box is completely empty.

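The second pattern can be tried the same way. This is only a Python port under the same assumptions (line break written as \n, ^ and $ as line anchors), not a statement about how EditPad or NoteTab behave. The function name and samples are hypothetical. In this port, when every line ends with a line break, each run of n identical adjacent lines shrinks to n-1 lines, so a line with no duplicate disappears and a pair leaves one copy.

```python
import re

# The second search pattern from the post: a line and its line break,
# provided the NEXT line is not an identical copy (negative lookahead).
NOT_FOLLOWED_BY_COPY = re.compile(r"^(.+)\n(?!\1$)", re.MULTILINE)


def remove_non_duplicates(text: str) -> str:
    """Delete every line that is not immediately followed by an identical line."""
    return NOT_FOLLOWED_BY_COPY.sub("", text)


print(repr(remove_non_duplicates("a\na\nb\nc\nc\n")))
# prints 'a\nc\n': the unique line "b" is gone, each pair leaves one copy

print(repr(remove_non_duplicates("x\nx\nx\ny\n")))
# prints 'x\nx\n': a run of three leaves two
```

A final line with no trailing line break is never matched, because the pattern requires the line break after the captured text.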

      These don't seem to work with NoteTab. :-(

      Even though I have since found out that duplicate removal is well
      hidden in one of the NoteTab options I would still be interested to
      know whether BOTH of the above can be done via "search and
      replace", and if so what the syntax for each would be.

      Can anyone help please ?


      Regards, John.

      --
      ****************************************************
      ,-._|\ John Fitzsimons - Melbourne, Australia.
      / Oz \ http://www.vicnet.net.au/~johnf/welcome.htm
      \_,--.x/ http://clients.net2000.com.au/~johnf/
      v
    • hsavage
      Message 2 of 4 , Jul 1, 2003
        John Fitzsimons wrote:

        > I was wanting to remove duplicate lines in a text file the other day
        > and couldn't find anything in a menu to enable that. There was nothing
        > in the help file under "duplicate". :-(
        >
        > To remove duplicate lines :
        >
        > The "Search Text" box should look like this:
        >
        > ^(.*)(
        > \1)+$
        >
        > "Replace Text" box key in \1
        >
        > To remove non duplicate lines :
        >
        > The "Search Text" box should look like this:
        >
        > ^(.+)
        > (?!\1$)
        >
        > "Replace Text" box is completely empty.
        >
        > These don't seem to work with NoteTab. :-(
        >
        > Even though I have since found out that duplicate removal is well
        > hidden in one of the NoteTab options I would still be interested to
        > know whether BOTH of the above can be done via "search and
        > replace", and if so what the syntax for each would be.
        >
        > Can anyone help please ?
        >
        > Regards, John.

        John,

        My first question would be, Did you check the Regexp box in the Replace
        dialog? If so, being a novice, I can't help much with Regexp.

        If you're referring to NoteTab Pro, you must remember that a blank line is
        a 'crlf' (a carriage return-linefeed). That being the case you will have
        2 'crlf's in sequence. All you need do is search for 2 'crlf's and
        replace with 1 'crlf'.

        Using the Replace dialog and checking the regexp box you could use;

        in the search side use regexp
        \n\n

        in the replace side use regexp
        \n

        To do the same without regexp using NTP standard characters;

        search side
        ^p^p

        replace side
        ^p

        OR

        search side
        ^%nl%^%nl%

        replace side
        ^%nl%

        Each of these on the search side represents a double 'crlf'.
        Each of these on the replace side represents a single 'crlf'.
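As a rough illustration of the literal (non-regexp) replace, here is a small Python sketch. It assumes the file uses 'crlf' line endings. The function names are invented for the example, and nothing here says whether NoteTab's own Replace All rescans the text. In this sketch a single pass only halves a long run of line breaks, so the second function repeats the replace until nothing changes.

```python
def collapse_once(text: str, newline: str = "\r\n") -> str:
    """One pass: replace every double line break with a single one."""
    return text.replace(newline * 2, newline)


def collapse_fully(text: str, newline: str = "\r\n") -> str:
    """Repeat the replace until the text stops changing."""
    previous = None
    while previous != text:
        previous = text
        text = collapse_once(text, newline)
    return text


three_blank_lines = "a\r\n\r\n\r\n\r\nb"
print(repr(collapse_once(three_blank_lines)))   # prints 'a\r\n\r\nb'
print(repr(collapse_fully(three_blank_lines)))  # prints 'a\r\nb'
```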

        hrs
      • hsavage
        Message 3 of 4 , Jul 1, 2003
          hsavage wrote:

          > John Fitzsimons wrote:
          >
          >> I was wanting to remove duplicate lines in a text file the other day
          >>
          >> Even though I have since found out that duplicate removal is well
          >> hidden in one of the NoteTab options I would still be interested to
          >> know whether BOTH of the above can be done via "search and
          >> replace", and if so what the syntax for each would be.
          >>
          >> Can anyone help please ?
          >>
          >> Regards, John.
          >
          > John,
          >
          > My first question would be, Did you check the Regexp box in the Replace
          > dialog? If so, being a novice, I can't help much with Regexp.
          >
          > If you're referring to NoteTab Pro, you must remember, a blank line is a
          > 'crlf'. That being the case you will have 2 'crlf's in sequence. All
          > you need do is search for 2 'crlf's and replace with 1 'crlf'. 'crlf'
          > being a carriage return-linefeed.
          >
          >
          > hrs

          John,

          I did read your first email but, instead of 'duplicate' I read 'blank'.
          Most of my reply is useless for this reason.

          The only thing that may help is knowing that, with WordWrap Off, every
          line has a 'crlf'. Maybe in your regexp code you need to make
          allowances for that.
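One possible way to make that allowance, sketched in Python rather than in NoteTab: convert 'crlf' to a plain \n, run the duplicate-line pattern from the original question, then convert back. The function name is hypothetical, and the sketch assumes the file uses 'crlf' consistently (or not at all).

```python
import re


def dedupe_adjacent_crlf(text: str) -> str:
    """Collapse adjacent duplicate lines in text that may use CRLF endings."""
    uses_crlf = "\r\n" in text
    # Work on \n-only text so that ^, $ and . see clean lines.
    normalized = text.replace("\r\n", "\n")
    deduped = re.sub(r"^(.*)(\n\1)+$", r"\1", normalized, flags=re.MULTILINE)
    # Restore the original line-ending style.
    return deduped.replace("\n", "\r\n") if uses_crlf else deduped


print(repr(dedupe_adjacent_crlf("a\r\na\r\nb")))  # prints 'a\r\nb'
```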

          hrs
        • Alec Burgess
          Message 4 of 4 , Jul 2, 2003
            Hi John:

            Find "\n\n+" Replace "\n" (no quotes) with regexp ticked will remove any
            number of consecutive blank lines.
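The same find/replace can be checked quickly in Python. This is a sketch, not NoteTab syntax: the first function assumes the text uses plain \n line breaks, and the second is a variant added here for 'crlf' files. Both function names are made up.

```python
import re


def squeeze_blank_lines(text: str) -> str:
    """Find \\n\\n+ and replace with \\n: two or more line breaks become one."""
    return re.sub(r"\n\n+", "\n", text)


def squeeze_blank_lines_any_eol(text: str) -> str:
    """Same idea, but accepts \\r\\n as well and keeps the first line break found."""
    return re.sub(r"(\r?\n)(?:\r?\n)+", r"\1", text)


print(repr(squeeze_blank_lines("a\n\n\n\nb\n\nc\nd")))       # prints 'a\nb\nc\nd'
print(repr(squeeze_blank_lines_any_eol("a\r\n\r\n\r\nb")))  # prints 'a\r\nb'
```

Because of the trailing +, a run of any length is handled in a single pass.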

            Note: where you are using (I think) ()'s and "$1", "$2" for replacement
            arguments in EditPad, Notetab uses {}'s and \1, \2 etc.

            NoteTab's regexp engine is fairly powerful but can get confused occasionally
            and may take a long time for large files and/or complicated find/replace
            sequences.

            Its chief shortcoming is total lack of non-greedy regexp.

            If you're familiar with Perl, it's fairly easy to call Perl scripts from
            NoteTab clips. (There are a few samples in the Clip samples (where
            else:-) ).

            I think you'll find the regexp syntax is similar (identical?) to that in
            EditPad.

            If you've got more questions you should probably join/post on the clips
            list: http://groups.yahoo.com/group/ntb-clips/

            Regards ... Alec
            --

            ---- Original Message ----
            From: "John Fitzsimons" <johnf@...>
            To: <notetab@yahoogroups.com>
            Sent: Tuesday, July 01, 2003 23:29
            Subject: [NTB] Removing/Not removing duplicate lines

            > I was wanting to remove duplicate lines in a text file the other day
            > and couldn't find anything in a menu to enable that. There was nothing
            > in the help file under "duplicate". :-(
            >
            > Rather than spend ages trying to work out how to get a script to do it
            > I looked at EditPad. It has a neat search and replace method that does
            > this i.e.
            >
            >
            > To remove duplicate lines :
            >
            > The "Search Text" box should look like this:
            >
            > ^(.*)(
            > \1)+$
            >
            > "Replace Text" box key in \1
            >
            >
            >
            > To remove non duplicate lines :
            >
            > The "Search Text" box should look like this:
            >
            > ^(.+)
            > (?!\1$)
            >
            > "Replace Text" box is completely empty.
            >
            >
            > These don't seem to work with NoteTab. :-(
            >
            > Even though I have since found out that duplicate removal is well
            > hidden in one of the NoteTab options I would still be interested to
            > know whether BOTH of the above can be done via "search and
            > replace", and if so what the syntax for each would be.
            >
            > Can anyone help please ?
            >
            >
            > Regards, John.