
How to delete newlines

  • Trey Beck
    Message 1 of 9, Nov 15, 2001
      Hi, I am a little new at this. I have a text file that consists of multiple
      values surrounded by quotes and delimited by commas. For example:

      [Ex. 1]
      "string 1","string 2","string 3","string 4"
      "string 1","string 2","string 3","string 4"
      "string 1","string 2","string 3","string 4"

      In some cases, there are hard returns within the strings, so that the
      following occurs:

      [Ex. 2]
      "string 1","string 2","string
      3","string 4"
      "string 1","string 2","string 3","string 4"

      [1] I'm wondering how I could delete these newlines to get the list back to
      one record per line (as in ex. 1).
      [2] Finally, some of the strings have double quotes within them, eg:

      [Ex. 3]
      "string 1","string "double quotes were inserted here" 2","string 3","string
      4"

      Is there a way to remove those interior quotes? The file has 4089 lines.

      Thanks so much.




    • Nicole Simon
      Message 2 of 9, Nov 16, 2001
        "Trey Beck" <cmichaelbeck@...> wrote:
        >"string 1","string 2","string 3","string 4"


        A .csv File :o)

        Please check first: are you sure that each real line ends with "^p
        (a closing quote followed by a line break)? If so, every line break
        you want to get rid of is one that does not follow a quote.

        You could then [pseudocode, I should be at work now ;o)]
        - replace every occurrence of "^p with something unique
        - replace every remaining ^p with nothing
        - replace the unique something with "^p again
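
The three steps above can also be tried outside NoteTab. What follows is a minimal Python sketch, not clip code. The function name `join_broken_records` and the NUL placeholder are assumptions made for illustration, and the text is assumed to use plain "\n" line breaks (a file with CR+LF endings would need normalising first).

```python
def join_broken_records(text: str, placeholder: str = "\x00") -> str:
    """Join lines that were split inside a quoted field."""
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")
    # Step 1: protect real record endings (closing quote followed by newline).
    text = text.replace('"\n', placeholder)
    # Step 2: every newline still present sits inside a field, so delete it.
    text = text.replace("\n", "")
    # Step 3: restore the protected record endings.
    return text.replace(placeholder, '"\n')


sample = (
    '"string 1","string 2","string\n'
    '3","string 4"\n'
    '"string 1","string 2","string 3","string 4"\n'
)
print(join_broken_records(sample))
```

Because the stray line break is replaced with nothing, the two halves of a field are glued together without a space. The approach also cannot tell a record ending from a line break that directly follows an opening quote inside a record, so it is worth testing on a copy first.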


        hth
        Nicole


        --
        »What's this for?« Dust Puppy (being hugged by A.J.)
        »It's for all of the people who can't do this anymore.« A.J. (crying)
        In Memory of September 11, 2001 - Please take a moment for those who have
        none left. (http://ars.userfriendly.org/cartoons/?id=20010912)
      • Hugo
        Message 3 of 9, Nov 16, 2001
          Hello,

          This sort of problem came up recently on this list:

          Messages 7691, 7693 and 7698, subject "Delimiter and commas (off
          topic)", deal with the handling of .csv files. There were some other
          answers about how to remove the interior quotes (I think), but I
          didn't follow the thread that closely, so I cannot tell you whether
          the results were satisfactory.

          You can go to these messages directly by going to
          groups.yahoo.com/group/ntb-clips/message/7691, etc.

          Maybe it helps if you check these first. And of course, you can
          always write a clip to do what you want...

          HTH,

          Hugo


          > "string 1","string 2","string 3","string 4"
          > "string 1","string 2","string 3","string 4"
          > "string 1","string 2","string 3","string 4"
          >
        • cmichaelbeck@hotmail.com
          Message 4 of 9, Nov 16, 2001
            great! thanks for the info. i ended up joining lines by hand, but
            then replacing "," with | and "^P" with ^P to take care of the
            beginning and ending quotes.
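
Treating "," as the real field separator, as described above, also suggests one way to deal with the interior quotes from Ex. 3. This is a rough Python sketch rather than a clip; the function name `strip_interior_quotes` is made up for illustration, and it assumes each record is already on one line, starts and ends with a quote, and never contains the sequence "," inside a field.

```python
def strip_interior_quotes(record: str) -> str:
    """Remove double quotes that appear inside the fields of one record."""
    if len(record) < 2 or not (record.startswith('"') and record.endswith('"')):
        return record  # not in the expected shape, so leave it untouched
    # Drop the outer quotes, then split on the quote-comma-quote separator.
    fields = record[1:-1].split('","')
    # Any quote still inside a field is an interior quote.
    cleaned = [field.replace('"', "") for field in fields]
    return '"' + '","'.join(cleaned) + '"'


ex3 = '"string 1","string "double quotes were inserted here" 2","string 3","string 4"'
print(strip_interior_quotes(ex3))
```

An interior quote that happens to sit right next to a comma is ambiguous under this rule and would be treated as a field boundary, so the field count of each record is worth checking afterwards.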

            question: is there a reg exp that i could have used to find ^P but
            not "^P"? for example, something that would find the second newline
            but not the first:

            "String 1","String 2","String 3"^P
            "String ^P
            1","String 2","String 3"

            thanks,
            trey


          • Alan C.
            Message 5 of 9, Nov 16, 2001
              Hi trey,

              >question: is there a reg exp that i could have used to find ^P but
              >not "^P"? for example, something that would find the second newline
              >but not the first:
              >
              >"String 1","String 2","String 3"^P
              >"String ^P
              >1","String 2","String 3"

              H="regex test"
              ^!Find "\n[^"]" RS
              ;---<end of clip<<

              Click at the top of the document first, then run the teeny little clip
              above: it finds a match for your search criteria (a newline that is
              not followed by a quote).

              I'm a newbie on regex though, so be sure to test it on a sample or a
              copy before using it on the real McCoy file!
              -----------------------

              What follows is taken from the NTP Help regex topic, under "Find
              Patterns", about the 8th paragraph down:

              A string enclosed in brackets [] specifies a character class. Any single
              character in the string is matched. For example, [abc] matches an a, b, or
              c. Ranges of ASCII letters and numbers can be abbreviated as, for example,
              [a-z0-9]. If the first symbol following the [ is a caret (^) then a
              negative character class is specified. In this case, the string matches all
              characters EXCEPT those enclosed in the brackets. For example, [^a-z]
              matches everything except lower case characters (and newlines).
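
For comparison, the same search can be tried with Python's `re` module. This is only a sketch, assuming the text uses "\n" line breaks; the variable names are made up for illustration. One difference from the help text quoted above: in Python a negated class such as [^"] also matches a newline. The second pattern is a Python-only alternative that tests the character before the line break, which is what the question literally asked for.

```python
import re

sample = '"String 1","String 2","String 3"\n"String \n1","String 2","String 3"'

# Same idea as the clip: a newline followed by any character except a quote.
followed_by_non_quote = [m.start() for m in re.finditer(r'\n[^"]', sample)]

# Alternative: a newline that is not preceded by a quote (negative lookbehind).
not_preceded_by_quote = [m.start() for m in re.finditer(r'(?<!")\n', sample)]

# Both lists hold the position of the second newline only.
print(followed_by_non_quote, not_preceded_by_quote)
```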

              Regards. Alan.

              It's fun to learn new things.
            • Alec Burgess
              Message 6 of 9, Nov 16, 2001
                > question: is there a reg exp that i could have used to find ^P but
                > not "^P"? for example, something that would find the second newline
                > but not the first:
                >
                > "String 1","String 2","String 3"^P
                > "String ^P
                > 1","String 2","String 3"

                ^!replace "{[^\"]}\n{[^\"]}" >> "\1\2" RWAS
                In words: any character that is not a quote, followed by a newline,
                followed by any character that is not a quote; replace with the first
                character followed by the second. (In regexp, "\n" is the same as "^p"
                in non-regexp, i.e. <CR>+<LF>.)
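
The same replacement can be approximated in Python, where parentheses play the role of the curly-brace tagged matches. This is a sketch under assumptions, not a translation guaranteed to behave like NoteTab: the function name `join_inner_breaks` is hypothetical, line breaks are assumed to be "\n", and the class is written as [^"\n] because a negated class in Python would otherwise match a newline too.

```python
import re


def join_inner_breaks(text: str) -> str:
    """Remove a newline that has a non-quote character on both sides."""
    # ([^"\n]) captures one character that is neither a quote nor a newline.
    # \1 and \2 put the two captured characters back without the newline.
    return re.sub(r'([^"\n])\n([^"\n])', r"\1\2", text)


sample = '"String 1","String 2","String 3"\n"String \n1","String 2","String 3"'
print(join_inner_breaks(sample))
```

The first line break follows a quote and is left alone; only the one inside "String 1" is removed.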

                Regards ... Alec
                ----- Original Message -----
                From: <cmichaelbeck@...>
                To: <ntb-clips@yahoogroups.com>
                Sent: 16 November, 2001 11:53
                Subject: [Clip] Re: How to delete newlines
              • Alan C.
                Message 7 of 9, Nov 16, 2001
                  Hi Alec,

                  >^!replace "{[^\"]}\n{[^\"]}" >> "\1\2" RWAS
                  >in words: any char not a quote followed by new-line followed by any
                  >character not a quote: replace by the first char followed by the second. (in
                  >regexp "\n" is the same as "^p" in non-regexp (=<CR>+<LF>)

                  Your clip worked great on the sample as per what it was designed to
                  accomplish. But I accidentally discovered that:

                  On repetitive spaces, each on a line by itself, your clip didn't do what,
                  in theory, it is supposed to do:

                  In the following, I use a * to take the place of a space created by the
                  keyboard's space bar.

                  "String 1","String 2","String 3"
                  "String
                  1","String 2","String 3"
                  jjj
                  lll*
                  *
                  *
                  *
                  *

                  I (accidentally) discovered that your clip doesn't place all those spaces
                  (each represented by a *) next to each other. In theory, each is a
                  character on a line by itself and the newline in between each should go
                  away. But it doesn't.

                  I'm using NTP 486C pre release 3.

                  Dunno if it warrants a post to the NRN beta testing email list or not.

                  But it happened so I'll post over there too.

                  Regex things I learned (may ignore this paragraph):
                  As far as the clip goes: I knew everything except the curly braces. A look
                  in help showed them to be tagged matches. So (in this case, each tagged
                  match is for a character) there's a tagged match number one and a tagged
                  match number two. The "\1\2" makes it place char one immediately next to
                  char two, effectively eliminating or discarding the ^p that formerly had
                  been in between the char one and the char two.
                  ----------------------------

                  Thanks. Alan.
                • Alan C.
                  Message 8 of 9, Nov 16, 2001
                    >On repetitive spaces, each on a line by itself, your clip didn't do what,
                    >in theory, it is supposed to do:
                    >
                    >In the following, I use a * to take the place of a space created by the
                    >keyboard's space bar.
                    >
                    >"String 1","String 2","String 3"
                    >"String
                    >1","String 2","String 3"
                    >jjj
                    >lll*
                    >*
                    >*
                    >*
                    >*

                    Evidently a beginning character cannot also serve as an ending character on
                    those lines.
                    OK, I just now tried it with a sample like this:

                    "String 1","String 2","String 3"
                    "String
                    1","String 2","String 3"
                    jjj
                    lll*
                    **
                    **
                    **
                    **

                    And the clip worked properly.

                    So, evidently there's a given that the beginning and ending character on
                    those lines cannot be the same character. In other words, each line must
                    contain a minimum of two characters in order for the clip to work properly.

                    Alan.
                  • Alec Burgess
                    Message 9 of 9, Nov 16, 2001
                      Alan:
                      re:
                      > >^!replace "{[^\"]}\n{[^\"]}" >> "\1\2" RWAS
                      > >in words: any char not a quote followed by new-line followed by any
                      > >character not a quote: replace by the first char followed by the second.
                      > >(in regexp "\n" is the same as "^p" in non-regexp (=<CR>+<LF>)

                      Your "counter-example":
                      > "String 1","String 2","String 3"
                      > "String
                      > 1","String 2","String 3"
                      > jjj
                      > lll*
                      > *
                      > *
                      > *
                      > *

                      My guess is that it's not "really" a bug: with only one character on a
                      line, the single character of the next line has already been "used up"
                      when it's put back on the preceding line, and hence doesn't get
                      considered for matching on the next "cycle" (matches cannot overlap).
                      THE MOVING FINGER, HAVING WRIT, MOVES ON .... ;==)

                      I put the 26 letters of the alphabet 1 per line, and executed the above
                      clip. It converted to 13 lines, 2 letters per line. Executing the clip on
                      THAT, got rid of ALL the linebreaks and gave me ONE line with all 26 letters
                      on it.

                      However changing the clip to:
                      ^!replace "{[^\"]}\n" >> "\1" RWAS
                      DOES do it all in one pass.
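
The difference between the two clips can be reproduced with Python's `re.sub`, which also scans left to right and never reuses a character that an earlier match has consumed. This is a sketch with made-up variable names, assuming "\n" line breaks and using [^"\n] so that the class does not match a newline.

```python
import re
import string

# The 26 letters of the alphabet, one per line.
letters = "\n".join(string.ascii_lowercase)

two_sided = re.compile(r'([^"\n])\n([^"\n])')  # character, newline, character
one_sided = re.compile(r'([^"\n])\n')          # character, newline

# Two-sided pattern: each match consumes the next line's only character,
# so every other newline survives the first pass.
first_pass = two_sided.sub(r"\1\2", letters)
second_pass = two_sided.sub(r"\1\2", first_pass)

# One-sided pattern: the following character is never consumed,
# so a single pass removes every newline.
single_pass = one_sided.sub(r"\1", letters)

print(len(first_pass.splitlines()), second_pass == single_pass)
```

The one-sided version only checks the character before the line break, so it will also join a line whose next line starts with a quote.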

                      Don'tcha just love regexps!

                      Regards ... Alec
                      ----- Original Message -----
                      From: "Alan C." <acummings@...>
                      To: <ntb-clips@yahoogroups.com>
                      Sent: 16 November, 2001 22:13
                      Subject: Re: [Clip] Re: How to delete newlines