Re: [Clip] Create CSV file

  • buralex@gmail.com
    Message 1 of 17 , Dec 6, 2007
      "Don - HtmlFixIt.com" <don@...> said on Dec 06, 2007 7:14
      -0500 (in part):
      > good start Alex ...
      Alec not Alex :-)
      > ; remove all empty lines
      > ; how to do with regex? anyone?
      > :Loop
      > ^!Replace "^P^P" >> "" ACIWS
      > ^!IfError Next ELSE Loop
      ^!replace "(\r\n)+" >> "\r\n" rwais

      or

      ^!replace "\R+" >> "\r\n" rwais

      i.e. change one or more occurrences of <RETURN> to just one across the
      entire document. Your "^P^P" requires extra passes depending on whether
      there is an odd or even number of extra blank lines. Since a single
      <RETURN> at the end of a line with a non-blank line after it is OK, it
      *might* be faster to use "(\r\n){2,}" or "\R{2,}", but Sheri is the only
      one who might be able to answer that authoritatively :-)
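
A rough sketch of the pass-count point, written in Python rather than NoteTab clip code (a swap made only so the example can be run anywhere). The literal loop stands in for a repeated "^P^P" replace and assumes the replacement is a single line break; the function names are made up for illustration.

```python
import re

CRLF = "\r\n"

def collapse_by_loop(text):
    """Replace a doubled CR/LF with one, repeating until nothing changes."""
    passes = 0
    while CRLF + CRLF in text:
        text = text.replace(CRLF + CRLF, CRLF)
        passes += 1
    return text, passes

def collapse_by_regex(text):
    """Replace any run of two or more CR/LF pairs with one, in a single pass."""
    return re.sub(r"(?:\r\n){2,}", CRLF, text)

# Five line breaks in a row, i.e. four empty lines between "a" and "b".
sample = "a" + CRLF * 5 + "b" + CRLF

looped, passes = collapse_by_loop(sample)
print(passes)                                # 3
print(looped == collapse_by_regex(sample))   # True
```

The literal replace needs several passes because each pass only halves a run of line breaks, while the quantified pattern consumes the whole run at once.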

      btw: (someone else asked about the use of flags in
      replace/find/regex/non-regex)
      I always include the "i" (case-insensitive) flag even when it is not
      needed, as in the two lines above.
      And *almost* always "w" (whole document, not just from the cursor) and
      "a" (all occurrences in line, not just the first).

      btw: Mike, does it make any difference to your end use of the
      converted spreadsheet whether numeric fields are quoted or not, or
      whether extra trailing null fields (TABS) get converted to
      unnecessary sets of (,"","","")?
      My perhaps erroneous belief had always been that CSV requires
      surrounding quote marks only when a field actually contains interior
      comma(s).
      Isn't:

      * bob,alec, dave jones, bill is just as valid as
      * "bob","alec", "dave jones","bill"?

      Regards ... Alec -- buralex-gmail
    • Sheri
      Message 2 of 17 , Dec 7, 2007
        --- In ntb-clips@yahoogroups.com, buralex@... wrote:
        >
        > "Don - HtmlFixIt.com" <don@...> said on Dec 06, 2007 7:14
        > -0500 (in part):
        > > good start Alex ...
        > Alec not Alex :-)
        > > ; remove all empty lines
        > > ; how to do with regex? anyone?
        > > :Loop
        > > ^!Replace "^P^P" >> "" ACIWS
        > > ^!IfError Next ELSE Loop
        > ^!replace "(\r\n)+" >> "\r\n" rwais
        >
        > or
        >
        > ^!replace "\R+" >> "\r\n" rwais
        >
        > ie. change one or more occurrences of <RETURN> to just one on
        > entire document. Your "^P^P" requires extra passes based on
        > whether there is odd or even numbers of extra blank lines. Since
        > one by itself on the end of the line with non-blank on next line
        > is OK it *might* be faster to use: "(\r\n){2,}" or "\R{2,} but
        > Sheri is the only one who might be able to answer that
        > authoritatively :-)

        IMO it would be more efficient to replace two or more with one than to
        replace one or more with one. Matching and replacing the single
        occurrences is unnecessary work. However, the time difference would
        probably not be noticeable.
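
One way to see the "unnecessary work" is to count replacements. This is only a sketch in Python's re module, not a measurement of NoteTab's regex engine; the sample text and variable names are invented for illustration.

```python
import re

sample = "a\r\nb\r\n\r\n\r\nc\r\n"

# "One or more": every line break is matched and rewritten, even single ones.
one_or_more, count_one = re.subn(r"(?:\r\n)+", "\r\n", sample)

# "Two or more": only runs that actually contain empty lines are touched.
two_or_more, count_two = re.subn(r"(?:\r\n){2,}", "\r\n", sample)

print(count_one, count_two)          # 3 1
print(one_or_more == two_or_more)    # True
```

Both patterns produce the same text; the second simply makes fewer substitutions to get there.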

        Regards,
        Sheri
      • Don - HtmlFixIt.com
        Message 3 of 17 , Dec 7, 2007
          Sheri wrote:
          > --- In ntb-clips@yahoogroups.com, buralex@... wrote:
          >> "Don - HtmlFixIt.com" <don@...> said on Dec 06, 2007 7:14
          >> -0500 (in part):
          >>> good start Alex ...
          >> Alec not Alex :-)
          >>> ; remove all empty lines
          >>> ; how to do with regex? anyone?
          >>> :Loop
          >>> ^!Replace "^P^P" >> "" ACIWS
          >>> ^!IfError Next ELSE Loop


          >> ^!replace "(\r\n)+" >> "\r\n" rwais
          >>

          Actually I tried this very combination before writing the above. It
          does not work! If the last line in the file is blank, it is not
          removed, even though the display shows two paragraph symbols in a
          row. Mine (the ^P^P) doesn't work either: it will not remove the last
          blank line. Perhaps this is a bug?

          I sometimes use this at the end of the file:
          :ConfirmNoBlankLine
          ^!Jump Doc_End
          ^!Select Bol
          ^!If "^$GetSelection$" <> "" DoIt
          ^!Keyboard BACKSPACE
        • Flo
          Message 4 of 17 , Dec 8, 2007
            --- In ntb-clips@yahoogroups.com, "Don - HtmlFixIt.com" <don@...>
            wrote:

            > ^!replace "(\r\n)+" >> "\r\n" rwais
            > Actually I tried this very combination before writing the above.
            > It does not work! If you have a blank line as the last line in
            > the file, it is not removed...

            Don,

            The \A and \z sequences are doing the job. The possible positions of
            CR/NL are:

            - at doc start matched with \A
            - a CR/NL followed by CR/NL
            - at doc end matched with \z

            This will remove double CR/NL at any position:

            ^!Replace "\R(?=\R)|\A\R|\R\z" >> "" AWRS
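
For anyone who wants to try the three alternatives outside NoteTab, here is a hedged Python translation. Python's re module has no \R and spells PCRE's \z as \Z, so the sketch writes CR/LF out explicitly and assumes a Windows-style document; the names are hypothetical.

```python
import re

# Three cases, as in the clip: a CR/LF followed by another CR/LF,
# a CR/LF at the very start, and a CR/LF at the very end.
BLANK_LINE_BREAKS = re.compile(r"\r\n(?=\r\n)|\A\r\n|\r\n\Z")

def remove_empty_lines(text):
    """Delete every line break that only serves to create an empty line."""
    return BLANK_LINE_BREAKS.sub("", text)

sample = "\r\nA\r\n\r\n\r\nB\r\n\r\n"
cleaned = remove_empty_lines(sample)
print(repr(cleaned))   # 'A\r\nB'
```

Note that the third alternative also removes the line break that terminates the last line of text, so the result ends directly after "B".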

            Regards,
            Flo
          • Sheri
            Message 5 of 17 , Dec 8, 2007
              Flo wrote:
              > --- In ntb-clips@yahoogroups.com, "Don - HtmlFixIt.com" <don@...>
              > wrote:
              >
              >
              >> ^!replace "(\r\n)+" >> "\r\n" rwais
              >> Actually I tried this very combination before writing the above.
              >> It does not work! If you have a blank line as the last line in
              >> the file, it is not removed...
              >>
              >
              > Don,
              >
              > The \A and \z sequences are doing the job. The possible positions of
              > CR/NL are:
              >
              > - at doc start matched with \A
              > - a CR/NL followed by CR/NL
              > - at doc end matched with \z
              >
              > This will remove double CR/NL at any position:
              >
              > ^!Replace "\R(?=\R)|\A\R|\R\z" >> "" AWRS
              >
              > Regards,
              > Flo
              >
              >

              The end-of-file marker is not a line break. A line break is a series of
              actual control characters (carriage return and line feed). They can
              actually be selected, copied, pasted, etc. NoteTab Pro has a feature to
              display hidden characters, but it makes line breaks appear to be one
              character when it is actually two.

              It is usually best if each line in the document is terminated with a
              carriage return/line feed. That means there will hopefully be no text on
              the "line" containing the end-of-file marker.

              When the last line of text is followed immediately by the end-of-file
              marker, clips to process that file often require special processing just
              for the last line.
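
A small sketch of the distinction, in Python rather than clip code (chosen only so it can be run as-is). It treats the document as a plain string; the helper name is made up for illustration.

```python
CRLF = "\r\n"

def last_line_is_terminated(text):
    """True when the final line of text ends with a CR/LF pair."""
    return text == "" or text.endswith(CRLF)

terminated = "first" + CRLF + "second" + CRLF
unterminated = "first" + CRLF + "second"

print(len(CRLF))                               # 2: a line break is two characters
print(last_line_is_terminated(terminated))     # True
print(last_line_is_terminated(unterminated))   # False

# Splitting on the line break shows what sits on the final "line":
print(terminated.split(CRLF))     # ['first', 'second', '']
print(unterminated.split(CRLF))   # ['first', 'second']
```

In the terminated case the last element is empty, which corresponds to a final "line" holding nothing but the end of the file.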

              Regards,
              Sheri
            • Don - HtmlFixIt.com
              Message 6 of 17 , Dec 8, 2007
                OK, why does a guy named Alec have Alex in his email address? Anyway, you
                are correct: the quotes are only needed in comma-separated value files
                when/if there are delimiters (usually commas) in the data itself.
                However, having extra quotes causes no problems.
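
This can be checked with Python's standard csv module. The sketch below is an illustration of the quoting point only, not how the clip or any particular spreadsheet behaves; the sample rows and helper names are invented.

```python
import csv
import io

rows = [["bob", "alec", "dave jones", "bill"],
        ["jones, dave", "42", "", ""]]

def to_csv(data, quoting):
    """Serialize rows to CSV text using the given quoting style."""
    buf = io.StringIO()
    csv.writer(buf, quoting=quoting, lineterminator="\r\n").writerows(data)
    return buf.getvalue()

def from_csv(text):
    """Parse CSV text back into a list of rows."""
    return list(csv.reader(io.StringIO(text, newline="")))

minimal = to_csv(rows, csv.QUOTE_MINIMAL)   # quotes only where needed
quoted = to_csv(rows, csv.QUOTE_ALL)        # quotes around every field

print(minimal)
print(quoted)
print(from_csv(minimal) == from_csv(quoted) == rows)   # True
```

With minimal quoting only the field containing a comma gets quote marks, yet both forms parse back to the same rows.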

                > btw: Mike - does it make any difference wrt. your end use of the
                > converted spreadsheet whether numeric fields are un-quoted or not, or
                > whether or not extra trailing null fields (TABS) get converted to
                > unnecessary sets of (,"","","")
                > My perhaps erroneous belief had always been that CSV requires
                > surrounding quote marks only when a field actually contains interior
                > comma(s).
                > Isn't:
                >
                > * bob,alec, dave jones, bill is just as valid as
                > * "bob","alec", "dave jones","bill"?
                >
                > Regards ... Alec -- buralex-gmail
              • Don - HtmlFixIt.com
                Message 7 of 17 , Dec 8, 2007
                  Flo wrote:
                  >> - at doc start matched with \A
                  >> - a CR/NL followed by CR/NL
                  >> - at doc end matched with \z
                  >>
                  >> This will remove double CR/NL at any position:
                  >>
                  >> ^!Replace "\R(?=\R)|\A\R|\R\z" >> "" AWRS
                  >>
                  >> Regards,
                  >> Flo
                  >>
                  >>
                  > Sheri wrote:
                  > The end-of-file marker is not a line break. A line break is a series of
                  > actual control characters (carriage return and line feed). They can
                  > actually be selected, copied, pasted, etc. NoteTab Pro has a feature to
                  > display hidden characters, but it makes line breaks appear to be one
                  > character when it is actually two.
                  >
                  > It is usually best if each line in the document is terminated with a
                  > carriage return/line feed. That means there will hopefully be no text on
                  > the "line" containing the end-of-file marker.
                  >
                  > When the last line of text is followed immediately by the end-of-file
                  > marker, clips to process that file often require special processing just
                  > for the last line.
                  >
                  > Regards,
                  > Sheri
                  >

                  Thanks Sheri and Flo (and AleC) for moving this discussion along.

                  Flo, that removes all "empty lines" as promised. That is one I need to
                  save. Perhaps, when showing hidden characters, NoteTab should really
                  show the last "return" as a file-end mark of some type to distinguish
                  it from another return? I'll be honest, I have never thought about it
                  because I am only now beginning to understand and use regex; I always
                  did it manually. But it all makes sense now: that last hidden character
                  is really a file end (the \z), and returns are really two characters
                  (the \r\n), as I caught on to a while back when I started using regex.
                  I guess I have always known there was a file end but never gave it a
                  second thought. If I have it right now, the issue in our context is
                  whether the file end is on the last line of text or on the line after it.

                  Don

                  saved here: http://htmlfixit.com/blog/?p=361
                • Sheri
                  Message 8 of 17 , Dec 8, 2007
                    Don - HtmlFixIt.com wrote:
                    > Flo wrote:
                    >
                    >>> - at doc start matched with \A
                    >>> - a CR/NL followed by CR/NL
                    >>> - at doc end matched with \z
                    >>>
                    >>> This will remove double CR/NL at any position:
                    >>>
                    >>> ^!Replace "\R(?=\R)|\A\R|\R\z" >> "" AWRS
                    >>>
                    >>> Regards,
                    >>> Flo
                    >>>
                    >>>
                    >>>
                    >> Sheri wrote:
                    >> The end-of-file marker is not a line break. A line break is a series of
                    >> actual control characters (carriage return and line feed). They can
                    >> actually be selected, copied, pasted, etc. NoteTab Pro has a feature to
                    >> display hidden characters, but it makes line breaks appear to be one
                    >> character when it is actually two.
                    >>
                    >> It is usually best if each line in the document is terminated with a
                    >> carriage return/line feed. That means there will hopefully be no text on
                    >> the "line" containing the end-of-file marker.
                    >>
                    >> When the last line of text is followed immediately by the end-of-file
                    >> marker, clips to process that file often require special processing just
                    >> for the last line.
                    >>
                    >> Regards,
                    >> Sheri
                    >>
                    >>
                    >
                    > Thanks Sheri and Flo (and AleC) for moving this discussion along.
                    >
                    > Flo that removes all "empty lines" as promised. That is one I need to
                    > save. Perhaps, when showing hidden characters, notetab should really
                    > reflect the last "return" as a file end mark of some type to distinguish
                    > it from another return? I'll be honest I have never thought about it
                    > because I am only now beginning to understand and use regex. I always
                    > did it manually. But it all makes sense now, that last hidden character
                    > is really a file end (the \z) and returns are really two characters the
                    > (\r\n) as I caught on to a while back when I started using regex. I
                    > guess I have always known there was a file end but never gave it a
                    > second thought. In our context the issue is whether the file end is on
                    > the last line or on the next line then if I have it right now.
                    >
                    > Don
                    >
                    > saved here: http://htmlfixit.com/blog/?p=361
                    >
                    >
                    Hi Don,

                    Not to nitpick your blog, but the end-of-file marker is not a
                    character. \z is an assertion for the position of the end of the file. It
                    has a width of zero characters. Ditto for \A, \Z, ^ and $.
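
The zero width can be seen by looking at match spans. This is a sketch using Python's re module as a stand-in, where \Z plays the role of PCRE's \z; the sample text and names are invented.

```python
import re

text = "one\r\ntwo"

spans = {}
for pattern in (r"\A", r"\Z", r"^", r"$"):
    match = re.search(pattern, text)
    spans[pattern] = match.span()

for pattern, (start, end) in spans.items():
    # Each assertion matches a position, so start and end are equal.
    print(pattern, (start, end), "width", end - start)
```

Every span starts and ends at the same offset: the assertions match a position in the text, not a character.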

                    Regards,
                    Sheri
                  • buralex@gmail.com
                    Message 9 of 17 , Dec 8, 2007
                      "Don - HtmlFixIt.com" <don@...> said on Dec 08, 2007 10:39
                      -0500 (in part):
                      > Ok why does a guy named Alec have Alex in his email address?
                      My given name is Alexander, but my parents always called me Alec after my
                      uncle and grandfather.
                      More recently ... I have an email filter that puts any email I receive
                      that contains the string "alec" into a "Look-at-me first" folder. So I
                      put the alex in my email address as BURgess+ALEX. If I'd used BURALEC
                      then all the newsletters that contain stuff like "you subscribed as
                      buralex@..." would be mistakenly filtered to the "Look-at-me
                      first" folder.

                      Regards ... Alec -- buralex-gmail