add commas in address string

  • Mike Breiding - Morgantown WV
    Message 1 of 25, Jul 1, 2011
      Postal code:
      I need to replace the space in front of the first number with a comma

      State abbreviation:
      I need to replace the space in front of the first letter with a comma

      City :
      I need to replace the space in front of the first letter with a comma


      125 N. Main Street City ST 11111

      Then I will paste into Excel

      Do-able?


      Thanks,
      -Mike
    • John Shotsky
      Message 2 of 25, Jul 1, 2011
        State and zip are easy. City has no distinguishing part to separate it from the address. If there are a limited number
        of cities, you could use a lookup table. But even that could cause problems if a street name is also used in a city
        name. I'd suggest you do the state and zip using regex, but add the comma for the city manually.



        Regards,

        John





      • Mike Breiding - Morgantown WV
        Message 3 of 25, Jul 1, 2011
          On 7/1/2011 9:19 AM, John Shotsky wrote:
          > State and zip are easy. City has no distinguishing part to separate it
          > from the address. If there are a limited number
          > of cities, you could use a look up table. But even that could cause
          > problems, if a street name is also used in a city
          > name. I'd suggest you do the state and zip using regex, but add the
          > comma for the city manually. Regards, John

          Thanks, John.
          Will probably just do as much as I can with S&R and then finish up manually.
          Thanks,
          -Mike
        • Don
          Message 4 of 25, Jul 1, 2011
            Most cities are one word ... so just insert and then make a clip that
            goes line by line and asks if you want the comma moved on each line ...
            way faster.

            > Thanks, John.
            > Will probably just do as much as I can with S&R and then finish up manually.
            > Thanks,
            > -Mike
          • Mike Breiding - Morgantown WV
            Message 5 of 25, Jul 1, 2011
              On 7/1/2011 11:55 AM, Don wrote:
              > Most cities are one word ... so just insert and then make a clip that
              > goes line by line and asks if you want the comma moved on each line ...
              > way faster.

              Easy for you to say! ;)
              -mb


              >> Thanks, John.
              >> Will probably just do as much as I can with S&R and then finish up manually.
              >> Thanks,
              >> -Mike
              >
              >
              > ------------------------------------
              >
              > Fookes Software: http://www.fookes.com/
              > NoteTab website: http://www.notetab.com/
              > NoteTab Discussion Lists: http://www.notetab.com/groups.php
              >
            • diodeom
              Message 6 of 25, Jul 1, 2011
                Mike Breiding - Morgantown WV wrote:
                >
                > Postal code:
                > I need to replace the space in front of the first number with a comma
                >
                > State abbreviation:
                > I need to replace the space in front of the first letter with a comma
                >
                > City :
                > I need to replace the space in front of the first letter with a comma
                >

                After inserting commas around the state name abbreviations, e.g. with:

                ^!Replace "\h+([A-Z]{2})\h+(?=\d{5}\b)" >> ",$1," WARS

                you could probably automate a lot of fishing for city *candidates* by searching for street suffixes which commonly end the street address part. The clip could prompt you to approve (or skip) comma placement for each probable spot. The hefty list of suffixes for the simple search pattern is a raw list (I didn't bother streamlining) I grabbed from http://www.usps.com/ncsc/lookups/usps_abbreviations.html

                ; single long line: no newline, but a space after ^!Set
                ^!Set %suff%=allee|alley|ally|aly|anex|annex|annx|anx|arc|arcade|av|ave|aven|avenu|avenue|avn|avnue|bayoo|bayou|bch|beach|bend|bg|bgs|blf|blfs|bluf|bluff|bluffs|blvd|bnd|bot|bottm|bottom|boul|boulevard|boulv|br|branch|brdge|brg|bridge|brk|brks|brnch|brook|brooks|btm|burg|burgs|byp|bypa|bypas|bypass|byps|byu|camp|canyn|canyon|cape|causeway|causway|cen|cent|center|centers|centr|centre|cir|circ|circl|circle|circles|cirs|clb|clf|clfs|cliff|cliffs|club|cmn|cmns|cmp|cnter|cntr|cnyn|common|commons|cor|corner|corners|cors|course|court|courts|cove|coves|cp|cpe|crcl|crcle|creek|cres|crescent|crest|crk|crossing|crossroad|crossroads|crse|crsent|crsnt|crssing|crssng|crst|cswy|ct|ctr|ctrs|cts|curv|curve|cv|cvs|cyn|dale|dam|div|divide|dl|dm|dr|driv|drive|drives|drs|drv|dv|dvd|est|estate|estates|ests|exp|expr|express|expressway|expw|expy|ext|extension|extensions|extn|extnsn|exts|fall|falls|ferry|field|fields|flat|flats|fld|flds|fls|flt|flts|ford|fords|forest|forests|forg|forge|forges|fork|forks|fort|frd|frds|freeway|freewy|frg|frgs|frk|frks|frry|frst|frt|frway|frwy|fry|ft|fwy|garden|gardens|gardn|gateway|gatewy|gatway|gdn|gdns|glen|glens|gln|glns|grden|grdn|grdns|green|greens|grn|grns|grov|grove|groves|grv|grvs|gtway|gtwy|harb|harbor|harbors|harbr|haven|hbr|hbrs|heights|highway|highwy|hill|hills|hiway|hiwy|hl|hllw|hls|hollow|hollows|holw|holws|hrbor|ht|hts|hvn|hway|hwy|inlet|inlt|is|island|islands|isle|isles|islnd|islnds|iss|jct|jction|jctn|jctns|jcts|junction|junctions|junctn|juncton|key|keys|knl|knls|knol|knoll|knolls|ky|kys|lake|lakes|land|landing|lane|lck|lcks|ldg|ldge|lf|lgt|lgts|light|lights|lk|lks|ln|lndg|lndng|loaf|lock|locks|lodg|lodge|loop|loops|mall|manor|manors|mdw|mdws|meadow|meadows|medows|mews|mill|mills|mission|missn|ml|mls|mnr|mnrs|mnt|mntain|mntn|mntns|motorway|mount|mountain|mountains|mountin|msn|mssn|mt|mtin|mtn|mtns|mtwy|nck|neck|opas|orch|orchard|orchrd|oval|overpass|ovl|park|parks|parkway|parkways|parkwy|pass|passage|path|paths|pike|pikes|pine|pines|pkway|pkwy|pkwys|pky|pl|place|plain|plains|plaza|pln|plns|plz|plza|pne|pnes|point|points|port|ports|pr|prairie|prk|prr|prt|prts|psge|pt|pts|rad|radial|radiel|radl|ramp|ranch|ranches|rapid|rapids|rd|rdg|rdge|rdgs|rds|rest|ridge|ridges|riv|river|rivr|rnch|rnchs|road|roads|route|row|rpd|rpds|rst|rte|rue|run|rvr|shl|shls|shoal|shoals|shoar|shoars|shore|shores|shr|shrs|skwy|skyway|smt|spg|spgs|spng|spngs|spring|springs|sprng|sprngs|spur|spurs|sq|sqr|sqre|sqrs|sqs|squ|square|squares|st|sta|station|statn|stn|str|stra|strav|straven|stravenue|stravn|stream|street|streets|streme|strm|strt|strvn|strvnue|sts|sumit|sumitt|summit|ter|terr|terrace|throughway|tpke|trace|traces|track|tracks|trafficway|trail|trailer|trails|trak|trce|trfy|trk|trks|trl|trlr|trlrs|trls|trnpk|trwy|tunel|tunl|tunls|tunnel|tunnels|tunnl|turnpike|turnpk|un|underpass|union|unions|uns|upas|valley|valleys|vally|vdct|via|viadct|viaduct|view|views|vill|villag|village|villages|ville|villg|villiage|vis|vist|vista|vl|vlg|vlgs|vlly|vly|vlys|vst|vsta|vw|vws|walk|walks|wall|way|ways|well|wells|wl|wls|wy|xing|xrd|xrds
                ^!Jump 1
                :Loop
                ^!Find "\b(^%suff%)\b\.?\K\h+" RIS
                ^!IfError Done
                ^!SetView ^$Calc(^$GetRow$-3)$:1
                ^!Skip Comma here?
                ^!Goto Loop
                ^!InsertText ,
                ^!Goto Loop
                :Done
                ^!Set %suff%=


                If you're not discouraged by a number of false positives and having to click "No" many times, I'd suggest next to look for instances of digits (as in "Suite 245 Alpharetta,GA,30004") just before non-digit fragments that precede comma and state abbreviation. Basically, the same clip recycled from the ^!Jump line, with the search statement altered to:

                ^!Find "\d\K\h+(?=[^,0-9]+,[A-Z]{2},)" RS

                This is all under the assumption that the only separations between your address elements are spaces and there is nothing better for Regex to bite on...
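
For readers who want to try the same two-stage idea outside NoteTab, here is a minimal Python sketch. It is only an illustration, not the clip itself: the function names and the shortened suffix list are hypothetical, and `[ \t]+` stands in for the clip's `\h+`. The first function mirrors the state/ZIP replacement; the second only reports candidate positions after a street suffix, leaving the yes/no decision to the caller, much as the clip's prompt does.

```python
import re

# Hypothetical, heavily shortened suffix list; the clip uses the full USPS list.
SUFFIXES = ["street", "st", "avenue", "ave", "road", "rd",
            "drive", "dr", "blvd", "lane", "ln"]

# Whitespace, a two-letter state code, whitespace, then a 5-digit ZIP ahead.
STATE_ZIP = re.compile(r"[ \t]+([A-Z]{2})[ \t]+(?=\d{5}\b)")

# A street suffix (optionally followed by a period), then a whitespace run.
SUFFIX_GAP = re.compile(
    r"\b(?:%s)\b\.?([ \t]+)" % "|".join(SUFFIXES), re.IGNORECASE
)

def add_state_zip_commas(line):
    """Put commas on both sides of the state abbreviation."""
    return STATE_ZIP.sub(r",\1,", line)

def city_candidates(line):
    """Return offsets of whitespace runs that directly follow a street suffix."""
    return [m.start(1) for m in SUFFIX_GAP.finditer(line)]

def insert_comma(line, offset):
    """Replace the whitespace run starting at offset with a single comma."""
    m = re.compile(r"[ \t]+").match(line, offset)
    return line[:offset] + "," + line[m.end():]

line = add_state_zip_commas("125 N. Main Street City ST 11111")
for offset in reversed(city_candidates(line)):
    # A real tool would ask for confirmation here; this sketch accepts every hit.
    line = insert_comma(line, offset)
print(line)
```

Working through the candidates from right to left keeps the earlier offsets valid after each insertion. As with the clip, false positives are expected whenever a suffix word also appears inside a street or city name.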
              • flo.gehrke
                Message 7 of 25, Jul 1, 2011
                  --- In ntb-clips@yahoogroups.com, Mike Breiding - Morgantown WV <mike@...> wrote:
                  >
                  >
                  > Postal code:
                  > I need to replace the space in front of the first number with a comma
                  >
                  > State abbreviation:
                  > I need to replace the space in front of the first letter with a comma
                  >
                  > City :
                  > I need to replace the space in front of the first letter with a comma
                  >
                  >
                  > 125 N. Main Street City ST 11111
                  >
                  > Then I will paste into Excel

                  I looked it up in Wiki and found that the US Postal Code is a nine-digit ZIP code, separated with '-' between position #5 and #6, and there are two spaces between state code and ZIP code -- isn't it?

                  That is, in order to change a list of addresses like...

                  1500 E. Main Ave Springfield VA  22262-1010
                  1500 E. Main Ave Springfield VA  22262-1010
                  1500 E. Main Ave Springfield VA  22262-1010

                  to...

                  1500 E. Main Ave,Springfield,VA,22262-1010
                  1500 E. Main Ave,Springfield,VA,22262-1010
                  1500 E. Main Ave,Springfield,VA,22262-1010

                  you could try...

                  ^!Replace "(?x-i)\x20{1,2} ( ((?=\d{5}-\d{4}$)) | ((?=[[:upper:]]{2},\d{5}-\d{4}$)) | ((?=\w{1,},[[:upper:]]{2},\d{5}-\d{4}$)) )" >> "," WARS
                  ; End of long line
                  ^!IfError End
                  ^!Goto Skip_-2

                  It's written in Extended Mode to make the subpatterns more visible.

                  Maybe it's more readable if we split the long alternation into three command lines...

                  ; Match two spaces in front of ZIP code
                  ^!Replace "\x20{2}(?=\d{5}-\d{4}$)" >> "," WARS
                  ; Match one space in front of state code
                  ^!Replace "(?-i)\x20(?=[[:upper:]]{2},\d{5})" >> "," WARS
                  ; Match one space in front of city
                  ^!Replace "(?-i)\x20(?=\w{1,},[[:upper:]]{2},\d{5})" >> "," WARS
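
The three passes can also be tried outside NoteTab. The following is a rough Python equivalent, offered as a sketch only: the function name `commafy` is hypothetical, `[A-Z]` stands in for `[[:upper:]]`, and the first pass accepts one or two spaces so that it tolerates either spacing in front of the ZIP code.

```python
import re

def commafy(text):
    """Turn the spaces before ZIP+4, state code and a one-word city into commas."""
    # Pass 1: one or two spaces in front of a ZIP+4 code at the end of a line.
    text = re.sub(r" {1,2}(?=\d{5}-\d{4}$)", ",", text, flags=re.MULTILINE)
    # Pass 2: the space in front of the two-letter state code.
    text = re.sub(r" (?=[A-Z]{2},\d{5})", ",", text)
    # Pass 3: the space in front of the last word before the state code.
    text = re.sub(r" (?=\w+,[A-Z]{2},\d{5})", ",", text)
    return text

print(commafy("1500 E. Main Ave Springfield VA  22262-1010"))
```

Each pass depends on the comma written by the previous one, which is why the order matters. Note that pass 3 only looks at a single word, so a two-word city such as "Los Alamos" would be split after "Los" and still needs manual correction.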

                  Regards,
                  Flo
                • Mike Breiding - Morgantown WV
                  Message 8 of 25, Jul 1, 2011
                    On 7/1/2011 12:55 PM, diodeom wrote:
                    > Mike Breiding - Morgantown WV wrote:
                    > >
                    > > Postal code:
                    > > I need to replace the space in front of the first number with a comma
                    > >
                    > > State abbreviation:
                    > > I need to replace the space in front of the first letter with a comma
                    > >
                    > > City :
                    > > I need to replace the space in front of the first letter with a comma
                    > >
                    >
                    > After inserting commas around the state name abbreviations, e.g. with:
                    >
                    > ^!Replace "\h+([A-Z]{2})\h+(?=\d{5}\b)" >> ",$1," WARS
                    >
                    > you could probably automate a lot of fishing for city *candidates* by
                    > searching for street suffixes which commonly end the street address
                    > part. The clip could prompt you to approve (or skip) comma placement for
                    > each probable spot. The hefty list of suffixes for the simple search
                    > pattern is a raw list (I didn't bother streamlining) I grabbed from
                    > http://www.usps.com/ncsc/lookups/usps_abbreviations.html


                    WHEW! I formatted diodeom's clip into a single line with the required
                    space. But when I ran it, I got no response.

                    I just noticed once I ran John Shotsky's clip all the cities were
                    preceded by 2 spaces, along with a *few* others like "PO Box".
                    I manually fixed the latter and then just did a S&R from two spaces to a
                    comma.

                    I would still like to get this right as I have another list of addresses
                    I need to convert to .CSV
                    See: http://epicroadtrips.com/parks_list/
                    It was extracted from PDF and is a bit of a mess.
                    Thanks to all.
                    -Mike
                  • diodeom
                    Message 9 of 25, Jul 1, 2011
                      Mike Breiding - Morgantown WV wrote:
                      >
                      > WHEW!
                      >
                      > See: http://epicroadtrips.com/parks_list/
                      > It was extracted from PDF and is a bit of a mess.
                      >

                      A bit of a mess is a bit of an understatement. There is some consistency in this disorder, but it's incomparably easier for me to just send you the fixed file rather than attempt to package a multitude of swap routines in some foolproof way.
                    • Don
                      Message 10 of 25, Jul 1, 2011
                        > I would still like to get this right as I have another list of addresses
                        > I need to convert to .CSV
                        > See: http://epicroadtrips.com/parks_list/
                        > It was extracted from PDF and is a bit of a mess.
                        > Thanks to all.
                        > -Mike

                        Use a pdf extractor -- great program -- can preserve spacing when
                        extracting.
                      • Mike Breiding - Morgantown WV
                        Message 11 of 25, Jul 2, 2011
                          On 7/1/2011 10:40 PM, Don wrote:
                          >> I would still like to get this right as I have another list of addresses
                          >> I need to convert to .CSV
                          >> See: http://epicroadtrips.com/parks_list/
                          >> It was extracted from PDF and is a bit of a mess.
                          >> Thanks to all.
                          >> -Mike
                          >
                          > Use a pdf extractor -- great program -- can preserve spacing when
                          > extracting.

                          Hi Don,
                          I used Acro 9.0 to export to text and that is what I got.
                          Not sure if Acro 9 is sloppy or the original had "flaws".
                          -mb
                        • Mike Breiding - Morgantown WV
                          Message 12 of 25, Jul 2, 2011
                            On 7/1/2011 4:03 PM, diodeom wrote:

                            > A bit of a mess is a bit of an understatement. There is some consistency
                            > in this disorder, but it's incomparably easier for me to just send you
                            > the fixed file rather than attempt to package a multitude of swap
                            > routines in some foolproof way.


                            Here are the results!
                            http://epicroadtrips.com/parks_list/dio_NPS_Units.txt

                            Many thanks,
                            -Mike
                          • Axel Berger
                            Message 13 of 25, Jul 2, 2011
                              Don wrote:
                              > Use a pdf extractor -- great program -- can preserve spacing when
                              > extracting.

                              The problem is, there are no spaces in PDF, just letters and the places
                              where to put them. Since letters usually don't touch, any extractor
                              (including copying out of Acrobat Reader) has to guess whether there is
                              just the normal distance between adjacent letters or an extra space
                              there. Of course some programs, often using dictionaries and other help,
                              are better at guessing than others.
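
To make the guessing concrete, here is a toy Python sketch of a gap heuristic. Everything in it is an assumption for illustration: the `(char, x_start, width)` tuples, the function name and the `gap_ratio` threshold are hypothetical and do not describe how Acrobat or any particular extractor works.

```python
def join_glyphs(glyphs, gap_ratio=0.3):
    """Rebuild text from positioned glyphs, guessing where the spaces were.

    glyphs: list of (char, x_start, width) tuples in reading order
            (a hypothetical input format chosen for this sketch).
    A space is guessed whenever the gap to the previous glyph is wider
    than gap_ratio times the width of the current glyph.
    """
    out = []
    prev_end = None
    for char, x_start, width in glyphs:
        if prev_end is not None and x_start - prev_end > gap_ratio * width:
            out.append(" ")
        out.append(char)
        prev_end = x_start + width
    return "".join(out)

sample = [("B", 0, 6), ("i", 6.5, 3), ("g", 10, 6), ("B", 19, 6), ("e", 25.5, 6)]
print(join_glyphs(sample))
```

With a threshold set too low, normal letter spacing turns into spurious spaces; set too high, real word gaps vanish. That trade-off is one reason different extractors give different results on the same PDF.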

                              Axel
                            • Art Kocsis
                              Message 14 of 25, Jul 2, 2011
                                At 07/01/2011 06:06, Mike Breiding wrote:
                                > Postal code: I need to replace the space in front of the first number with a comma
                                > State abbreviation: I need to replace the space in front of the first letter with a comma
                                > City : I need to replace the space in front of the first letter with a comma

                                At 07/01/2011 11:21, Mike Breiding wrote:
                                >I would still like to get this right as I have another list of addresses
                                >See: http://epicroadtrips.com/parks_list/
                                >It was extracted from PDF and is a bit of a mess.

                                Mike,

                                Whenever I have a one-off project such as this that does not lend itself to
                                a simple clip I find it is much more efficient to just manually convert the
                                file to a fixed field format by making use of NTB's block edit functions.
                                Your parks_list file would only take about 5-10 minutes to convert. (In
                                fact, writing this response took much longer than it would have taken to
                                convert the entire file!<g>)

                                In essence:

                                - First, I convert (via simple clips) all tabs to spaces and
                                all \R to \r\n (RegEx replace statement).
                                - Next, starting from the left, I insert (via S&R) a large number
                                of spaces in front of each desired field.
                                - Next, I find the longest entry in the leftmost field, set the cursor
                                a few spaces beyond that column position on the last row, and Shift-Click
                                at the beginning of the first row.
                                - Then right-click and select "Cut Block", then "Left Align", move the cursor
                                to the start of line 1, and right-click "Paste Block". (The Left Align
                                is a little trick to quickly align the next field for the entire file,
                                as it is already selected.)
                                - Repeat for the next leftmost variable-length field.

                                After all fields have been converted, it is now simple to insert any field
                                separator desired, re-arrange, extract or delete fields, etc. In your case,
                                change all "| " to ", " and delete all multiple spaces.
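
For comparison, the same fixed-field idea can be sketched in a few lines of Python, assuming the records are already one per line and pipe-delimited as in the parks list. The function names are hypothetical and this is not a replacement for the block-edit workflow, just an illustration of what the alignment and the final separator swap accomplish.

```python
import csv
import io

def to_fixed_width(lines, sep="|", pad=2):
    """Pad every field to the width of its longest entry plus `pad` spaces."""
    rows = [line.split(sep) for line in lines]
    ncols = max(len(row) for row in rows)
    widths = [
        max(len(row[i]) for row in rows if i < len(row)) + pad
        for i in range(ncols)
    ]
    return [
        "".join(cell.ljust(widths[i]) for i, cell in enumerate(row)).rstrip()
        for row in rows
    ]

def to_csv(lines, sep="|"):
    """Rewrite separator-delimited lines as CSV, quoting fields where needed."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for line in lines:
        writer.writerow(field.strip() for field in line.split(sep))
    return buf.getvalue()

records = [
    "BANDELIER|NM|Jason Lott|15 Entrance Road|Los Alamos|NM|87544-9508|505-672-3861|505-672-9714",
    "BIG BEND|NP|Bill Wellman|P.O. Box 129|Big Bend|TX|79834-0129|432-477-2251|432-477-1172",
]
for row in to_fixed_width(records):
    print(row)
print(to_csv(records), end="")
```

Using the `csv` module instead of a plain text replacement means that any field which itself contains a comma is quoted automatically, which matters once the result is opened in Excel.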

                                For small numbers of lines, instead of the block cut & paste, I set the
                                cursor just beyond the longest entry but on the first line and use a simple
                                clip to do multiple CTRL+DELETE, DOWN-ARROWs (about 5 or 10 per invocation).

                                ^!Keyboard CTRL+DELETE
                                ^!Keyboard Down
                                ^!Keyboard CTRL+DELETE
                                ^!Keyboard Down
                                ^!Keyboard CTRL+DELETE
                                ^!Keyboard Down
                                etc

                                I put all three clips (tabs, new lines, Del&Down) on my toolbar for one
                                click operation and include the right click functions in my Shortcut Menu
                                (View | Options | Shortcut Menu)

                                An example in gory detail follows:
                                (Alternatively, since there are a lot of long lines, download the example
                                file from:
                                http://www.box.net/shared/998tzn0qhdi23hbvgtku)

                                Comments on the RegEx pattern used:
                                Even though the \K removes the preceding pattern from the match,
                                the "(...)" still counts as a subpattern, which is why the
                                replacement refers to $2.
                                The leading \R is a workaround for the mishandling of "^" and
                                requires an initial empty line.
                                It would REALLY be nice if the "^" operator worked as per spec
                                instead of being ignored by NTB! Then ^(.*?\|){4}\K(.*?$) would
                                be the preferred pattern and not require an initial empty line.


                                Sample extract from http://epicroadtrips.com/parks_list/ (long lines)
                                --------------------------------------------------------
                                BANDELIER|NM|Jason Lott|15 Entrance Road|Los Alamos|NM|87544-9508|505-672-3861|505-672-9714
                                BIG BEND|NP|Bill Wellman|P.O. Box 129|Big Bend|TX|79834-0129|432-477-2251|432-477-1172
                                BIG HOLE|NB|Stephen Black|P.O. Box 237; 16425 Highway 43 West|Wisdom|MT|59761-0137|406-689-3155|406-689-3151
                                BIG SOUTH FORK|NR&RA|Stennis Young|4564 Leatherwood Road|Oneida|TN|37841|423-569-9778|423-569-5505
                                BLUE RIDGE PARKWAY|PW|Phil Francis|199 Hemphill Knob Road|Asheville|NC|28803|828-271-4779|828-271-4117
                                HERBERT HOOVER|NHS|Cheryl Schreier|P.O. Box 607; 110 Parkside Drive|West Branch|IA|52358-0607|319-643-2541|319-643-5367

                                After spaces insertion before address field (after 3 "|"s)
                                using: "\R(.*?\|){3}\K(.*?$)" >> " $2"
                                BANDELIER|NM|Jason Lott| 15 Entrance Road|Los Alamos|NM|87544-9508|505-672-3861|505-672-9714
                                BIG BEND|NP|Bill Wellman| P.O. Box 129|Big Bend|TX|79834-0129|432-477-2251|432-477-1172
                                BIG HOLE|NB|Stephen Black| P.O. Box 237; 16425 Highway 43 West|Wisdom|MT|59761-0137|406-689-3155|406-689-3151
                                BIG SOUTH FORK|NR&RA|Stennis Young| 4564 Leatherwood Road|Oneida|TN|37841|423-569-9778|423-569-5505
                                BLUE RIDGE PARKWAY|PW|Phil Francis| 199 Hemphill Knob Road|Asheville|NC|28803|828-271-4779|828-271-4117
                                HERBERT HOOVER|NHS|Cheryl Schreier| P.O. Box 607; 110 Parkside Drive|West Branch|IA|52358-0607|319-643-2541|319-643-5367

                                After Cut Block, Left Align & Paste Block:
                                BANDELIER|NM|Jason Lott| 15 Entrance Road|Los Alamos|NM|87544-9508|505-672-3861|505-672-9714
                                BIG BEND|NP|Bill Wellman| P.O. Box 129|Big Bend|TX|79834-0129|432-477-2251|432-477-1172
                                BIG HOLE|NB|Stephen Black| P.O. Box 237; 16425 Highway 43 West|Wisdom|MT|59761-0137|406-689-3155|406-689-3151
                                BIG SOUTH FORK|NR&RA|Stennis Young| 4564 Leatherwood Road|Oneida|TN|37841|423-569-9778|423-569-5505
                                BLUE RIDGE PARKWAY|PW|Phil Francis| 199 Hemphill Knob Road|Asheville|NC|28803|828-271-4779|828-271-4117
                                HERBERT HOOVER|NHS|Cheryl Schreier| P.O. Box 607; 110 Parkside Drive|West Branch|IA|52358-0607|319-643-2541|319-643-5367


                                Repeating for spaces insertion before city field (after 4 "|"s)
                                using: "\R(.*?\|){4}\K(.*?$)" >> " $2"
                                BANDELIER|NM|Jason Lott| 15 Entrance Road| Los Alamos|NM|87544-9508|505-672-3861|505-672-9714
                                BIG BEND|NP|Bill Wellman| P.O. Box 129| Big Bend|TX|79834-0129|432-477-2251|432-477-1172
                                BIG HOLE|NB|Stephen Black| P.O. Box 237; 16425 Highway 43 West| Wisdom|MT|59761-0137|406-689-3155|406-689-3151
                                BIG SOUTH FORK|NR&RA|Stennis Young| 4564 Leatherwood Road| Oneida|TN|37841|423-569-9778|423-569-5505
                                BLUE RIDGE PARKWAY|PW|Phil Francis| 199 Hemphill Knob Road| Asheville|NC|28803|828-271-4779|828-271-4117
                                HERBERT HOOVER|NHS|Cheryl Schreier| P.O. Box 607; 110 Parkside Drive| West Branch|IA|52358-0607|319-643-2541|319-643-5367

                                After Cut Block, Left Align & Paste Block:
                                BANDELIER|NM|Jason Lott| 15 Entrance Road| Los Alamos|NM|87544-9508|505-672-3861|505-672-9714
                                BIG BEND|NP|Bill Wellman| P.O. Box 129| Big Bend|TX|79834-0129|432-477-2251|432-477-1172
                                BIG HOLE|NB|Stephen Black| P.O. Box 237; 16425 Highway 43 West| Wisdom|MT|59761-0137|406-689-3155|406-689-3151
                                BIG SOUTH FORK|NR&RA|Stennis Young| 4564 Leatherwood Road| Oneida|TN|37841|423-569-9778|423-569-5505
                                BLUE RIDGE PARKWAY|PW|Phil Francis| 199 Hemphill Knob Road| Asheville|NC|28803|828-271-4779|828-271-4117
                                HERBERT HOOVER|NHS|Cheryl Schreier| P.O. Box 607; 110 Parkside Drive| West Branch|IA|52358-0607|319-643-2541|319-643-5367
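
For comparison outside NoteTab, here is a minimal Python sketch of the same "insert a space after the n-th |" step. It is not the clip described above: Python's re module has no \K, so a capture group is re-emitted instead, and re.MULTILINE makes "^" match at every line start, so no initial empty line is needed. The name pad_after_nth_pipe and its pad parameter are made up for illustration.

```python
import re


def pad_after_nth_pipe(text: str, n: int, pad: str = " ") -> str:
    """Insert `pad` right after the n-th "|" of every line (hypothetical helper)."""
    # [^|\n]* keeps each repetition inside a single field on a single line;
    # lines with fewer than n separators simply do not match and stay as-is.
    pattern = re.compile(r"^((?:[^|\n]*\|){%d})" % n, flags=re.MULTILINE)
    return pattern.sub(lambda m: m.group(1) + pad, text)


sample = (
    "BANDELIER|NM|Jason Lott|15 Entrance Road|Los Alamos|NM|87544-9508\n"
    "BIG BEND|NP|Bill Wellman|P.O. Box 129|Big Bend|TX|79834-0129"
)
# Pad before the address field (after 3 separators), then before the city (after 4).
step1 = pad_after_nth_pipe(sample, 3)
step2 = pad_after_nth_pipe(step1, 4)
print(step2)
```

Using a larger pad (say, 40 spaces) reproduces the "large number of spaces" that the block cut and paste relies on.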
                              • Mike Breiding - Morgantown WV
                                Message 15 of 25 , Jul 3, 2011
                                  Interesting.
                                  Your thorough explanation has caused me to look at this more carefully.
                                  I will give your method a try and see what I end up with.
                                  I am sure I will have more questions.

                                  Thanks for taking the time to write this up.
                                  -Mike
                                  ==========================

                                  On 7/2/2011 9:40 PM, Art Kocsis wrote:
                                  > At 07/01/2011 06:06, Mike Breiding wrote:
                                  > > Postal code: I need to replace the space in front of the first number with a comma
                                  > > State abbreviation: I need to replace the space in front of the first letter with a comma
                                  > > City : I need to replace the space in front of the first letter with a comma
                                  >
                                  > At 07/01/2011 11:21, Mike Breiding wrote:
                                  > >I would still like to get this right as I have another list of addresses
                                  > >See: http://epicroadtrips.com/parks_list/
                                  > >It was extracted from PDF and is a bit of a mess.
                                  >
                                  > Mike,
                                  >
                                  > Whenever I have a one-off project such as this that does not lend itself to
                                  > a simple clip I find it is much more efficient to just manually convert the
                                  > file to a fixed field format by making use of NTB's block edit functions.
                                  > Your parks_list file would only take about 5-10 minutes to convert. (In
                                  > fact, writing this response took much longer than it would have taken to
                                  > convert the entire file!<g>)
                                  >
                                  > In essence:
                                  >
                                  > - First I convert (via simple clips) all tabs to spaces and
                                  > all \R to \r\n (RegEx replace statement).
                                  > - Next, starting from the left, I insert (via S&R) a large number
                                  > of spaces in front of each desired field.
                                  > - Next, I find the longest entry in the left-most field, set the cursor
                                  > a few spaces beyond that column position on the last row, and Shift-click
                                  > at the beginning of the first row.
                                  > - Then right-click and select "Cut Block", then "Left Align", move the cursor
                                  > to the start of line 1, and right-click "Paste Block". (The Left Align
                                  > is a little trick to quickly align the next field for the entire file,
                                  > as it is already selected.)
                                  > - Repeat for the next left-most variable-length field.
                                  >
                                  > After all fields have been converted, it is now simple to insert any field
                                  > separator desired, re-arrange, extract or delete fields, etc. In your case,
                                  > change all "| " to ", " and delete all multiple spaces.
                                  >
                                  > For small numbers of lines, instead of the block cut & paste, I set the
                                  > cursor just beyond the longest entry but on the first line and use a simple
                                  > clip to do multiple CTRL+DELETE, DOWN-ARROW pairs (about 5 or 10 per invocation).
                                  >
                                  > ^!Keyboard CTRL+DELETE
                                  > ^!Keyboard Down
                                  > ^!Keyboard CTRL+DELETE
                                  > ^!Keyboard Down
                                  > ^!Keyboard CTRL+DELETE
                                  > ^!Keyboard Down
                                  > etc
                                  >
                                  > I put all three clips (tabs, new lines, Del&Down) on my toolbar for one
                                  > click operation and include the right click functions in my Shortcut Menu
                                  > (View | Options | Shortcut Menu)
                                  >
                                  > An example in gory detail follows:
                                  > (Alternatively, since there are a lot of long lines, download the example
                                  > file from:
                                  > http://www.box.net/shared/998tzn0qhdi23hbvgtku)

                                  <SNIP>
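
The fixed-field conversion described in the quoted steps above is done by hand with block editing, but the end result can be sketched in a few lines of Python. This is only an illustration, assuming the fields are already delimited by "|"; align_fields and its gap parameter are hypothetical names, not anything provided by NoteTab.

```python
def align_fields(lines, sep="|", gap=2):
    """Return the lines with every field padded to a common column width."""
    # Split each non-empty line into trimmed fields.
    rows = [[field.strip() for field in line.split(sep)]
            for line in lines if line.strip()]
    if not rows:
        return []
    # Find the longest entry in each column, as in the manual method.
    ncols = max(len(row) for row in rows)
    widths = [0] * ncols
    for row in rows:
        for i, field in enumerate(row):
            widths[i] = max(widths[i], len(field))
    # Pad every field to its column width plus a small gap.
    return ["".join(field.ljust(widths[i] + gap)
                    for i, field in enumerate(row)).rstrip()
            for row in rows]


records = [
    "BIG BEND|NP|Bill Wellman|P.O. Box 129|Big Bend|TX|79834-0129",
    "BANDELIER|NM|Jason Lott|15 Entrance Road|Los Alamos|NM|87544-9508",
]
for line in align_fields(records):
    print(line)
```

Once the columns line up, inserting any separator, or rearranging and deleting fields, becomes a simple column operation.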
                                • Don
                                  Message 16 of 25 , Jul 3, 2011
                                    On 7/2/2011 10:22 AM, Axel Berger wrote:
                                    > Don wrote:
                                    >> Use a pdf extractor -- great program -- can preserve spacing when
                                    >> extracting.
                                    >
                                    > The problem is, there are no spaces in PDF, just letters and places
                                    > where to put them. So, as letters usually don't touch, any extractor,
                                    > including copying out of Acrobat Reader, has to guess whether there is
                                    > just the normal distance between adjacent letters or an extra space
                                    > there. Of course some programs, often using dictionaries and other help,
                                    > are better at guessing than others.
                                    >
                                    > Axel

                                    And A-PDF Extractor thus uses positioning to solve that. Give it a try.
                                    I do some pretty complex extractions. This should probably go to the
                                    off-topic list if it continues, so I'll copy there, but it will preserve the
                                    "non-existent" spacing by using the relative positions of the content.
                                    It then has some positioning metadata, which I clean with a simple clip.
                                    I like it a lot and I use it "back and forth" with NoteTab to extract
                                    content.
                                  • John Shotsky
                                    Message 17 of 25 , Jul 3, 2011
                                      Don, thinking I might be able to use A-PDF Extractor, I tried it out. If you have any hyphens or dashes in your source
                                      document, they are all converted to the word 'minus' in the output text. While that 'could' be fixed with a big global
                                      F/R, it would then also remove any valid cases of the word 'minus' at the same time. I did mention it to their tech
                                      support.

                                      Here are a couple of extracted text lines:
                                      Cover and cook on high for 30minus45 minutes
                                      1 scallion minusminus thinly sliced
                                      10 green onions, cut into 3minusinch pieces

                                      I tried all of the various output methods, but this is not affected by any user options. This test was done on a 112
                                      page document with hundreds to thousands of hyphens in the pdf text. Not ONE hyphen remained in the output.
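
A narrower repair than a global find and replace is possible for the three samples above. The Python sketch below is a heuristic, not a fix for the extractor: it assumes that "minus" glued directly to a preceding digit, or doubled as a standalone token, was originally a hyphen, and it leaves a free-standing word "minus" alone. The function name restore_hyphens is made up for illustration.

```python
import re


def restore_hyphens(text: str) -> str:
    """Undo the 'minus' substitution in the cases that are unambiguous."""
    # A standalone "minusminus" token was most likely a double hyphen.
    text = re.sub(r"\bminusminus\b", "--", text)
    # "minus" attached directly to a preceding digit was most likely a hyphen.
    text = re.sub(r"(?<=\d)minus", "-", text)
    return text


for line in (
    "Cover and cook on high for 30minus45 minutes",
    "1 scallion minusminus thinly sliced",
    "10 green onions, cut into 3minusinch pieces",
    "Five minus two is three",
):
    print(restore_hyphens(line))
```

A hyphen that originally sat between two letters cannot be recovered this way, since the result is indistinguishable from an ordinary word without a dictionary check.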

                                      Regards,
                                      John
                                      -----Original Message-----
                                      From: ntb-clips@yahoogroups.com [mailto:ntb-clips@yahoogroups.com] On Behalf Of Don
                                      Sent: Friday, July 01, 2011 19:41
                                      To: ntb-clips@yahoogroups.com
                                      Subject: Re: [Clip] Re: add commas in address string

                                      > I would still like to get this right as I have another list of
                                      > addresses I need to convert to .CSV
                                      > See: http://epicroadtrips.com/parks_list/
                                      > It was extracted from PDF and is a bit of a mess.
                                      > Thanks to all.
                                      > -Mike

                                      Use a pdf extractor -- great program -- can preserve spacing when extracting.


                                      ------------------------------------

                                      Fookes Software: http://www.fookes.com/
                                      NoteTab website: http://www.notetab.com/ NoteTab Discussion Lists: http://www.notetab.com/groups.php

                                      ***
                                      Yahoo! Groups Links
                                    • Dave
                                      Message 18 of 25 , Jul 4, 2011
                                        Hi,
                                        Have you tried the free version of PDF http://www.tracker-software.com/ ?
                                        Thank you,
                                        Dave

                                        ----- Original Message -----
                                        From: "John Shotsky" <jshotsky@...>
                                        To: <ntb-clips@yahoogroups.com>
                                        Sent: Monday, July 04, 2011 3:06 AM
                                        Subject: RE: [Clip] Re: add commas in address string


                                        <SNIP>
                                      • Eb
                                        Message 19 of 25 , Jul 4, 2011
                                          Hi Flo,

                                          As a bit of trivia: while Wikipedia is correct and we do indeed have nine digits in our postal code, hardly anyone knows what the last four digits are (except bulk mailers, who are required to use all nine digits in order to get the so-called "presorted-by-zip-code" discounts in mailing fees).

                                          The five-digit-only use will most likely remain widespread until those of us who learned the original five-digit system in the 60's (?) are long gone (or (whisper) until the post office starts to charge extra for not using all nine digits, or worse, does not deliver mail without it).


                                          Result: if one is to write software or clips to deal with zip codes (US Postal Codes), one will have to deal with both possibilities.
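
As a rough illustration of handling both forms, here is a Python sketch of the three-step lookahead approach from Flo's message quoted below, with the ZIP part made to accept either five digits or ZIP+4. It is an adaptation, not Flo's clip: the names ZIP and add_commas are invented here, one or more spaces are accepted before each field, and the city is still assumed to be a single word, so a city such as "Los Alamos" would be split incorrectly.

```python
import re

# Five digits, optionally followed by "-" and four more digits (ZIP+4).
ZIP = r"\d{5}(?:-\d{4})?"


def add_commas(line: str) -> str:
    """Turn 'street city ST zip' into 'street,city,ST,zip' (single-word city)."""
    # Step 1: space(s) in front of the ZIP code at the end of the line.
    line = re.sub(rf"[ ]+(?={ZIP}$)", ",", line)
    # Step 2: space(s) in front of the two-letter upper-case state code.
    line = re.sub(rf"[ ]+(?=[A-Z]{{2}},{ZIP}$)", ",", line)
    # Step 3: space(s) in front of the city, assumed to be one word.
    line = re.sub(rf"[ ]+(?=\w+,[A-Z]{{2}},{ZIP}$)", ",", line)
    return line


print(add_commas("1500 E. Main Ave Springfield VA  22262-1010"))
print(add_commas("125 N. Main Street City ST 11111"))
```

Each step anchors on the fields already converted to its right, which is why the order of the three replacements matters.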



                                          Cheers


                                          Eb



                                          --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
                                          >
                                          > --- In ntb-clips@yahoogroups.com, Mike Breiding - Morgantown WV <mike@> wrote:
                                          > >
                                          > >
                                          > > Postal code:
                                          > > I need to replace the space in front of the first number with a comma
                                          > >
                                          > > State abbreviation:
                                          > > I need to replace the space in front of the first letter with a comma
                                          > >
                                          > > City :
                                          > > I need to replace the space in front of the first letter with a comma
                                          > >
                                          > >
                                          > > 125 N. Main Street City ST 11111
                                          > >
                                          > > Then I will paste into Excel
                                          >
                                          > I looked it up in Wiki and found that the US Postal Code is a nine-digit ZIP code, separated with '-' between position #5 and #6, and there are two spaces between state code and ZIP code -- isn't it?
                                          >
                                          > That is, in order to change a list of addresses like...
                                          >
                                          > 1500 E. Main Ave Springfield VA  22262-1010
                                          > 1500 E. Main Ave Springfield VA  22262-1010
                                          > 1500 E. Main Ave Springfield VA  22262-1010
                                          >
                                          > to...
                                          >
                                          > 1500 E. Main Ave,Springfield,VA,22262-1010
                                          > 1500 E. Main Ave,Springfield,VA,22262-1010
                                          > 1500 E. Main Ave,Springfield,VA,22262-1010
                                          >
                                          > you could try...
                                          >
                                          > ^!Replace "(?x-i)\x20{1,2} ( ((?=\d{5}-\d{4}$)) | ((?=[[:upper:]]{2},\d{5}-\d{4}$)) | ((?=\w{1,},[[:upper:]]{2},\d{5}-\d{4}$)) )" >> "," WARS
                                          > ; End of long line
                                          > ^!IfError End
                                          > ^!Goto Skip_-2
                                          >
                                          > It's written in Extended Mode to make the subpatterns more visible.
                                          >
                                          > Maybe it's more readable if we split the long alternation into three command lines...
                                          >
                                          > ; Match two spaces in front of ZIP code
                                          > ^!Replace "\x20{2}(?=\d{5}-\d{4}$)" >> "," WARS
                                          > ; Match one space in front of state code
                                          > ^!Replace "(?-i)\x20(?=[[:upper:]]{2},\d{5})" >> "," WARS
                                          > ; Match one space in front of city
                                          > ^!Replace "(?-i)\x20(?=\w{1,},[[:upper:]]{2},\d{5})" >> "," WARS
                                          >
                                          > Regards,
                                          > Flo
                                          >
                                        • John Shotsky
                                          Message 20 of 25 , Jul 4, 2011
                                            As far as I can see, the free version can't export text. The only save option is pdf, and the only export option is
                                            'image'. Removed.



                                            Regards,

                                            John



                                            From: ntb-clips@yahoogroups.com [mailto:ntb-clips@yahoogroups.com] On Behalf Of Dave
                                            Sent: Monday, July 04, 2011 06:29
                                            To: ntb-clips@yahoogroups.com
                                            Subject: Re: [Clip] Re: add commas in address string





                                            Hi
                                            have you tried the free version of PDF http://www.tracker-software.com/
                                            ???
                                            THANKYOU DAVE

                                            ----- Original Message -----
                                            From: "John Shotsky" <jshotsky@...>
                                            To: <ntb-clips@yahoogroups.com>
                                            Sent: Monday, July 04, 2011 3:06 AM
                                            Subject: RE: [Clip] Re: add commas in address string

                                            <SNIP>
                                          • Axel Berger
                                            Message 21 of 25 , Jul 4, 2011
                                              John Shotsky wrote:
                                              > Don, thinking I might be able to use A-PDF Extractor, I tried it out.

                                              I always use the command line tools from XPDF
                                              http://www.foolabs.com/xpdf/home.html

                                              They do all they are meant to, and quite reliably. I mostly notice lost
                                              spaces when copying and pasting from the Acrobat reader and can't say
                                              whether pdftotext is less prone to them. In principle it is rather difficult,
                                              as said, and forces programs to guess. Some will be better at guessing
                                              than others.
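
For anyone who wants to script the XPDF route, here is a minimal sketch of calling pdftotext from Python. It assumes pdftotext is installed and on the PATH; the -layout switch asks it to keep the physical layout of the page as far as it can, which is what matters for column data. The file names and the two helper functions are made up for illustration.

```python
import subprocess


def pdftotext_command(pdf_path: str, txt_path: str) -> list[str]:
    """Build the pdftotext call; -layout tries to preserve column positions."""
    return ["pdftotext", "-layout", pdf_path, txt_path]


def extract_text(pdf_path: str, txt_path: str) -> None:
    """Run pdftotext and raise CalledProcessError if it fails."""
    subprocess.run(pdftotext_command(pdf_path, txt_path), check=True)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(pdftotext_command("parks_list.pdf", "parks_list.txt")))
```

Whether the spacing survives well enough still depends on how the PDF was produced, so the output needs checking before it goes through any regex.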

                                              Axel
                                            • Dave
                                              Message 22 of 25 , Jul 5, 2011
                                                Hi
                                                Can't you just copy and paste any amount of text you want?
                                                Thank you, Dave M

                                                ----- Original Message -----
                                                From: "John Shotsky" <jshotsky@...>
                                                To: <ntb-clips@yahoogroups.com>
                                                Sent: Tuesday, July 05, 2011 6:20 AM
                                                Subject: RE: [Clip] Re: add commas in address string


                                                > As far as I can see, the free version can't export text. The only save
                                                > option is pdf, and the only export option is
                                                > 'image'. Removed.
                                                >
                                                >
                                                >
                                                > Regards,
                                                >
                                                > John
                                              • John Shotsky
                                                Message 23 of 25 , Jul 5, 2011
                                                  That would only be 112 copies and pastes for one of my documents. Just a little easier to 'Save As' 'Text'. Besides,
                                                  this thread was about pdf to text exporting that retained column information. Most PDF tools can actually save text, but
                                                  most of them don't do it very well. A-PDF would be perfect, except for its glaring text error of spelling out the word
                                                  'minus' instead of producing a '-'. Their feedback is that they don't have an earlier version, and they don't have an
                                                  update. In other words minusminus they aren't going to fix it any time soon.
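
Until that is fixed, a narrower cleanup than a global find/replace is possible. The sketch below is only a heuristic, written in Python rather than as a clip: it assumes the bad 'minus' is always glued to a preceding digit (30minus45, 3minusinch) or appears doubled as a separate token (minusminus), and it leaves a free-standing word 'minus' alone. The function name is made up, and hyphens that originally sat between two letters are not recovered.

```python
import re


def restore_hyphens(text: str) -> str:
    """Undo 'minus' substitutions that are glued to digits or doubled."""
    # A stand-alone "minusminus" between spaces was most likely "--".
    text = re.sub(r"(?<= )minusminus(?= )", "--", text)
    # "minus" directly after a digit and directly before a word character
    # was most likely a hyphen: 30minus45 -> 30-45, 3minusinch -> 3-inch.
    text = re.sub(r"(?<=\d)minus(?=\w)", "-", text)
    return text


if __name__ == "__main__":
    for sample in (
        "Cover and cook on high for 30minus45 minutes",
        "1 scallion minusminus thinly sliced",
        "10 green onions, cut into 3minusinch pieces",
        "five minus two",
    ):
        print(restore_hyphens(sample))
```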



                                                  Regards,

                                                  John





                                                • John Shotsky
                                                  Message 24 of 25 , Jul 5, 2011
                                                    I, for one, always use all 9 digits. (And I'm from before there were even 5 digits!) The first five get you to an area,
                                                    the last 4 get to your house, apt, etc. If the 9 digits are present, your mail can be delivered even without a full
                                                    address present.



                                                    Regards,

                                                    John



                                                    From: ntb-clips@yahoogroups.com [mailto:ntb-clips@yahoogroups.com] On Behalf Of Eb
                                                    Sent: Monday, July 04, 2011 12:55
                                                    To: ntb-clips@yahoogroups.com
                                                    Subject: [Clip] Re: add commas in address string





                                                    Hi Flo,

                                                    As a bit of trivia: while Wikipedia is correct that we do indeed have nine digits in our postal code, hardly anyone knows
                                                    what the last four digits are (except bulk mailers, who are required to use all nine digits in order to get the
                                                    so-called "presorted-by-zip-code" discounts in mailing fees).

                                                    The five-digit-only form will most likely remain in widespread use until those of us who learned the original five-digit
                                                    system in the 60's (?) are long gone (or (whisper) until the post office starts to charge extra for not using
                                                    all nine digits, or worse, does not deliver mail without them).

                                                    Result: if one is to write software or clips to deal with zip codes (US Postal Codes), one will have to deal with both
                                                    possibilities.
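
As a rough illustration of handling both forms, here is a small Python sketch (not a NoteTab clip). It assumes a ZIP code is written either as five digits or as five digits, a hyphen, and four digits; the names ZIP_RE and is_zip are made up for the example.

```python
import re

# Assumption: "12345" or "12345-6789" are the only accepted spellings.
ZIP_RE = re.compile(r"\d{5}(?:-\d{4})?")


def is_zip(text: str) -> bool:
    """Return True if text is a 5-digit ZIP or a ZIP+4 code."""
    return ZIP_RE.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ("11111", "22262-1010", "2226", "22262-10"):
        print(candidate, is_zip(candidate))
```

The optional group (?:-\d{4})? is the only difference from a plain five-digit pattern, so the same fragment can be dropped into a longer address regex.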

                                                    Cheers

                                                    Eb

                                                    --- In ntb-clips@yahoogroups.com, "flo.gehrke" <flo.gehrke@...> wrote:
                                                    >
                                                    > --- In ntb-clips@yahoogroups.com, Mike Breiding - Morgantown WV <mike@> wrote:
                                                    > >
                                                    > >
                                                    > > Postal code:
                                                    > > I need to replace the space in front of the first number with a comma
                                                    > >
                                                    > > State abbreviation:
                                                    > > I need to replace the space in front of the first letter with a comma
                                                    > >
                                                    > > City :
                                                    > > I need to replace the space in front of the first letter with a comma
                                                    > >
                                                    > >
                                                    > > 125 N. Main Street City ST 11111
                                                    > >
                                                    > > Then I will paste into Excel
                                                    >
                                                    > I looked it up in Wiki and found that the US Postal Code is a nine-digit ZIP code, separated with '-' between position
                                                    > #5 and #6, and there are two spaces between state code and ZIP code -- isn't it?
                                                    >
                                                    > That is, in order to change a list of addresses like...
                                                    >
                                                    > 1500 E. Main Ave Springfield VA 22262-1010
                                                    > 1500 E. Main Ave Springfield VA 22262-1010
                                                    > 1500 E. Main Ave Springfield VA 22262-1010
                                                    >
                                                    > to...
                                                    >
                                                    > 1500 E. Main Ave,Springfield,VA,22262-1010
                                                    > 1500 E. Main Ave,Springfield,VA,22262-1010
                                                    > 1500 E. Main Ave,Springfield,VA,22262-1010
                                                    >
                                                    > you could try...
                                                    >
                                                    > ^!Replace "(?x-i)\x20{1,2} ( ((?=\d{5}-\d{4}$)) | ((?=[[:upper:]]{2},\d{5}-\d{4}$)) | ((?=\w{1,},[[:upper:]]{2},\d{5}-\d{4}$)) )" >> "," WARS
                                                    > ; End of long line
                                                    > ^!IfError End
                                                    > ^!Goto Skip_-2
                                                    >
                                                    > It's written in Extended Mode to make the subpatterns more visible.
                                                    >
                                                    > Maybe it's more readable if we split the long alternation into three command lines...
                                                    >
                                                    > ; Match two spaces in front of ZIP code
                                                    > ^!Replace "\x20{2}(?=\d{5}-\d{4}$)" >> "," WARS
                                                    > ; Match one space in front of state code
                                                    > ^!Replace "(?-i)\x20(?=[[:upper:]]{2},\d{5})" >> "," WARS
                                                    > ; Match one space in front of city
                                                    > ^!Replace "(?-i)\x20(?=\w{1,},[[:upper:]]{2},\d{5})" >> "," WARS
                                                    >
                                                    > Regards,
                                                    > Flo
                                                    >
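
For comparison, the three-step idea quoted above can be sketched in Python. This is an adaptation, not Flo's clip: it accepts both five-digit and ZIP+4 codes and one or two spaces before the ZIP, and the function name add_commas is made up. Like the clip, it assumes a single-word city and a two-letter uppercase state code.

```python
import re

# Assumption: ZIP is either 5 digits or ZIP+4.
ZIP = r"\d{5}(?:-\d{4})?"


def add_commas(line: str) -> str:
    """Insert commas before city, state and ZIP at the end of an address line."""
    # Step 1: one or two spaces in front of the ZIP code.
    line = re.sub(rf"[ ]{{1,2}}(?={ZIP}$)", ",", line)
    # Step 2: the space in front of the two-letter state code.
    line = re.sub(rf"[ ](?=[A-Z]{{2}},{ZIP}$)", ",", line)
    # Step 3: the space in front of the last word before the state (the city).
    line = re.sub(rf"[ ](?=\w+,[A-Z]{{2}},{ZIP}$)", ",", line)
    return line


if __name__ == "__main__":
    print(add_commas("1500 E. Main Ave Springfield VA  22262-1010"))
    print(add_commas("125 N. Main Street City ST 11111"))
```

Each step relies on the comma the previous step inserted, so the order matters. A city name of more than one word would be split in the wrong place, which is the limitation already raised at the start of the thread.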





                                                  • Eb
                                                    Message 25 of 25 , Jul 7, 2011
                                                      I have no defense for not using all 9 digits, other than that the PO has never officially issued me a plus four number <g>.

                                                      Besides, the PO places bar codes on all letters, if the originator didn't. This bar code seems to take precedence over zip+four.


                                                      Cheers,


                                                      Eb

                                                      --- In ntb-clips@yahoogroups.com, "John Shotsky" <jshotsky@...> wrote:
                                                      >
                                                      > I, for one, always use all 9 digits. (And I'm from before there were even 5 digits!) The first five get you to an area,
                                                      > the last 4 get to your house, apt, etc. If the 9 digits are present, your mail can be delivered even without a full
                                                      > address present.
                                                      >
                                                      > ...