
Re: [NTB] PDF files

  • Jeff Scism
    Message 1 of 16, Dec 19, 2007
      Matt Clark wrote:
      >
      > Dave S wrote:
      >
      > But, you can export text from a PDF file (from inside the PDF reader) and
      > put it into a text file for "off-line" editing if it's the text you need.
      >
      > What I have is a list of names and addresses in PDF format. If I
      > wanted to add to or correct any of these, how would I export to, say,
      > NoteTab, make the change, and put it back in the original list? I
      > assume I would start by calling the list up in Acrobat Reader. (I
      > don't have the full Adobe Acrobat program.) Thanks for your help.
      >
      > Matt
      >

      In the early days of the Portable Document Format, one of its advertised
      "features" was "copy protection" that prevented people from copying from
      your PDF documents. Since then, others have written several PDF-capable
      programs that give access to the contents of a PDF document.

      In an age where data must not only be communicated, but manipulated,
      this design is a built-in flaw. That is why people have tried to work
      around it.

      In my view, PDFs are a hassle. If you have to exert yourself to be able
      to extract from them, it is more like TV than computing.

      I am copying this reply to the Off-topic list,
      ntb-OffTopic@yahoogroups.com, because it isn't on-topic for the basic
      NoteTab list. Please continue there.


      --


      Jeffery G. Scism, IBSSG
      ~~

      Blacksheep Ancestors in your Family?
      'Blacksheep Genealogy' is a registered California Sole Proprietorship.
      The International Black Sheep Society of Genealogists is a Social Organization Identified by its members using IBSSG after their signatures.

      Visit http://ibssg.org/

      For The Blacksheep website, Montgomery County, Putnam County, and Fountain County USGenWeb sites.
    • Axel Berger
      Message 2 of 16, Dec 19, 2007
        Jeff Scism wrote:
        > In an age where data must not only be communicated, but manipulated,
        > this design is a built-in flaw. That is why people have tried to
        > work around it.
        >
        > In my view, PDFs are a hassle. If you have to exert yourself to
        > be able to extract from them, it is more like TV than computing.

        I'd say you're wrong here. PDF is a format for distributing content. If
        my name is on something, I expect it to stay exactly the way I chose to
        write it, and no one should be able to alter it and redistribute a fake.
        It is possible to copy content from the document - copy protection is
        silly and can always be circumvented - but text is copied as text, i.e.
        ASCII, and graphics are copied as graphics. That is all right, but
        changing, adulterating and corrupting MY text is not.

        If I want to hand out work in progress for others to contribute to, we
        can either agree on one program to use or use any one of a number of
        open formats to do it in. This is not what PDF is for.

        Axel
      • Sheri
        Message 3 of 16, Dec 19, 2007
          --- In ntb-OffTopic@yahoogroups.com, Axel Berger <Axel-Berger@...> wrote:
          >
          > Jeff Scism wrote:
          > > In an age where data must not only be communicated, but manipulated,
          > > this design is a built-in flaw. That is why people have tried to
          > > work around it.
          > >
          > > In my view, PDFs are a hassle. If you have to exert yourself to
          > > be able to extract from them, it is more like TV than computing.
          >
          > I'd say you're wrong here. PDF is a format for distributing
          > content. If my name is on something, I expect it to stay
          > exactly the way I chose to write it, and no one should be able
          > to alter it and redistribute a fake. It is possible to copy
          > content from the document - copy protection is silly and can
          > always be circumvented - but text is copied as text, i.e. ASCII,
          > and graphics are copied as graphics. That is all right, but
          > changing, adulterating and corrupting MY text is not.
          >
          > If I want to hand out work in progress for others to contribute
          > to, we can either agree on one program to use or use any one of a
          > number of open formats to do it in. This is not what PDF is for.

          Actually, with the full Acrobat program you can make a PDF that others
          can annotate using Reader. But the annotations don't alter the
          original; it's more like a way of adding notes. Also, when creating a
          PDF you can protect it so the text can't be so easily copied to the
          clipboard; in effect, the text is treated more like graphics. An OCR
          program can still get the text, but that takes work and is not
          necessarily accurate.

          Regards,
          Sheri
        • Axel Berger
          Message 4 of 16, Dec 19, 2007
            Sheri wrote:
            > Also, when creating a PDF you can protect it so the text can't
            > be so easily copied to the clipboard;

            Not in Adobe's own Acrobat Reader, that's true, but I have done it often
            enough with third-party software capable of showing PDFs. What can be
            viewed can be copied, and any attempt to circumvent that basic fact is
            bound to fail - all you can do is make it more of a hassle for the less
            computer-literate users.

            Axel
          • Jeff@ibssg.org
            Message 5 of 16, Dec 19, 2007
              Axel Berger wrote:
              > Jeff Scism wrote:
              >
              >> In an age where data must not only be communicated, but manipulated,
              >> this design is a built-in flaw. That is why people have tried to
              >> work around it.
              >>
              >> In my view, PDFs are a hassle. If you have to exert yourself to
              >> be able to extract from them, it is more like TV than computing.
              >>
              >
              > I'd say you're wrong here. PDF is a format for distributing content. If
              > my name is on something, I expect it to stay exactly the way I chose to
              > write it, and no one should be able to alter it and redistribute a fake.
              > It is possible to copy content from the document - copy protection is
              > silly and can always be circumvented - but text is copied as text, i.e.
              > ASCII, and graphics are copied as graphics. That is all right, but
              > changing, adulterating and corrupting MY text is not.
              >
              > If I want to hand out work in progress for others to contribute to, we
              > can either agree on one program to use or use any one of a number of
              > open formats to do it in. This is not what PDF is for.
              >
              > Axel
              Obviously, Axel, it depends on the content and the reason for exchanging
              the document.

              I am a genealogy researcher, so I often move data from emailed files to
              other programs for working on the data or developing a display of the data.

              I need to be able to move the data, and sometimes images, into another
              program without manually transcribing. Manual transcription introduces
              an opportunity for errors; direct transfer usually keeps the data intact.

              Often I am sent photos in PDF documents, or photocopied book pages
              for transcribing to text, and OCR programs do not like PDF. That is why I
              dislike the format.

              The ONLY nice thing about PDF is that it allows printing without hassle,
              keeping format intact.

              I recently had to manually transcribe a 330-page book (out of
              copyright) for my website. It was sent to me in PDF, and it took weeks
              of searching to find a way to extract the images so I could OCR them. By
              the time I found a program that would do it, I had manually transcribed
              over 150 pages. Yes, it was a hassle, and a lot of time could have been
              saved if it had been possible to extract the images in JPG format. I
              ended up printing out a full run of the book, manually scanning and
              OCR'ing each page.

              Another researcher (who is NOW deceased) sent me his entire research
              library on PDF, 30-40 years of his work. Unfortunately he used password
              protection, and the docs will not open. He didn't include the password,
              and I didn't discover it until after he had passed away. Perhaps someday
              I will be able to crack those files.



              Jeff


            • Axel Berger
              Message 6 of 16, Dec 19, 2007
                "Jeff@..." wrote:
                > it depends on the content and the reason for exchanging
                > the document.

                Quite. Everything you describe here is a purpose for which PDF is
                particularly inappropriate.

                > The ONLY nice thing about PDF is that it allows printing
                > without hassle

                Not the only one. I often scan whole books, and a graphics program is not
                the best way to quickly skim the contents. For that I prefer PDF scaled
                (in the viewer) just big enough to be legible, and when I've found
                the part I want I go to the raw scan and extract exactly the bit I need.

                One additional hint though:
                For OCR, JPG definitely is NOT the format you want to use. For best
                results scan directly to black and white and save as a two-colour (i.e.
                black and white, one bit per pixel) PNG. Just open any JPG with
                scanned text and enlarge the area around a single letter and you'll see
                why OCRs struggle with it.
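
To make "one bit per pixel" concrete, here is a minimal Python sketch. It does not read real JPG or PNG files: the pixel values, the threshold of 128 and the helper names `to_bilevel` and `pack_bits` are all made up for illustration. It only shows that a bilevel image stores a single ink/paper decision per pixel, whereas a lossy greyscale scan leaves in-between greys around each letter edge that something has to decide about later.

```python
def to_bilevel(gray_rows, threshold=128):
    """Map 8-bit greyscale pixels to 1 bit each: 1 = ink (dark), 0 = paper."""
    return [[1 if px < threshold else 0 for px in row] for row in gray_rows]


def pack_bits(bits):
    """Pack a row of 0/1 values into bytes, most significant bit first."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chunk = chunk + [0] * (8 - len(chunk))  # pad the last byte with paper
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


# Hypothetical 8-pixel strip across the edge of a letter. A clean bilevel
# scan would contain only 0 (black) and 255 (white); the greys in between
# stand in for the fringe that lossy compression leaves behind.
jpeg_like_row = [0, 12, 70, 140, 200, 243, 255, 255]
bilevel = to_bilevel([jpeg_like_row])
packed = pack_bits(bilevel[0])  # eight pixels fit in a single byte
```

Where the threshold falls decides which fringe pixels count as ink, which is one reason scanning straight to black and white, as suggested above, is preferable to cleaning up a JPG afterwards.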

                > unfortunately he used password
                > protection, and the docs will not open.

                If you want, you may send me one of those files and I'll try to break it.
                Don't expect too much though.

                Axel
              • Jeff Scism
                Message 7 of 16, Dec 19, 2007
                  Axel Berger wrote:
                  >
                  > One additional hint though:
                  > For OCR, JPG definitely is NOT the format you want to use. For best
                  > results scan directly to black and white and save as a two-colour (i.e.
                  > black and white, one bit per pixel) PNG. Just open any JPG with
                  > scanned text and enlarge the area around a single letter and you'll see
                  > why OCRs struggle with it.
                  >

                  I agree completely, but the average submitter saves everything in JPG,
                  image-wise. Sometimes you have to work with what you get. JPEGs are
                  crappy when it comes to pixelation.

                  Jeff
                • Jeff@ibssg.org
                  Message 8 of 16, Dec 19, 2007
                    Axel Berger wrote:
                    >
                    > If you want you may send me one of those files and I'll try to break it.
                    > Don't expect too much though.
                    >
                    > Axel
                    >
                    I haven't tried yet, but someday I will run through some of the
                    'seemingly related' keywords and see if any work.

                    His research was quite controversial within the family, and he didn't
                    want the parts that were speculation released, so I have back-burnered
                    the project.

                  • Brian Binder
                    Message 9 of 16, Dec 21, 2007
                      Yep, PDF is quite handy for handing off information where others don't
                      have the appropriate program either. This is true for Microsoft
                      Project, as an example. I use that for making projects all the time,
                      yet 95% of my clients don't own it. With a PDF, they can view it all
                      and stay up-to-date on statuses, as necessary.

                      > Another researcher (who is NOW deceased) sent me his entire research
                      > library on PDF, 30-40 years of his work, unfortunately he used password
                      > protection, and the docs will not open. He didn't include the password,
                      > and I didn't discover it until after he had passed away. Perhaps someday
                      > I will be able to crack those files.

                      Not that I'm looking for this to be some circumvention post, but if
                      you'd like, I can likely get the files unprotected for you. I say that
                      with a good deal of confidence: I've never seen a PDF whose protection
                      I couldn't break safely. It happens to plenty of my clients after they
                      put a password on and somehow forget it.
                    • Pat Drummond
                      Message 10 of 16, Dec 23, 2007
                        --- In ntb-OffTopic@yahoogroups.com, Jeff Scism <jeff@...> wrote:
                        > One of the "features" of the portable Document format in the early
                        > days of the format was advertised as "copy protection" preventing
                        > people from copying from your PDF documents. Since that time there
                        > have been several PDF capable programs designed by others which
                        > enable access to the contents of a PDF document.
                        >
                        > In an age where data must not only be communicated, but manipulated,
                        > this design is a built in flaw. That is why people have tried to work
                        > around it.
                        >
                        > In my view, PDFs are a hassle. if you have to exert yourself to be able
                        > to extract from them, it is more like TV than computing.

                        I agree. Creating PDFs for organizations or clubs should be banned!
                        What software can pull text out of 'protected' PDFs? I'd love to be
                        able to do that!
                      • David Smart
                        Message 11 of 16, Dec 23, 2007
                          > I agree. Creating PDFs for organizations or clubs should be banned!

                          I'm probably missing your point here, but ...

                          PDFs are great for organisations with wide and varied membership - e.g. for
                          clubs. It means that a formatted document can be distributed with the
                          knowledge that everyone can see it or print it without needing to have your
                          particular brand or level of office software, or needing to buy some special
                          display software. This benefit increases when the people producing separate
                          documents for distribution are, themselves, possibly using different
                          software.

                          PDF should not be used for data storage (and I don't think it was ever
                          intended by its designers that it should). If this is your complaint,
                          then - yes I certainly agree. But not just for clubs, etc. It should never
                          be used to hold data anywhere.

                          Regards, Dave S

                          ----- Original Message -----
                          From: "Pat Drummond" <pat@...>
                          To: <ntb-OffTopic@yahoogroups.com>
                          Sent: Monday, December 24, 2007 6:49 AM
                          Subject: [NTO] Re: [NTB] PDF files


                          > --- In ntb-OffTopic@yahoogroups.com, Jeff Scism <jeff@...> wrote:
                          >> One of the "features" of the portable Document format in the early
                          >> days of the format was advertised as "copy protection" preventing
                          >> people from copying from your PDF documents. Since that time there
                          >> have been several PDF capable programs designed by others which
                          >> enable access to the contents of a PDF document.
                          >>
                          >> In an age where data must not only be communicated, but manipulated,
                          >> this design is a built in flaw. That is why people have tried to work
                          >> around it.
                          >>
                          >> In my view, PDFs are a hassle. if you have to exert yourself to be able
                          >> to extract from them, it is more like TV than computing.
                          >
                          > I agree. Creating PDFs for organizations or clubs should be banned!
                          > What software can pull text out of 'protected' PDFs? I'd love to be
                          > able to do that!
                        • Jeff Scism
                          Message 12 of 16, Dec 23, 2007
                            Basically, PDFs are read-only, and therefore Acrobat (or the Adobe
                            programs that create PDF docs) should not be used as a word processor,
                            simply because the ability to edit the documents is limited.

                            This may be great for creating a printable format, or just doing
                            announcements, newsletters, etc., but for people who need to work with
                            the text, a word processing format is much better.

                            I prefer NoteTab. It does most of what I want to do. If I have to, I use
                            OpenOffice.



                            David Smart wrote:
                            >
                            > > I agree. Creating PDFs for organizations or clubs should be banned!
                            >
                            > I'm probably missing your point here, but ...
                            >
                            > PDFs are great for organisations with wide and varied membership -
                            > e.g. for clubs. It means that a formatted document can be distributed
                            > with the knowledge that everyone can see it or print it without
                            > needing to have your particular brand or level of office software, or
                            > needing to buy some special display software. This benefit increases
                            > when the people producing separate documents for distribution are,
                            > themselves, possibly using different software.
                            >
                            > PDF should not be used for data storage (and I don't think it was ever
                            > intended by its designers that it should). If this is your complaint,
                            > then - yes I certainly agree. But not just for clubs, etc. It should
                            > never be used to hold data anywhere.
                            >
                            > Regards, Dave S
                            >

                          • Axel Berger
                            Message 13 of 16, Dec 23, 2007
                              Jeff Scism wrote:
                              > Basically, PDFs are read-only, and therefore Acrobat (or the Adobe
                              > programs that create PDF docs) should not be used as a word processor.

                              Exactly, and read-only is what the final distributed version of anything
                              should be. If you don't set silly restrictions, which can be circumvented
                              in any case, limited extraction of content is possible as graphics or
                              ASCII, but no real editing.

                              It is totally the wrong format for work in progress. That said, I have
                              been asked to provide scans bound together as PDF rather than the raw
                              images and, having done it, I find myself much preferring that for
                              reading or skimming (or printing) pages of stuff, but I keep the raw
                              images for everything else.

                              Axel
                            • David Smart
                              Message 14 of 16, Dec 23, 2007
                                Yes, I think you're saying what I'm saying.

                                It's funny, I really like NoteTab, but I don't seem to use it a fraction of
                                what others do. The only time I ever seem to have text files is when a
                                program requires them (e.g. batch file, configuration files). For almost
                                everything else I do, I seem to need formatting (e.g. Word) or data
                                manipulation (e.g. Access or Excel). Very rarely simple text files.
                                (Actually, I would use Excel at least 5:1 over Word or Access or NoteTab or
                                anything else.)

                                Regards, Dave S

                                ----- Original Message -----
                                From: "Jeff Scism" <jeff@...>
                                To: <ntb-OffTopic@yahoogroups.com>
                                Sent: Monday, December 24, 2007 9:10 AM
                                Subject: Re: [NTO] Re: [NTB] PDF files


                                > Basically, PDFs are read-only, and therefore Acrobat (or the Adobe
                                > programs that create PDF docs) should not be used as a word processor,
                                > simply because the ability to edit the documents is limited.
                                >
                                > This may be great for creating a printable format, or just doing
                                > announcements, newsletters, etc., but for people who need to work with
                                > the text, a word processing format is much better.
                                >
                                > I prefer NoteTab. It does most of what I want to do. If I have to, I use
                                > OpenOffice.
                                >
                                >
                                >
                                > David Smart wrote:
                                >>
                                >> > I agree. Creating PDFs for organizations or clubs should be banned!
                                >>
                                >> I'm probably missing your point here, but ...
                                >>
                                >> PDFs are great for organisations with wide and varied membership -
                                >> e.g. for clubs. It means that a formatted document can be distributed
                                >> with the knowledge that everyone can see it or print it without
                                >> needing to have your particular brand or level of office software, or
                                >> needing to buy some special display software. This benefit increases
                                >> when the people producing separate documents for distribution are,
                                >> themselves, possibly using different software.
                                >>
                                >> PDF should not be used for data storage (and I don't think it was ever
                                >> intended by its designers that it should). If this is your complaint,
                                >> then - yes I certainly agree. But not just for clubs, etc. It should
                                >> never be used to hold data anywhere.
                                >>
                                >> Regards, Dave S
                                >>
                              • Don - HtmlFixIt.com
                                Message 15 of 16, Dec 23, 2007
                                  David Smart wrote:
                                  > Yes, I think you're saying what I'm saying.
                                  >
                                  > It's funny, I really like NoteTab, but I don't seem to use it a fraction of
                                  > what others do. The only time I ever seem to have text files is when a
                                  > program requires them (e.g. batch file, configuration files). For almost
                                  > everything else I do, I seem to need formatting (e.g. Word) or data
                                  > manipulation (e.g. Access or Excel). Very rarely simple text files.
                                  > (Actually, I would use Excel at least 5:1 over Word or Access or NoteTab or
                                  > anything else.)
                                  >
                                  > Regards, Dave S
                                  I use NoteTab over Excel all day, every day, for data manipulation. It
                                  sorts, combines, strips, etc. etc.
                                • buralex@gmail.com
                                  Message 16 of 16, Dec 27, 2007
                                    "Don - HtmlFixIt.com" <don@...> said on Dec 23, 2007 22:07
                                    -0500 (in part):
                                    > > (Actually, I would use Excel at least 5:1 over Word or Access or
                                    > > NoteTab or anything else.)
                                    > >
                                    > > Regards, Dave S
                                    > I use NoteTab over Excel all day, every day, for data manipulation. It
                                    > sorts, combines, strips, etc. etc.
                                    Dave, Don (just got back from visiting relatives over Christmas so I'm a
                                    couple of days behind)

                                    NoteTab vs. Excel - my most frequent usage pattern is actually
                                    back and forth between Excel and NoteTab:
                                    copy a selection grid from Excel, paste into NoteTab, manipulate with
                                    regexps and/or clips, copy all and paste back to Excel (sometimes several
                                    round trips to accomplish various sub-tasks).
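
The same round trip can be sketched outside NoteTab. The Python below is only an illustration: it assumes the grid copied from Excel arrives as plain text with tabs between cells and newlines between rows, and the function name `transform_grid`, the sample names and the regular expression are all hypothetical.

```python
import re


def transform_grid(clipboard_text, column, pattern, replacement):
    """Apply a regex substitution to one column of tab-separated text."""
    out_rows = []
    for line in clipboard_text.splitlines():
        cells = line.split("\t")
        if column < len(cells):  # leave short rows untouched
            cells[column] = re.sub(pattern, replacement, cells[column])
        out_rows.append("\t".join(cells))
    return "\n".join(out_rows)


# Made-up two-row grid as it might look after copying from a spreadsheet.
grid = "Smith, John\t12 Elm St\nDoe, Jane\t4 Oak Ave"

# Turn "Last, First" into "First Last" in the first column only.
swapped = transform_grid(grid, 0, r"^(\w+), (\w+)$", r"\2 \1")
```

The result is still tab-separated text, so it could be pasted back into the original cell range, and each "round trip" would be one such call with a different pattern.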

                                    Very infrequently I dip into a little Basic from within Excel.

                                    The two programs are (IMO) a perfect match!

                                    Regards ... Alec -- buralex-gmail