
FineReader 11, etc

  • Nick Hodson
    Message 1 of 7 , Mar 16 1:21 PM
      Hi!
      I suppose this group still exists, though I haven't heard anything from anyone in it for quite some time.

      Just to say that I have been using the new FineReader 11 for some months now, and find it easy to use, very fast, and very accurate. Obey their instructions for the best results.

      A year ago a member of this group, Terry Smythe, alerted us to the fact that Ion was developing a gadget that would help you scan books. It had been displayed as a working model at the Las Vegas show in January 2011. It was supposed to be available on the market in June that year. But the production date seemed to slip, and now I can't see anything on Ion about it, though I have seen a copy of an email they sent to someone who wrote in to them, saying that they were not able to continue with development of this product.

      I have now scanned nearly 700 nineteenth-century or early twentieth-century books and had them posted on Project Gutenberg, where they are freely available to anyone in the world. Most of these were scanned using a Plustek OpticBook 3600 scanner, and the majority were processed with the aid of ABBYY FineReader, though not, alas, version 11. I also use Scan Tailor to help me with this work.

      Regards to all, Nick Hodson


    • John Laurie
      Message 2 of 7 , Mar 18 4:31 PM
        Hi Nick

        I've started using FR 11 too and it's very accurate. It does pose some extra problems for TEI conversion, like the way it handles footnotes. Also I find I have to stop it from leaving out running titles and page numbers if I allow automatic page analysis.

        If you think the Plustek OpticBook 3600 is good you should get the Plustek A300 OpticBook which is about 4 times as fast as well as being A3.

        John

        ********************************************
        John Laurie
        Digital Initiatives Librarian
        Digital Services
        Level 3, General Library
        University of Auckland
        Phone (09)3737599 x 85773
        Email j.laurie@...
        *************************************************


      • Nick Hodson
        Message 3 of 7 , Mar 18 11:50 PM
          Hi, John
          You are exactly the person I was wondering about, because when FR 10 came out you were rather disparaging about it. I am glad you are finding FR 11 to be OK. I think you can permanently set what you want to do with headers and footers by using Tools/Options and then the Save tab.

          I know the larger OpticBook is very good, but I have no need for such a large one. Its higher speed would be useful, but in any case most of the time in scanning a book is taken up by turning the page and positioning the book. I used to scan at 600 dpi, but then found that FR 10 and 11 are better at 300 dpi, so that's where I have now got the speed from.

          I use Scan Tailor to tidy up the pages before doing OCR on them. When studying a book I make a DjVu of it from the scans, before I do the OCR. I have developed a long series of processes that runs automatically after the OCR, and that ends up with an almost correct version of the book. I produce four versions of the book for my own use, as well as the two versions required by Project Gutenberg. My four versions are (one) the DjVu; (two) the validated FB2; (three) the validated epub; (four) the M4B, as played on an iPod or in Apple QuickTime. The two PG versions are "Plain Vanilla ASCII" and a validated XHTML with images.

          Good to hear from you.
          Kind regards, Nick Hodson, London, England, UK.



        • John Laurie
          Message 4 of 7 , Mar 19 1:27 PM
            Hi Nick

            The accuracy is good with FR11 but the problems with HTML output for TEI are actually worse than with FR10 and I have to use various workarounds with regular expressions to save italic fonts, as well as manually create an area template to keep all text which FR has decided is headers and footers. I switched from 300dpi to 400dpi some years ago because it gave better results with very small fonts.
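
            A minimal sketch of the kind of regular-expression workaround mentioned above for carrying italics over into TEI. The exact markup FineReader writes is not given in this thread, so the two input patterns below (`<i>`/`<em>` tags and an inline `font-style: italic` span) are assumptions, as is the function name `italics_to_tei`. The target, `<hi rend="italic">`, is ordinary TEI usage.

```python
import re

# Assumed input patterns: the thread does not show FineReader's actual HTML.
ITALIC_TAG = re.compile(r"<(i|em)>(.*?)</\1>", re.IGNORECASE | re.DOTALL)
ITALIC_SPAN = re.compile(
    r'<span\s+style="[^"]*font-style:\s*italic[^"]*">(.*?)</span>',
    re.IGNORECASE | re.DOTALL,
)


def italics_to_tei(html: str) -> str:
    """Rewrite italic HTML markup as TEI <hi rend="italic"> elements.

    Hypothetical helper: handles only flat (non-nested) italic runs.
    """
    html = ITALIC_TAG.sub(r'<hi rend="italic">\2</hi>', html)
    html = ITALIC_SPAN.sub(r'<hi rend="italic">\1</hi>', html)
    return html


sample = '<p>See <i>Punch</i> and <span style="font-style:italic;">Fun</span>.</p>'
print(italics_to_tei(sample))
```

            Regular expressions cope with flat runs like these but not with nested spans; for those, an HTML parser would be the safer tool.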

            Are you getting all the old page numbers and running headings on your DjVu, FB2, M4B and epub outputs? I think it's essential for scholars to be able to cite page numbers.

            I can output 500+ pages an hour with Plustek A300. I think the actual scan takes longer than turning the book around.

            I'm keeping TIFFs of the original page scans and I preferred the way FR9 and earlier saved them for me in one folder. I also create PDF derivatives. We are waiting for our developer to set up automatic ePub output from our TEIs.


            Will be looking at your Scan Tailor and am interested in your other automatic processes.

            John Laurie

          • Lars Aronsson
            Message 5 of 7 , Mar 19 3:57 PM
              On 03/16/2012 09:21 PM, Nick Hodson wrote:
              > A year ago a member of this group, Terry Smythe, alerted us to the fact that Ion was developing a gadget that would help you scan books. It had been displayed as a working model at the Las Vegas show in January 2011. It was supposed to be available on the market in June that year. But the production date seemed to slip, and now I can't see anything on Ion about it,

              Members of the DIY Bookscanner group have noted that
              Ion closed down this product. Their forums are quite good,
              http://www.diybookscanner.org/forum/index.php

              Here's the thread on the Ion Audio Booksaver,
              http://www.diybookscanner.org/forum/viewtopic.php?f=16&t=826

              The last message seems to be a European announcement that
              the Booksaver will be made available after all, in June 2012.


              --
              Lars Aronsson (lars@...)
              Project Runeberg - free Nordic literature - http://runeberg.org/
            • Nick Hodson
              Message 6 of 7 , Mar 20 2:18 AM
                Thank you, Lars, that is very helpful. The last message is in Dutch, and it is the only one on page 7. The last message on page 6 appears to be a translation of the Dutch one into English. I get the impression that the delay has been due to the software, rather than to the design of the device and its cameras. But really, most of us would be interested in getting good images of the pages of a book, and then doing the processing with the software which we already have.

                So we will have to wait till June 2012, which isn't so far away now.


                Recently I bought a wand scanner, which is quite fast in use. It takes a bit of getting used to, but I have done one complete long book with it. A few pages had to be rescanned, far fewer than I had at first expected. Mine isn't from Ion, but they do one, if anyone is interested.

                Best regards, Nick Hodson, London, England, UK.



              • Nick Hodson
                Message 7 of 7 , Jun 1, 2012
                  John
                  I have the page numbers of each page in the version I use for editing. There are over 80 tests on the whole book, as well as a process that produces well-formed chapters. Any possible error that is reported is linked to chapter number, page number and paragraph number. This makes it easy to find the original in the DjVu that I make of the book before doing the OCR. I used to make a PDF of the book and work from that, but a DjVu is about seven times smaller.

                  I publish through Project Gutenberg. This means that my work is done for the benefit of children all over the world, and not for scholars. If any of these scholars is interested, and I can't imagine why they should be, I can send them a version with page numbers very visible. Obviously the DjVu has page numbers and headers visible, or at least as visible as they ever were in the original book. Obviously, also, M4B files do not have page numbers or page headers, because they are the book read out in English, in the format that is used by Apple. I read yesterday that there is a program that tweaks M4B files so that Kindle thinks they are Audible ones. In any case they are both varieties of AAC files.
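
                  The error reports described above, each tied to a chapter, page and paragraph number, might look something like the sketch below. It is not the actual test suite, which is not shown in this thread. The `[page N]` marker, the "Chapter" heading rule, and the two sample checks are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    chapter: int
    page: int
    paragraph: int  # counted from 1 within the current chapter
    message: str


# Hypothetical conventions for the editing copy of the book.
PAGE_MARK = re.compile(r"^\[page (\d+)\]$")
CHAPTER_MARK = re.compile(r"^Chapter\b", re.IGNORECASE)


def check_book(lines):
    """Run two sample checks, treating each non-blank line as one paragraph."""
    chapter = page = paragraph = 0
    findings = []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        page_match = PAGE_MARK.match(text)
        if page_match:
            page = int(page_match.group(1))
            continue
        if CHAPTER_MARK.match(text):
            chapter += 1
            paragraph = 0
            continue
        paragraph += 1
        if "  " in text:
            findings.append(Finding(chapter, page, paragraph, "double space"))
        if text.count('"') % 2:
            findings.append(
                Finding(chapter, page, paragraph, "unbalanced double quotes")
            )
    return findings


book = ["[page 1]", "Chapter I", "It was a dark night.",
        "[page 2]", '"Hello, said he.', "Chapter II", "All  well."]
for finding in check_book(book):
    print(finding)
```

                  Keeping the three counters in one pass means every finding can be looked up directly in the page images.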


                  Making epub files is easy enough. You first write the software that creates an HTML file of the book. You would use H2 for book and author titles, H3 for chapters (as in Chapter XVIII), and H4 for chapter sub-headings. All the necessary metadata also appears in the head of that HTML file, as does the CSS file that you are using. You get Sigil to read this, and just check that the Table of Contents it generates is OK, and also its metadata (hit F8). There is also a stage available in Sigil in which the file is tested thoroughly. If all that is clean, which it should be, you can save the epub file under whatever you consider to be a suitable name.
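
                  As a rough sketch of the first step, a generator for such an HTML file could be as small as the following. The heading levels follow the description above. The function name, the `author` meta tag, the `book.css` filename and the sample book are assumptions, not the software actually used.

```python
from html import escape


def build_book_html(title, author, chapters, css_href="book.css"):
    """Build a single HTML page for a book.

    chapters is a list of (heading, subheading_or_None, [paragraphs]).
    All names and the metadata layout here are illustrative assumptions.
    """
    head = [
        "<head>",
        f"<title>{escape(title)}</title>",
        f'<meta name="author" content="{escape(author)}" />',
        f'<link rel="stylesheet" type="text/css" href="{escape(css_href)}" />',
        "</head>",
    ]
    # H2 for the book title and the author, as described above.
    body = ["<body>", f"<h2>{escape(title)}</h2>", f"<h2>{escape(author)}</h2>"]
    for heading, subheading, paragraphs in chapters:
        body.append(f"<h3>{escape(heading)}</h3>")  # chapter title
        if subheading:
            body.append(f"<h4>{escape(subheading)}</h4>")  # chapter sub-heading
        body.extend(f"<p>{escape(p)}</p>" for p in paragraphs)
    body.append("</body>")
    return "\n".join(
        ['<html xmlns="http://www.w3.org/1999/xhtml">', *head, *body, "</html>"]
    )


page = build_book_html(
    "An Example Book",
    "A. N. Author",
    [("Chapter XVIII", "The Storm", ["It blew hard all night."])],
)
print(page)
```

                  Consistent heading levels are what let a tool such as Sigil build its Table of Contents from the file.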

                  I always like to make fb2 files first, because the Haali Reader screams if there is anything wrong. Some of the other fb2 readers, such as CoolReader 3, just try to bash on, correcting the error if they can. It is better to have no error. Haali Reader tells you exactly where the error is, and once you know that, you can go exactly to that line and column, and the error will be obvious.
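
                  The same strict "stop at the first error and say where it is" behaviour can be approximated outside a reader. An fb2 file is XML, so a minimal sketch using Python's standard `xml.etree.ElementTree` parser will report the line and column of the first well-formedness error. This checks well-formedness only, not validity against the FictionBook schema. The function name and the sample fragments are assumptions.

```python
import xml.etree.ElementTree as ET


def first_xml_error(xml_text):
    """Return (line, column, message) for the first well-formedness error.

    Returns None if the text parses cleanly. The position comes straight
    from the parser via ParseError.position.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position
        return line, column, str(err)
    return None


good = "<FictionBook><body><p>ok</p></body></FictionBook>"
bad = "<FictionBook>\n<body><p>oops</body>\n</FictionBook>"  # <p> never closed

print(first_xml_error(good))
print(first_xml_error(bad))
```

                  Parsing stops at the first problem, so the file has to be fixed and re-checked until nothing is reported.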

                  Must go.

                  Kind regards, Nick


