Proofing OCR text => What is an efficient/tolerable method?

  • Mike Breiding
    Message 1 of 3 , Apr 23, 2008
      Greetings,
      I have thousands of pages of OCR text to proof read.

      With just a few pages I would normally do this by checking the original
      against the OCR version.
      But, it is going to cause me to go buggy doing this with all of the docs
      I have scanned.

      I was wondering about having someone read the original while I followed
      with the scanned version.
      This seems like it might be less strain, but it will take two people.

      Any ideas or thoughts on this?

      Thanks,
      -Mike
    • Jeff Scism
      Message 2 of 3 , Apr 23, 2008
        OCR programs differ in their accuracy.

        You want to start with the most accurate OCR engine you can get; that
        means less correcting of the results.

        There are several sites which compare performance on OCR engines.

        Try to find one that "learns" as you correct.

        I know of no way to speed it up, but Pagis has a good program that will
        walk you through the scan and the text at the same time.

        I have had trouble using Pagis on Vista, so I also have ABBYY
        FineReader and SimpleOCR installed. SimpleOCR is perhaps the least
        accurate I have seen yet.

        There are some really accurate OCR engines, but they have high prices
        attached.
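
With more than one OCR engine installed, one possible shortcut (a sketch, not something tested on real scans) is to run the same page through two engines and only eyeball the places where they disagree. The function name `ocr_disagreements` and the sample strings below are made up for illustration; the comparison uses Python's standard `difflib` module. Words that both engines get wrong in the same way will not be flagged, so this narrows the proofreading rather than replacing it.

```python
import difflib


def ocr_disagreements(text_a, text_b):
    """Return (word_index_in_a, words_from_a, words_from_b) for each
    stretch where the two OCR outputs differ."""
    words_a = text_a.split()
    words_b = text_b.split()
    # autojunk=False stops difflib from ignoring very common words on long pages.
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    flagged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            flagged.append(
                (i1, " ".join(words_a[i1:i2]), " ".join(words_b[j1:j2]))
            )
    return flagged


if __name__ == "__main__":
    # Hypothetical output from two different engines for the same line.
    engine_one = "I have thousands of pages of OCR text to proof read."
    engine_two = "I have th0usands of pages of OCR text to proof rcad."
    for position, from_a, from_b in ocr_disagreements(engine_one, engine_two):
        print(f"word {position}: {from_a!r} vs {from_b!r}")
```

Each flagged position is a word offset into the first engine's text, so it can be used to jump to the right spot in the scan.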

        Jeff

        --


        Jeffery G. Scism, IBSSG
        ~~

        "Proponents of each side are vying with determination to prove their ignorance is greater than the other."

        President Andrew Jackson, discussing a bill going through the US Congress.



        Visit http://ibssg.org/
        For The Blacksheep website, MORE...

        Putnam County Indiana Biographies and Obituaries
        http://ingenweb.org/inputnam/bios/

        Montgomery County Indiana Biographies and Obituaries
        http://ingenweb.org/inmontgomery/bios/

        Fountain County Indiana Biographies and Obituaries
        http://ingenweb.org/infountain/vitals/bios/
      • fw7oaks
        Message 3 of 3 , Apr 23, 2008
          How about stuffing the OCR'ed text into a text-to-speech program
          and following along in the original while it reads?

          There's a freeware program BlablaMaker here

          http://www.neuesvon.de/blabla-maker/index.html

          Despite the German interface it does English OK.

          fw
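
If the text is going to be read aloud by a program, it may help to break it into numbered sentences first, so that an error heard at "sentence 41" can be found again in the file. The sketch below is only an assumption about how that could be done: the function name `sentences_for_listening` is invented, the sentence splitting is crude (it will break on abbreviations such as "Mr."), and handing the sentences to a speech program is left out because that depends on the program used.

```python
import re


def sentences_for_listening(ocr_text):
    """Split OCR text into (number, sentence) pairs for read-aloud proofing."""
    # Collapse the hard line breaks left by OCR so sentences are not cut mid-line.
    flat = " ".join(ocr_text.split())
    # Split after '.', '!' or '?' when followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", flat)
    return [(number, sentence)
            for number, sentence in enumerate(parts, start=1)
            if sentence]


if __name__ == "__main__":
    sample = "I have thousands of pages\nof OCR text. It is slow!  Any ideas?"
    for number, sentence in sentences_for_listening(sample):
        print(f"{number}: {sentence}")
```

The numbered lines can be saved to a text file and pasted into whichever text-to-speech program is at hand.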
