Re: [PBML] OCR scanned documents

  • Tara Dirst
    Message 1 of 1, Jan 28, 2002
      I am the technology coordinator for a large digitization project. Our
      most common errors have been:

      a) numbers showing up as letters (1 as l; 0 as O) and vice versa
      b) the letter h showing up as li (so "the" becomes "tlie")
      c) stray marks on the pages being digitized often come through OCR as
      extra punctuation (periods or commas where there are none in the
      original text)
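
As a rough illustration of how the three error types above might be handled in post-processing, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than something from our project: the tiny `KNOWN_WORDS` set stands in for a real dictionary, and the function names are hypothetical. It only accepts a correction when the result is a plausible number or a known word, and it drops only punctuation marks that stand completely alone, which can also remove a legitimate mark, so the output still needs a human check.

```python
import re

# Hypothetical stand-in for a real dictionary file.
KNOWN_WORDS = {"the", "other", "also", "then", "there"}

# Error (a): the letter/digit pairs named above (1 as l, 0 as O) and the reverse.
LETTERS_TO_DIGITS = str.maketrans({"l": "1", "O": "0"})
DIGITS_TO_LETTERS = str.maketrans({"1": "l", "0": "O"})


def fix_digit_confusion(word):
    """Repair l/O inside numbers and 1/0 inside known words."""
    as_number = word.translate(LETTERS_TO_DIGITS)
    # Require at least one real digit so ordinary words are left alone.
    if as_number.isdigit() and any(c.isdigit() for c in word):
        return as_number
    as_word = word.translate(DIGITS_TO_LETTERS)
    if as_word.isalpha() and as_word.lower() in KNOWN_WORDS:
        return as_word
    return word


def fix_h_split(word):
    """Error (b): turn 'li' back into 'h' only if that yields a known word."""
    if word.lower() in KNOWN_WORDS or "li" not in word:
        return word
    candidate = word.replace("li", "h")
    return candidate if candidate.lower() in KNOWN_WORDS else word


def clean_line(text):
    """Apply the fixes token by token and drop isolated periods and commas."""
    cleaned = []
    for token in text.split():
        # Error (c): a period or comma standing entirely on its own.
        if token in {".", ","}:
            continue
        match = re.fullmatch(r"(\w+)([.,;:!?]*)", token)
        if match:
            core, trailing = match.groups()
            token = fix_h_split(fix_digit_confusion(core)) + trailing
        cleaned.append(token)
    return " ".join(cleaned)


if __name__ == "__main__":
    print(clean_line("In l998 , tlie otlier book a1so sold 2O02 copies ."))
```

Checking candidates against a word list is what keeps a real word such as "like" from being rewritten to "hke"; with a larger dictionary the same idea extends to other confusions.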

      Tara

      >>> sumit_827@... 01/28/02 02:33PM >>>
      Hi,

      Has anyone worked with OCR (optical character recognition) scanned
      documents? I have a project in which I have to deal
      with them, and I am looking for a list of the most common OCR mistakes
      or something like that.

      Thanks.
      Sumit.

