Re: [NTO] scanning format
- Adrien Verlee wrote:
> The best scanning format of the future (for OCR processing)?
PNG is the best supported lossless(!) graphics format and the one I
always scan to and sometimes convert later. For pure black and white and
if reduction to 256 colours is possible the files are also smaller than
high quality Jpeg. If you should need full colour depth and file size is
an issue (isn't it always?), then Jpeg at a quality of 95 to 100 % (as
used in Irfanview, these values are not standardized across programs)
yields negligible loss at significantly smaller files.
- john041650 wrote:
> TIFF, it's a well established lossless standard
Agreed, but as usually written it is totally uncompressed, which means huge. You can ZIP
Tiffs to about 2 %, i.e. fifty times smaller, but I prefer a lossless
and compressed graphics format like PNG.
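To get a feel for the ZIP figure above, here is a minimal Python sketch. It assumes a synthetic 8-bit grayscale "page" (not a real scan) and uses the standard zlib module, whose Deflate algorithm is the one ZIP normally uses and PNG builds on. The function names are made up for illustration, and a real scan with noise and paper texture will compress less well than this clean pattern.

```python
import zlib


def synthetic_page(width=800, height=1000):
    """Build an uncompressed 8-bit grayscale raster: white page, dark 'text' bars."""
    rows = []
    for y in range(height):
        row = bytearray([255]) * width          # white background
        if y % 20 < 8:                          # 8 rows of "text" per 20-row line
            band = y // 20
            for x in range(50, width - 50):     # leave white margins
                if (x * 7 + band * 13) % 11 < 6:
                    row[x] = 0                  # black pixel
        rows.append(bytes(row))
    return b"".join(rows)


def deflate_ratio(raw, level=9):
    """Return compressed size / raw size, checking the round trip is lossless."""
    packed = zlib.compress(raw, level)
    assert zlib.decompress(packed) == raw       # lossless: bytes come back identical
    return len(packed) / len(raw)


if __name__ == "__main__":
    page = synthetic_page()
    print(f"raw: {len(page)} bytes, deflated to {deflate_ratio(page):.1%} of that")
```

The round-trip check is the point of "lossless": whatever the ratio turns out to be, decompression returns exactly the bytes that went in.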
N.B. Adrien: PDF is not a raster graphics format but only a wrapper
around it. Most PDF generators will internally use bad and low quality
Jpeg, so that one is decidedly off.
- On 15/03/2011 3:39, Axel Berger wrote:
> PNG is the best supported lossless(!) graphics format and the one I
> always scan to and sometimes convert later. For pure black and white and
In fact, the problem is which format the OCR software will still be able to import
within a decade. Crystal ball!
If you check most documents you have OCR'ed, they will be TIF; the images are
usually saved the way the owner wants them, in B/W or COLOUR.
----- Original Message -----
From: "Adrien Verlee" <adrien.verlee@...>
Sent: Thursday, March 17, 2011 9:16 PM
Subject: Re: [NTO] scanning format
> On 15/03/2011 3:39, Axel Berger wrote:
>> PNG is the best supported lossless(!) graphics format and the one I
>> always scan to and sometimes convert later. For pure black and white and
> In fact, the problem as to which format the OCR software can import
> within a decade. Crystal ball!
- Adrien Verlee wrote:
> Crystal ball!
Anything that is massive now won't go away too soon, as too many
customers would be miffed. Look at GIF or IE6. I'd agree that Jpeg will
possibly always be better supported than PNG but the difference is not
enough to forego the advantages of the latter.
- Dave wrote:
> If you check most documents you have OCR'ed they will be TIF,
> the images are usually the way the owner want them saved in
> B/W or COLOUR.
Dave, I'm not sure what you are saying here, but I guess you're talking
about graphics extracted from PDFs, where the OCRed text is behind the
page image. If so, I disagree. It is true that the extraction (I use
pdfimages.exe from the XPDF package) yields uncompressed Tiffs. The
reason seems to be simply that the act of reading (and displaying)
includes the decompression already and the extractor makes do without
the extra complication of a compression step (rightly so: one task, one
tool).
But in all the PDFs I generate myself I notice that there is no recoding:
all my graphic material is included as is. Whatever I include, Jpeg
in all kinds of quality, PNG in B/W, gray, full colour, 256 colours, 16
colours (great for diagrams and extremely small, but makes PDF
exceedingly slow, so I don't use it), the size of the PDF follows
exactly the sum of the sizes of embedded graphics.
So my guess is you are a victim of a misconception here, but I may have
misunderstood or be wrong.
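One way to check how a given PDF actually stores its page images is to look at the /Filter entries of its image objects: DCTDecode means Jpeg, FlateDecode means Deflate (as in PNG and ZIP), and no filter at all means raw samples. Below is a rough Python sketch under stated assumptions: the PDF is not encrypted, the function name image_filters is made up for illustration, and the byte-level scan is a heuristic, not a real PDF parser.

```python
import re
from collections import Counter


def image_filters(pdf_bytes):
    """Count the stream filters named in image object dictionaries (heuristic)."""
    counts = Counter()
    for chunk in pdf_bytes.split(b"endobj"):
        # Only inspect the dictionary part, i.e. everything before the stream data.
        header = chunk.partition(b"stream")[0]
        if not re.search(rb"/Subtype\s*/Image\b", header):
            continue                              # not an image object
        names = re.findall(rb"/(\w+Decode)\b", header)
        if not names:
            counts["none"] += 1                   # samples stored uncompressed
        for name in names:
            counts[name.decode("ascii")] += 1
    return counts


if __name__ == "__main__":
    import sys
    # Usage: python image_filters.py some.pdf   (the path is whatever you pass in)
    with open(sys.argv[1], "rb") as fh:
        for name, n in image_filters(fh.read()).most_common():
            print(f"{name}: {n}")
```

If a PDF built from PNGs reports only FlateDecode and one built from Jpegs only DCTDecode, that would support the view that the generator embeds the material without recoding; a generator that turns everything into DCTDecode would be the lossy kind mentioned earlier in the thread.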