
Using SkanKromSator

  • Nick Hodson
    Message 1 of 9 , Aug 1, 2008
      One of the most useful things that has come out of the recent thread about "Best and cheapest book scanning option" is Marvin D. Hernández' suggestion that we should try SkanKromSator, for straightening, despeckling and otherwise formatting texts. It seems to do a wonderful job, but its Help File appears to have been very badly automatically translated from the Russian, rendering it useless. I have had some success in using the program, so I have written up a piece about how to use it, and posted it at
      http://www.athelstane.co.uk/kromsate.htm

      I think I have done five or six books using it, and am very pleased.

      On the subject of how straightening ought to be done, which was followed up here by a note from Jon Noring, I should say that some years ago I wrote a "proper" straightener, just in the way that Jon indicates. Having decided how many degrees, or fractions of a degree, to rotate by, you need to decide on the midpoint of the file containing the text to be straightened. Every black pixel is then moved to a white target file, using a displacement calculated with sines and cosines. That is in fact the only really valid way to do the straightening. But there is one thing that people often forget: doing this introduces a population of speckles. This is because the transformation may cause two source pixels to land at the same place in the output file, leaving a gap elsewhere, with the result that there may be a white speck in the middle of black text, or, of course, a black speck in the white background. That is why despeckling, and possibly anti-aliasing, is quite an essential part of straightening, removing not just black pixels in the midst of white ones, but white pixels in the midst of black ones.
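The forward-mapping straightener Nick describes can be sketched in a few lines. This is only an illustration of the approach, not his actual code; the array layout and rounding choices are my own assumptions. It also shows where the speckles come from: two source pixels can round to the same target cell, so the output can only lose black pixels, never gain them.

```python
import numpy as np

def rotate_forward(img, degrees):
    """Rotate a bitonal image (True = black ink) by forward mapping.

    Every black pixel is moved to an initially white target array
    using a sine/cosine displacement about the image midpoint, as
    described above.  Because two source pixels can round to the same
    target cell, the output can lose black pixels -- the origin of
    the speckles that then need despeckling.
    """
    theta = np.radians(degrees)
    c, s = np.cos(theta), np.sin(theta)
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = np.zeros_like(img)                       # white target file
    ys, xs = np.nonzero(img)                       # the black pixels
    dy, dx = ys - cy, xs - cx
    ty = np.rint(cy + dy * c - dx * s).astype(int)
    tx = np.rint(cx + dy * s + dx * c).astype(int)
    keep = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)
    out[ty[keep], tx[keep]] = True
    return out
```

Rotating a solid black block with this and counting the black pixels before and after makes the collision loss visible directly.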

      And then there is the business of deciding what the angle of rotation is.

      Somehow SkanKromSator seems to have got all these elements of the job quite perfect.

      I am sorry that Jon has not had time to check SkanKromSator out, because his criticisms would have been useful. What I had asked for his input on was the half a dozen options the program offers for its straightening algorithm. As it takes five minutes for the program to process automatically through a normal-sized book, you might need ten minutes to check out each option, say an hour in total. Maybe with the assistance I have given by writing the above link, someone else could have a look at this issue. All I can say is that the results I am obtaining using this rapid and free program appear indistinguishable from the best I can get using any other method known to me, and it is a real pleasure working with the results.

      I shall be away from tomorrow for a few days, in Scotland, on an island 100 miles from the mainland, so will not easily be able to deal with correspondence for a week, but we do have broadband on the island, so I will be able to get emails if I can just get near a wi-fi source.

      With kind regards to all scanners and book-proofers.

      Nick Hodson
      London, England, United Kingdom


    • Jon Noring
      Message 2 of 9 , Aug 1, 2008
        Nick Hodson wrote:

        > One of the most useful things [snip] is Marvin D. Hernández'
        > suggestion that we should try SkanKromSator, for straightening,
        > despeckling and otherwise formatting texts. [snip] I have had some
        > success in using the program, so I have written up a piece about how
        > to use it, and posted it at http://www.athelstane.co.uk/kromsate.htm

        Thanks for the "tutorial". I hope sometime soon to try it out,
        especially to determine the algorithms it uses for deskewing and if
        it works for higher-resolution, 24-bit color depth images.


        > On the subject of how straightening ought to be done, which was
        > followed up here by a note from Jon Noring, I should say that some
        > years ago I wrote a "proper" straightener, just in the way that Jon
        > indicates. Having decided how many degrees, or fractions of degrees,
        > to do the rotation with, you need to decide on the mid point of the
        > file containing the text to be straightened. Every black pixel is
        > then moved to a white target file, using a displacement calculated
        > using sines and cosines. That is in fact the only really valid way
        > to do the straightening.

        Yes, a deskewing algorithm has to rotate the page around some point --
        the best may be the exact center of the raster image so there's no
        need to apply an X-Y translation (but the point of rotation doesn't
        need to be the center -- it can be anywhere on the page or even
        outside of it.)


        > But there is one thing that people often forget, and that is that
        > doing this introduces a population of speckles.

        Hmmm.


        > This is because the transformations performed on each pixel may
        > result in two pixels landing at the same place on the output file,
        > with the result that there may be a white speck in the middle of
        > black text. Or, of course a black speck just in the white
        > background. That is why despeckling and possibly anti-aliasing is
        > quite an essential part of straightening, removing not just black
        > pixels in the midst of white ones, but white pixels in the midst of
        > black ones.

        However, if one starts with a higher-resolution, 24-bit (RGB) scan,
        there's a lot more information there to use. And higher-end
        algorithms will do proximity analysis and interpolation to determine
        the final RGB value for the target pixel. Doing this should eliminate
        the "speckling". Most commercial image rotation algorithms found in
        high-end graphics programs, like Paint Shop Pro and of course
        Photoshop, use higher-level interpolation (although each still needs
        to be thoroughly tested before using in a project.)

        If all one has is a bitonal (black and white) image, one way around
        this speckling problem using the simple algorithm is to first greatly
        increase the image resolution by an integer multiplier, preferably
        odd (3x, 5x, 7x, etc). This effectively splits a single pixel (whether
        black or white) into a large number of pixels which are assigned the
        same color. So 1 pixel becomes 9, 25, 49, etc. pixels.
        Rotate the up-sampled image using the simple algorithm, then
        downsample back to the original resolution using the 50% threshold
        value. I think this will eliminate the speckling if the integer
        multiplier is high enough (what high enough is, though, needs to be
        determined by experiment.)
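Jon's up-sample/rotate/down-sample work-around can be sketched as follows. This is a rough illustration under my own assumptions about what the "simple algorithm" does; the naive forward rotation is re-implemented inside so the sketch is self-contained.

```python
import numpy as np

def _forward_rotate(img, degrees):
    """Simple forward-mapping rotation of a bitonal array
    (True = black ink), included so this sketch stands alone."""
    theta = np.radians(degrees)
    c, s = np.cos(theta), np.sin(theta)
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    out = np.zeros_like(img)
    ys, xs = np.nonzero(img)
    dy, dx = ys - cy, xs - cx
    ty = np.rint(cy + dy * c - dx * s).astype(int)
    tx = np.rint(cx + dy * s + dx * c).astype(int)
    keep = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)
    out[ty[keep], tx[keep]] = True
    return out

def rotate_supersampled(img, degrees, factor=5):
    """Up-sample by an odd integer factor, rotate with the simple
    algorithm, then down-sample back using the 50% threshold."""
    assert factor % 2 == 1, "an odd multiplier keeps a centre sub-pixel"
    # replicate each pixel into a factor x factor block of the same color
    big = np.kron(img.astype(np.uint8),
                  np.ones((factor, factor), dtype=np.uint8)).astype(bool)
    big = _forward_rotate(big, degrees)
    h, w = img.shape
    # a down-sampled cell is black when at least half its sub-pixels are
    counts = big.reshape(h, factor, w, factor).sum(axis=(1, 3))
    return counts >= (factor * factor) / 2.0
```

As Jon says, how large the multiplier must be to suppress the speckling entirely would have to be determined by experiment.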


        > And then there is the business of deciding what the angle of
        > rotation is.

        Yes, this is actually the biggest problem. Tools such as Paint Shop
        Pro make this pretty easy to do manually (draw a line which follows
        the line of text), but it is a lot of work if one has a 300 page
        book. (And things get interesting when the text itself has variable
        skewing due to the original typesetting being wonky -- here one needs
        to determine the "best" deskew angle. Ideally, what one would like to
        do for these pages is a higher-level page transformation by
        determining how the skew angle varies over the page. This is doable,
        but it simply may not be worth trying to develop.)
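One common automatic approach to the angle-finding problem is the projection-profile method. This is only an illustration of one technique, not a reproduction of what SkanKromSator actually does (its algorithm is undocumented): try candidate angles, project the ink pixels onto rows, and pick the angle whose row histogram is "spikiest".

```python
import numpy as np

def estimate_skew(img, max_angle=5.0, step=0.1):
    """Estimate the skew angle of a bitonal page (True = ink) by the
    projection-profile method: for each candidate angle, rotate the
    black-pixel coordinates and bin them into rows.  Straight text
    lines collapse into a few dense rows, so the histogram with the
    largest variance marks the best deskew angle."""
    ys, xs = np.nonzero(img)
    best_angle, best_score = 0.0, -1.0
    for a in np.arange(-max_angle, max_angle + step / 2, step):
        t = np.radians(a)
        # project each ink pixel onto the row axis at this angle
        rows = np.rint(ys * np.cos(t) - xs * np.sin(t)).astype(int)
        counts = np.bincount(rows - rows.min())
        score = counts.var()
        if score > best_score:
            best_angle, best_score = a, score
    return best_angle
```

Note this finds a single global angle per page; pages whose skew varies within the page, as described above, would need a more elaborate piecewise version.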


        > Somehow SkanKromSator seems to have got all these elements of the
        > job quite perfect.

        I'll have to try it! I'm curious to know the algorithm(s) it uses for
        the actual rotation and how it handles 24-bit RGB page scans.


        > I am sorry that Jon has not time to check SkanKromSator out...

        <smile/>

        Well, I guess I'll have to make the time and do so soon, since for a
        text digitization project I'm involved with, one item on my "to do"
        list is to survey deskewing and cropping tools.


        > What I had asked for his input upon was that there are half a dozen
        > options for the algorithm to do the straightening.

        Really? Interesting.


        > As it takes five minutes for the program to process automatically
        > through a normal-sized book...

        Well, I can give it a "book" of a few pages. <smile/>

        I have a high-rez, 24-bit scan set of a poorly typeset and poorly
        printed book which has variable skewing between and within pages. It's
        a good "acid test" of deskewing applications.


        > All I can say is that the results I am obtaining using this rapid
        > and free program appear indistinguishable from the best I can get
        > using any other method known to me, and it is a real pleasure
        > working with the results.

        Again, thanks for your recommendation -- I will definitely look at
        SkanKromSator. (Geez, what an awful name for English speakers -- the
        capital 'K' helps, but still, it's hard not to look at the first 5
        letters... If I try to communicate with the program's author, I'll
        mention this.)


        > I shall be away from tomorrow for a few days, in Scotland, on an
        > island 100 miles from the mainland, so will not easily be able to
        > deal with correspondence for a week, but we do have broadband on the
        > island, so I will be able to get emails if I can just get near a
        > wi-fi source.

        Hope you enjoy your trip! It sounds like a great place to visit, at
        least in the summer months.


        Jon Noring
      • Jon Noring
        Message 3 of 9 , Aug 4, 2008
          Everyone,

          At Nick's request, I looked at the four different deskewing algorithms
          used by SkanKromsator to determine if any used true interpolative
          rotation (versus lower quality "shearing" algorithms) and if, in my
          estimation, any are robust enough to use in serious scanning projects
          where maintaining the highest image quality throughout the work flow
          is important.

          The image file set of my simple test (all in lossless PNG format) is
          found here:

          http://www.windspun.com/SkanKromsator/

          I only looked at deskewing for high-resolution, 24-bit images. I did
          not look at despeckling (which I turned off in SkanKromsator) and did
          not look at cropping. I did not look at color depths less than 24-bit,
          such as bitonal images. Others are welcome to test bitonals since I
          have zero interest in using bitonals for any serious image processing.
          (I only consider bitonals a derivative, essentially throw-away,
          end-product for uses that call for them -- I'll never "master" in
          bitonal.)

          There are actually two parts to this evaluation: 1) determining the
          skew angle of the text, and 2) the actual image deskewing, both of
          which I'll discuss later.

          To conduct this test, I used a high-rez, 24-bit color depth lossless
          PNG of a page scan image which had previously been deskewed with Paint
          Shop Pro (using a manual procedure.) Based on prior tests, including
          comparison with Photoshop, I determined the rotational algorithm used
          by Paint Shop Pro to be excellent, pretty much identical in quality to
          that used in Photoshop (it should be obvious now that I like PSP a lot
          better than PhotoShop -- I've used PSP for years.)

          I then drew a one pixel wide black border on this image and increased
          the canvas size with a white background. The resulting source image
          is called 'original.png'. This black border is important to look at
          the subtleties of the various deskewing algorithms.

          Paint Shop Pro was then used to skew this source image three degrees
          counterclockwise, resulting in image 'skew-3deg-left-by-PSP.png'. This
          is the image I used to test deskewing with SkanKromsator (which I'll
          hereafter refer to as 'SK'.)

          [As an aside, I quickly determined that the default settings in SK are
          not geared towards 24-bit, high-resolution processing, so I had to go
          through the settings and make sure SK would not downsample the pixel
          size nor decrease the color depth. (And of course to turn off
          despeckling for this test.) This is unfortunate -- SK should, by
          default, do no downsampling of ppi nor color depth when it deskews.]

          As mentioned above, SK uses four different algorithms for deskewing:
          antialias, interpolation, shear, and fast. There are two more in the
          select list (thus the six which Nick asked I test), but I determined
          these extra two are simply "auto" which allows SK to pick one of the
          four algorithms based on some unknown criteria. Again, I do not like
          losing control this way, especially with a graphics program that is
          not well documented in English -- so I rarely use 'auto' on anything.

          The SK-deskewed images are found in the same directory (the two "auto"
          images are also there); each is obvious from its file name. For
          further comparison, I deskewed the test image with a three degree
          clockwise rotation using PSP.

          The best way to see the effect of the different algorithms is to look
          at the one pixel wide border, to see what happened to it during the
          deskewing.

          In my estimation, both the antialias and interpolation algorithms are
          as good as PSP, using true rotational/interpolative algorithms. I
          could not objectively determine if one is better than the other in
          deskewing, but today, as I looked at the images again, I slightly
          prefer interpolation for the 24-bit images -- your mileage may vary.

          By contrast, the shear and "fast" algorithms use faster "shear"
          algorithms where blocks of pixels are simply shifted as needed to
          compensate for the skew. This is seen in the "stair-stepping" of the
          border. This, of course, will result in a type of "rhomboidal"
          distortion of the characters.

          The other thing to test is how well SK determines the skew angle.
          Clearly SK did not deskew by exactly three degrees, but it was close.
          The original source image, since it was manually deskewed, may
          not have been the best it could have been.

          So, I conclude that SK appears to be good at determining the angle
          of skew (something which I'd test further if I ever plan to use SK),
          and does a good job at deskewing using either antialias or
          interpolation.

          Hope someone finds this simple test to be useful, and will contribute
          more analysis of the images -- to fill in the gaps that I miss here.

          Jon Noring
        • Lars Aronsson
          Message 4 of 9 , Aug 7, 2008
            Jon Noring wrote:

            > (I only consider bitonals a derivative, essentially throw-away,
            > end-product for uses that call for them -- I'll never "master"
            > in bitonal.)

            If you are scanning printed books or newspapers, many of them are
            also derivative, throw-away products made for a mass market, which
            makes bitonal scanning (TIFF G4, 600 dpi) a perfect match. And for
            rotating a bitonal image, any anti-aliasing interpolation is
            pointless, since you can only store 1 and 0 in the output anyway.
            The only way to do it right is the fast shear-based rotation that
            moves the bitonal pixels around (Alan W. Paeth's algorithm).
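Paeth's method decomposes a rotation into three shears (horizontal, vertical, horizontal). The sketch below uses whole-pixel row/column shifts rather than the sub-pixel filtering of the original paper, which is an assumption of mine, but it shows why the shear approach suits bitonal images: pixels are only moved, never merged or resampled, so the ink pixel count is preserved exactly and no speckles appear.

```python
import numpy as np

def _shear_rows(img, a):
    """Shift each row horizontally by a whole number of pixels
    (rint(a * distance-from-centre)), growing the canvas as needed."""
    h, w = img.shape
    shifts = np.rint(a * (np.arange(h) - (h - 1) / 2.0)).astype(int)
    pad = int(np.abs(shifts).max())
    out = np.zeros((h, w + 2 * pad), dtype=img.dtype)
    for y in range(h):
        out[y, pad + shifts[y]: pad + shifts[y] + w] = img[y]
    return out

def _shear_cols(img, b):
    # a vertical shear is a horizontal shear of the transpose
    return _shear_rows(img.T, b).T

def paeth_rotate(img, degrees):
    """Rotate by composing three shears:
    R(t) = ShearX(-tan(t/2)) . ShearY(sin t) . ShearX(-tan(t/2)).
    Rows and columns are only shifted, so every ink pixel survives --
    no speckles, at the cost of the stair-stepping Jon observed."""
    t = np.radians(degrees)
    a, b = -np.tan(t / 2.0), np.sin(t)
    return _shear_rows(_shear_cols(_shear_rows(img, a), b), a)
```

The canvas grows slightly with each shear; a production version would crop back to the original page size afterwards.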

            I haven't tried SkanKromsator, but I have successfully used such
            rotation on bitonal images of scanned books. Rotation is easy,
            the hard part is detecting how many degrees you need to rotate.
            Does SkanKromsator do that?


            --
            Lars Aronsson (lars@...)
            Project Runeberg - free Nordic literature - http://runeberg.org/
          • Jon Noring
            Message 5 of 9 , Aug 7, 2008
              Lars Aronsson wrote:

              > I haven't tried SkanKromsator, but I have successfully used such
              > rotation on bitonal images of scanned books. Rotation is easy,
              > the hard part is detecting how many degrees you need to rotate.
              > Does SkanKromsator do that?

              From the very limited test I did, SK did a good job at detecting
              the skew angle. However, one needs to feed SK a test suite of
              different page scans before one can conclude about the quality of
              its skew detection algorithm.

              Jon
            • Nick Hodson
              Message 6 of 9 , Aug 7, 2008
                Yes. That is the whole point. It does it beautifully. Nick (away on holiday).

                Lars Aronsson <lars@...> wrote:

                > Rotation is easy, the hard part is detecting how many
                > degrees you need to rotate. Does SkanKromsator do that?
              • Jon Noring
                Message 7 of 9 , Aug 7, 2008
                  Lars and Nick's replies seem to imply a view that 600 dpi bitonal
                   scanning is more than sufficient for all text digitization projects.
                  (Other than the need for color or higher resolution scans for
                  illustrations.)

                  If this is what they imply, I have to respectfully disagree.

                  Rather, I have stated what I consider should be the default scanning
                  requirements for every nascent text digitization project. Mine is a
                  conservative position, and if a decision is made to relax the scanning
                  requirements, for whatever reason(s), it is done with full knowledge
                  of the downsides of such relaxation.

                  Hopefully those here involved with text digitization projects will
                  provide their perspectives on scanning resolution and color depth. It
                  is an important discussion since nearly all the projects here employ
                  scanning in one form or another -- it is the one area of commonality.

                  In a prior reply, I outlined two reasons for my position:

                  1) Flexibility. For texts with typical yellowed paper, 600 dpi, 24-bit
                  color allows much greater flexibility for post-scan image
                  processing, even if the primary use (such as OCR) will be low-rez
                  bitonals. Such low-rez bitonals can trivially be generated from the
                  original hi-rez/full-color scans, with the important ability to
                  vary the reduction parameters such as bitonal threshold, and to use
                  selected color channels.

                  For example, I mentioned that Lee and I have done some preliminary
                  OCR experiments where I took a 600 dpi, 24-bit page scan (such as
                  the one I placed online for SkanKromsator deskewing experiment --
                  link below) and generated a set of derivatives using Paint Shop Pro
                  based on the following values matrix (these are off the top of my
                  head -- my memory for the specifics may be a little off without
                  digging out all the test files, currently buried somewhere):

                  600 dpi vs. 300 dpi
                  24-bit vs. grey vs. bitonal
                  red vs. green vs. blue (color channels, each is "greyscale")

                  (And for conversion to bitonal, several bitonal threshold
                  values of 70 through 190 in steps of 10 -- this was done for
                  both the original and for each of the three color channels.)

                  (This leads to a significant number of possibilities...)
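The derivative matrix described above can be sketched as follows. The channel weights and key names are my own choices (not necessarily what Paint Shop Pro uses), and the 600-vs-300 dpi resampling dimension is omitted for brevity.

```python
import numpy as np

def derivative_matrix(rgb, thresholds=range(70, 200, 10)):
    """Build greyscale and bitonal derivatives from one 24-bit scan:
    a luminance greyscale plus the red, green and blue channels, and,
    for each of those, one bitonal image per threshold value (70
    through 190 in steps of 10, as in the matrix above).  Returns a
    dict of 2-D arrays keyed by a descriptive name."""
    channels = {
        "grey": (0.299 * rgb[..., 0] + 0.587 * rgb[..., 1]
                 + 0.114 * rgb[..., 2]),
        "red": rgb[..., 0].astype(float),
        "green": rgb[..., 1].astype(float),
        "blue": rgb[..., 2].astype(float),
    }
    out = {}
    for name, grey in channels.items():
        out[name] = grey
        for t in thresholds:
            # True = paper (at or above threshold), False = ink
            out[f"{name}-bitonal-{t}"] = grey >= t
    return out
```

Running each derivative through OCR and comparing the results is then straightforward, which is the kind of experiment described above.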

                  The OCR results, and how they varied, were quite interesting, and
                  strongly suggest that it is a good thing we preserve hi-rez, high-
                  color-depth originals. It also suggests ways to increase OCR
                  accuracy by comparing/mixing/analyzing the results from different
                  derivatives of the original scan.

                   If a project captures only 600 dpi bitonals from the
                   scanner, it forever loses potentially useful information.

                  2) Preservation and Direct-Use. Regardless of how some here view old
                  documents as "throw away", it should be the default position that
                  preserving the original look of the documents is important, even if
                  they are bland ink on off-white paper and originally sold as "pulp
                  rags".

                  For example, here's the 600 dpi, 24-bit page scan from the "My
                  Antonia" scan set I produced a couple years ago:

                  http://www.windspun.com/SkanKromsator/original.png

                  (Yes, quite a large image -- JPEG 2000 compression used gently
                  at 90% size reduction incurs no loss of quality -- any image
                  changes from such lossy compression are roughly comparable to the
                  natural noise level of the scanner, as I reported here a while
                  back.)

                  It is clear that this looks *nice* and truly preserves the original
                  page (ignore what I did with the border and white canvas -- that
                  was applied for the deskewing experiment. I have the original
                  "original" available.)

                  The bitonal equivalent of this image royally sucks for direct use
                  and for preservation. And the greyscale version (even if from the
                  preferred red channel) has lost some punch. There's simply no
                  comparison.

                  And consider that your scan of the document/book may be the only
                  one to ever be preserved. That old pulp rag may be "pulpy" to you,
                  but your scan set may be the only one ever preserved for that rare
                  but junky pulp rag... What do you want to pass on to future
                  generations?


                  To summarize, I believe that every text digitization project should,
                  as a starting default position in their planning, consider scanning
                  all texts at the minimum 600 dpi and 24-bit depth color (for
                  illustrations, even higher rez as Juliet Sutherland at DP does -- the
                  reasons are for another topic.)

                  If the default resolution requirements are too burdensome for the
                  particular project (e.g., it takes too long to scan a page and there's
                  not enough funds available to buy a higher-speed scanner), then
                  scaling back the requirements can certainly be done, so long as the
                  decision is made with "eyes wide open."

                  (For compromise, I would at least try to capture 600 dpi grey scale
                  or 300 dpi full-color -- the choice between the two depends upon
                   a few factors I won't outline here. For greyscale, if the paper is
                   significantly yellowed, consider capturing the red or green channel.
                  For example, for the scan set associated with the above-linked page
                  scan image, overall the red channel gives the best visual grey-scale
                  quality -- significantly so. The green is next best, and, as expected,
                  the blue channel is dreadful.)

                  Nevertheless, in my opinion these are unfortunate compromises, and
                  should not be touted as preferred general practice.

                  My only goal here is to make the dozens of text digitization projects
                  represented in this group aware of the issues as I see them, so any
                  decision they make in the future on what they capture and preserve
                  with their scanners is done based on weighing the various factors.
                  To make decisions with "eyes wide open."

                  Jon Noring
                • David Starner
                  Message 8 of 9 , Aug 7, 2008
                    On Thu, Aug 7, 2008 at 8:33 PM, Jon Noring <jon@...> wrote:
                    > Lars and Nick's replies seem to imply a view that 600 dpi bitonal
                    > scanning is more than sufficient for all text digitization projects.
                    > (Other than the need for color or higher resolution scans for
                    > illustrations.)

                    That's a pretty bizarre reading IMO; I infer from that statement that
                    you are, blindly or willingly, polarizing the discussion.

                    > 2) Preservation and Direct-Use. Regardless of how some here view old
                    > documents as "throw away", it should be the default position that
                    > preserving the original look of the documents is important, even if
                    > they are bland ink on off-white paper and originally sold as "pulp
                    > rags".

                    As Alice Cooper sings:
                    "There is one thing
                    I mean everything has a price
                    I really hate to repeat myself
                    But nothing's free"

                    I could, and should, back up my 21 GB directory of scans and PG stuff.
                    There's no way I could do so if it were 10 times the size; heck, it
                    wouldn't fit on my hard-drives even without all the other junk that's
                    on there. There's no way this is worth it for me.
                  • Jon Noring
                    Message 9 of 9 , Aug 7, 2008
                      David Starner wrote:

                      > Jon Noring wrote:

                      >> Lars and Nick's replies seem to imply a view that 600 dpi bitonal
                      >> scanning is more than sufficient for all text digitization projects.
                      >> (Other than the need for color or higher resolution scans for
                      >> illustrations.)

                      > That's a pretty bizarre reading IMO; I infer from that statement
                      > that you are, blindly or willingly, polarizing the discussion.

                      Hmmm, ok, it is not my intent to polarize the discussion, but if my
                      prior message came out that way, I apologize. It should be clear by
                      now that I feel quite strongly on this topic.

                      Hopefully others will weigh in with their thoughts, and of course,
                      criticisms, on what I've written.


                      >> 2) Preservation and Direct-Use. Regardless of how some here view
                      >> old documents as "throw away", it should be the default position
                      >> that preserving the original look of the documents is important,
                      >> even if they are bland ink on off-white paper and originally sold
                      >> as "pulp rags".

                      > As Alice Cooper sings:
                      > "There is one thing
                      > I mean everything has a price
                      > I really hate to repeat myself
                      > But nothing's free"
                      >
                      > I could, I should, backup my 21 GB directory of scans and PG stuff.
                      > There's no way I could do so if it were 10 times the size; heck, it
                      > wouldn't fit on my hard-drives even without all the other junk that's
                      > on there. There's no way this is worth it for me.

                      Agreed that each text digitization project has to come to terms with
                      resource limitations.

                      Personally, backing up 200 gigs is easy for me. Those projects which
                      have some funds can get backup SATA hard-drives for dirt cheap --
                      and/or simply write a bunch of 5 gig DVD-ROMs. For hard-disk backup I
                      use the Thermaltake BlacX -- wonderful device -- highly recommended.

                      A few years ago the biggest argument against capturing hi-rez, full
                      color scans was disk space. Today, for even minimally funded projects,
                      this is no longer an issue. Even scanners are faster, and those with
                      $$$ can buy commercial scanners using digital cameras (e.g. Atiz)
                      which are ultra-fast.

                      And with the development of JPEG 2000, we can now capture a typical
                      300 page book at 600 dpi, 24-bit color and fit the raw scanset on a
                      CDROM with no real loss of quality (as I've noted before -- do no
                      more than 90% size reduction of the original bitmap.)

                      The only argument left against doing 600 dpi/24-bit scan sets is time
                      -- the added time to scan each page. For digital camera scanners, this
                      is not an issue. And I believe even the cheaper flatbed scanners are
                      today faster than those from a few years ago.

                      Jon Noring