Re: Online Text Converter for eBooks

  • Andrius Kulikauskas
    Message 1 of 2 , Aug 30, 2009
      Note to Janet, Edward: Ricardo and I are creating an online (and then
      later, offline) converter that converts (lays out) any text into a
      series of JPG images (or even an MPG movie) that can then be viewed on a
      television with a DVD player or with a mobile phone or ebook reader (as
      a PDF file).

      Meredith, It's great to hear from you. Your letters are wonderfully
      helpful. I will try out the LaTeX approach, likely with Python. I
      avoided using LaTeX in my math studies because it was not "What you see
      is what you get" and my thesis was 200 pages long. And I didn't want to
      have a math career. I was the first Ph.D. student at UCSD (in 1993) to
      get away with using Microsoft Word with Equation Editor.

      I will work a bit more and I think, if what we're creating is novel, I
      should look for some funding for this because I think it would be very
      useful and we have the network to explore and implement that. For
      example, there's the Python Software Grants program.
      http://www.python.org/psf/grants/

      I wonder whether the Bible is available in such a format, and who might
      sponsor the conversion and, possibly, its distribution?

      Or other texts (including advertisements)?

      It would be good if I could earn about $2,500 for one month's work (more
      likely, two or three months part-time) and it would be great if Minciu
      Sodas participants could earn another $2,500 for working this out
      on-the-ground and achieving some project in distribution, training,
      tutoring (reading etc.). So the total amount would be about $5,000.

      Andrius

      Andrius Kulikauskas
      Minciu Sodas
      http://www.ms.lt
      ms@...




      Meredith L. Patterson wrote:
      > On Sat, Aug 29, 2009 at 9:51 PM, ricardoolpc<ricardoolpc@...> wrote:
      >
      >> Do you need to double-up forward-slashes in file paths, so / becomes //. The
      >> reason is in some languages like C, a slash followed by a character is a
      >> single control character. For example, /n is newline, /r is return, /t is
      >> tab, etc.
      >>
      >
      > You're thinking about backslashes, not forward slashes. \n is newline,
      > \t is tab, &c.
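
The distinction Meredith draws can be checked directly. The sketch below uses Python rather than C, and the file paths are hypothetical, chosen only because they contain "n" and "t" after a separator:

```python
# Backslash starts an escape sequence; forward slash is an ordinary character.
newline = "\n"   # one character: a line feed
slash_n = "/n"   # two characters: '/' followed by 'n'

# Hypothetical paths for illustration only.
unix_path = "books/new/text.txt"       # forward slashes need no doubling
windows_path = r"books\new\text.txt"   # raw string keeps backslashes literal

print(len(newline), len(slash_n))
print(unix_path.split("/"))
print(windows_path.split("\\"))  # in a normal string, one backslash is written as \\
```

So forward slashes in paths never need doubling; only backslashes do, and only inside ordinary (non-raw) string literals.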
      >
      >
      >> Likewise, calculating how many characters will fit on a line is very hard to
      >> predict in advance, with a proportional font (variable width characters for
      >> W, i, etc). It may be best to start with a fixed-width font like Courier
      >> New, display some text and measure how many pixels wide each character is.
      >> Hence, how many characters fit on a line.
      >>
      >
      > You guys are trying to reinvent the wheel here. Why not just use free
      > software that was designed for exactly the task you're trying to
      > accomplish? LaTeX will do all of this for you.
      >
      >
      >> save it to a file (perhaps a simple 24-bit/3-bytes per pixel for
      >> red/green/blue uncompressed .BMP file) and split it/convert it into multiple
      >> JPEG, one per page.
      >>
      >
      > Egad, no -- why waste so much space if all you're dealing with is text
      > and bandwidth is at a premium? If you're absolutely dead set on
      > reinventing the wheel, then at least use PNG at 1 bit per pixel, for
      > plain black and white -- if you want anti-aliasing, use grayscale at,
      > say, 4 bits per pixel.
      >
      >
      >> An uncompressed BMP file is just a fixed sized
      >> information header block, then 3 bytes per pixel, for 24-bit colour, so
      >> fairly simple to chop into Raw image files for pages, then convert each raw
      >> file to JPEG with an off the shelf, command-line image converter. For
      >> example each 1024 x 768 page would be a 0.75 Megabyte block of data.
      >>
      >
      > Nope. If you're using 24-bit colour, that's 1024 * 768 * 3 = 2359296
      > bytes, roughly 2.3 MB. In contrast, 1bpp is 98304 bytes and 4bpp is
      > 393216 bytes.
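
The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch: the function name is made up for illustration, and it counts raw pixel data only, ignoring any file header and any compression.

```python
def raw_page_bytes(width, height, bits_per_pixel):
    """Bytes of uncompressed pixel data for one page (no header, no compression)."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8  # round up to whole bytes

# Compare the three bit depths discussed for a 1024 x 768 page.
for bpp in (24, 4, 1):
    print(bpp, "bpp:", raw_page_bytes(1024, 768, bpp), "bytes")
```

For a 1024 x 768 page, a 1 bpp image is 24 times smaller than a 24-bit one before any compression is applied.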
      >
      >
      >> Some of this could be a command-line batch file, with a sequence of 2 or 3
      >> file converters. On Linux PCs, programs can pipe the output file from one
      >> program into another on the command line, using > i think.
      >>
      >
      >
      > The > operator redirects the output of a process to a file. The pipe
      > operator, |, redirects the output of one process to the input of
      > another, e.g.:
      >
      >
      > grep foo bar | wc -l
      >
      > uses 'grep' to search for all occurrences of the string "foo" in the
      > file ./bar; the output of grep is one occurrence per line, so feeding
      > that to 'wc' with the -l flag counts those lines.
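
What the pipe operator does can also be sketched from Python with the standard subprocess module. To keep the example self-contained, the two stages here are small child Python processes standing in for grep and wc; the function name and the stand-in scripts are assumptions made for illustration, not part of the thread.

```python
import subprocess
import sys

# Stand-in for `grep NEEDLE`: copy matching lines from stdin to stdout.
GREP_LIKE = (
    "import sys\n"
    "needle = sys.argv[1]\n"
    "for line in sys.stdin:\n"
    "    if needle in line:\n"
    "        sys.stdout.write(line)\n"
)
# Stand-in for `wc -l`: count the lines arriving on stdin.
WC_LIKE = "import sys\nprint(sum(1 for _ in sys.stdin))\n"

def pipe_count_lines(text, needle):
    """Run two processes with the first one's stdout piped into the second's stdin."""
    first = subprocess.Popen([sys.executable, "-c", GREP_LIKE, needle],
                             stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
    second = subprocess.Popen([sys.executable, "-c", WC_LIKE],
                              stdin=first.stdout, stdout=subprocess.PIPE, text=True)
    first.stdout.close()      # the second process now owns the read end of the pipe
    first.stdin.write(text)   # feed the input text to the first stage
    first.stdin.close()
    out, _ = second.communicate()
    first.wait()
    return int(out.strip())

print(pipe_count_lines("foo\nbar\nfood\n", "foo"))
```

The shell sets up the same plumbing when it sees `|`; neither program needs to know the other exists.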
      >
      > Or, if you were to write, say, a sed or python script to format your
      > text files using LaTeX markup, generating a PDF would look like this:
      >
      > python markupscript.py | latex | dvi2ps | ps2pdf - output.pdf
      >
      > or if you absolutely had to have image files,
      >
      > python markupscript.py | latex | dvi2ps | ps2pdf - | convert <options>
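
The thread does not show markupscript.py itself. As a rough idea of what such a script might contain, here is a minimal sketch that escapes LaTeX's special characters and wraps plain-text paragraphs in a bare article document. The function names, the document class, and the paragraph convention (blank line between paragraphs) are all assumptions.

```python
# Characters that LaTeX treats specially, with their escaped forms.
LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#", "_": r"\_",
    "{": r"\{", "}": r"\}",
    "~": r"\textasciitilde{}", "^": r"\textasciicircum{}",
}

def escape_latex(text):
    """Escape one character at a time so replacements are never re-escaped."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)

def to_latex(text):
    """Wrap plain text (blank-line separated paragraphs) in a minimal LaTeX document."""
    paragraphs = [" ".join(p.split()) for p in text.split("\n\n") if p.strip()]
    body = "\n\n".join(escape_latex(p) for p in paragraphs)
    return ("\\documentclass{article}\n"
            "\\begin{document}\n"
            + body +
            "\n\\end{document}\n")

print(to_latex("Costs rose 50% & more.\n\nSee file_name for details."))
```

A real script would read the text from a file or standard input and would probably also set the page geometry to match the target screen size.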
      >
      > --mlp
      >
      >
      > ------------------------------------
      >
      > Please note our rule: Each letter sent to this group enters the Public Domain unless it explicitly states otherwise. http://www.ethicalpublicdomain.org
      >
      >
      >
      >
      >
    • Andrius Kulikauskas
      Message 2 of 2 , Sep 2, 2009
        Edward,
        Yes, that converts PDF to JPG.
        But instead we want to input ASCII (or Unicode) text (typically, input
        with a browser into a text box, or uploaded as a text file) and lay it
        out line by line and paragraph by paragraph so that it works for a
        chosen image size (screen size) say 320x480 as JPG files that could then
        be viewed with a DVD player, a mobile phone, an ebook reader, a digital
        picture frame, etc.
        Do you know of any program or script that can do that? I haven't found
        any yet so I'm looking but also creating my own code. Right now I'm
        setting up ImageMagick on our server as you suggest. (Not trivial to
        address all the dependencies.)
        Also, as I wrote earlier, there are difficulties with finding and using
        open source font files; I haven't understood all the issues yet. As
        Meredith suggested, I will look into LaTeX.
        I think this is all relevant for open source textbook publishing and
        distribution, too. It's likely that it hasn't been done (or not much)
        and it's a very promising direction (and, I imagine, a set of business
        opportunities).
        Andrius

        Andrius Kulikauskas
        Minciu Sodas
        http://www.ms.lt
        ms@...
        +370 699 30003
        Dukiskes, Lithuania
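
The layout step described in this message (plain text in, fixed-size pages out) can be sketched with Python's standard textwrap module. This is only a rough illustration under a strong assumption: a fixed-width font whose character cell is 8 x 16 pixels, a hypothetical size. Drawing each page onto a 320x480 image and saving it as a JPG would still need an image library such as the PHP GD functions or ImageMagick mentioned in this thread.

```python
import textwrap

def paginate(text, page_w=320, page_h=480, char_w=8, line_h=16):
    """Split text into pages of wrapped lines for a fixed-width font.

    char_w and line_h are assumed pixel dimensions of one character cell.
    """
    cols = page_w // char_w   # characters that fit on one line
    rows = page_h // line_h   # lines that fit on one page
    lines = []
    for para in text.split("\n\n"):
        # Collapse internal whitespace, then wrap the paragraph to the line width.
        lines.extend(textwrap.wrap(" ".join(para.split()), width=cols))
        lines.append("")      # blank line between paragraphs
    if lines and lines[-1] == "":
        lines.pop()           # no trailing blank line after the last paragraph
    return [lines[i:i + rows] for i in range(0, len(lines), rows)]

pages = paginate("word " * 400)
print(len(pages), "pages;", len(pages[0]), "lines on the first page")
```

With a proportional font the character count per line is no longer constant, so the wrapping would have to measure each candidate line in pixels instead of counting characters.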


        Edward Cherlin wrote:
        > On Tue, Sep 1, 2009 at 11:46 PM, ricardoolpc<ricardoolpc@...> wrote:
        >
        >> Hi Ed
        >>
        >> You said "There are numerous such programs in existence under GPL". As others have already pointed out, there's no need for Andrius and I to re-invent the wheel, if there are suitable programs already. Do you know any specific programs we could look at? Do they do the whole text file-to-JPEG pages process in one go, or just part of it?
        >>
        >
        > You can search for Free Software using a variety of tools, including
        > Google and various package management tools. I use Synaptic on Ubuntu.
        >
        >
        >> We're already looking at the LaTeX typesetting program for authoring eBooks in a Device-Independent form (no specific No. of characters per line). It handles math symbols very well for Andrius's book. It can output a DVI file, then we can use dvipng to convert them to PNG images, and ImageMagick to convert PNG to JPEG. If there's a good GPL-Licensed program that converts text file-to-JPEG pages in one go, that would be even better.
        >>
        >
        > No, that's what I was thinking of, but under script control. But this
        > is what you are looking for.
        >
        > http://ubuntuforums.org/showthread.php?t=489877
        >
        > Convert PDF to JPG
        > Thanks, Imagemagick does the trick.
        >
        > It's as simple as
        >
        > Code:
        >
        > convert abc.pdf abc.jpg
        >
        > Imagemagick installs on Ubuntu as
        >
        > /usr/bin/compare
        > /usr/bin/animate
        > /usr/bin/convert
        > /usr/bin/composite
        > /usr/bin/conjure
        > /usr/bin/import
        > /usr/bin/identify
        > /usr/bin/stream
        > /usr/bin/display
        > /usr/bin/montage
        > /usr/bin/mogrify
        >
        > From man convert:
        >
        > NAME
        > convert - convert between image formats as well as resize an image,
        > blur, crop, despeckle, dither, draw on, flip, join, re-sample, and much
        > more.
        >
        > SYNOPSIS
        > convert input-file [options] output-file
        >
        > OVERVIEW
        > The convert program is a member of the ImageMagick(1) suite of tools.
        > Use it to convert between image formats as well as resize an image,
        > blur, crop, despeckle, dither, draw on, flip, join, re-sample, and much
        > more.
        >
        > For more information about the convert command, point your browser to
        > file:///usr/share/doc/imagemagick/www/convert.html or
        > http://www.imagemagick.org/script/convert.php.
        >
        > The other utilities do a variety of other useful things, so you should
        > take a look at the other man pages.
        >
        >
        >> Our main discussion on publishing eBooks as JPEGs for DVD players is in the learnhowtolearn yahoo group.
        >>
        >> http://tech.groups.yahoo.com/group/learnhowtolearn/
        >>
        >> Ricardo
        >>
        >>
        >> --- In earthtreasury@yahoogroups.com, Edward Cherlin <echerlin@...> wrote:
        >>
        >>> There are numerous such programs in existence under GPL.
        >>>
        >>> On Fri, Aug 28, 2009 at 2:12 PM, Andrius Kulikauskas<ms@...> wrote:
        >>>
        >>>> Ricardo,
        >>>>
        >>>> I worked a bit today to create an online text-to-jpg for eBooks. I
        >>>> created a wiki page:
        >>>> http://www.worknets.org/wiki.cgi?OnlineTextConverter
        >>>>
        >>>> I'll be using PHP's GD graphic library, there are functions like:
        >>>> http://us.php.net/manual/en/function.imagefttext.php  Free Type fonts
        >>>> http://us.php.net/manual/en/function.imagepstext.php  PostScript Type 1
        >>>> fonts
        >>>> http://us.php.net/manual/en/function.imagettftext.php  True Type fonts
        >>>>
        >>>> Ricardo, please, can you find me a font file that you'd like to use?
        >>>>
        >>>> Also, can you research, should we use Free Type fonts? What are the
        >>>> patent issues otherwise?
        >>>>
        >>>> Perhaps others have ideas or would like to help? Any test texts we might do?
        >>>>
        >>>> Andrius
        >>>>
        >>>> Andrius Kulikauskas
        >>>> Minciu Sodas
        >>>> http://www.ms.lt
        >>>> ms@...
        >>>>
        >>>>
        >>>>
        >>>>
        >>>> ------------------------------------
        >>>>
        >>>> Please note our rule: Each letter sent here enters the Public Domain unless it explicitly notes otherwise. In case your legal system does not recognize this claim, you may use these letters under CC0, "No Rights Reserved". http://creativecommons.org/about/cc0
        >>>
        >>> --
        >>> Edward Mokurai Cherlin
        >>> Silent Thunder (默雷/धर्ममेघशब्दगर्ज/دھرممیگھشبدگر ج) is my name, and Children are
        >>> my nation. The Cosmos is my dwelling place, the Truth my destination.
        >>> http://earthtreasury.org/
        >>>
        >>>
        >>
        >>
        >
        >
        >
        >