Re: Thursday chat about Text Converter
First, thank you for the great progress in the last few weeks. We've
really achieved a lot. We're close to making it easy to publish ebooks
for DVD players, mobile phones, etc.
I think we now need to ask: what kinds of ebooks would be most useful
for our participants in Africa, India, and so on? That should dictate
how we focus our technological efforts.
Here are some opportunities:
* Josephat Ndibalema is championing Kiswahili (see our wiki!
http://www.worknets.org/sw/ ) and, in general, we can serve local and
* Graham Knight's Do-It-Yourself DIY Solar (solar panels for recharging
mobile phones) is a good example of a project that we could support with
ebooks in local languages, as are your many pages at
http://www.worknets.org/wiki.cgi?Ricardo . Franz Nahrada also wants
us to engage participants at Thingiverse who might work together with
us, and we might try out their projects.
* Ananya Guha (India) has suggested children's books, which are also a
focus of Masimba Biriwasha (Zimbabwe). Indeed, it would be good for us
to collect, share and write children's stories and study them in terms
of the values that we and they are espousing. That's a great way to
research our own values, including our deepest values, and to create a
new culture, growing with children.
* Janet Feldman's guide for Blogging Positively and other such materials
are also great.
It's good to learn of the new EPUB standard for ebooks. However, it
seems that EPUB is just a way to package HTML pages. The HTML pages
need to be written "correctly", unambiguously as XHTML, and they are
organized with a side bar and zipped together. But otherwise there
doesn't seem to be any constraint on them. Indeed, they can even
include CSS stylesheets. So writing an EPUB converter is comparable to
writing an HTML browser (and one that will look nice on all kinds of
screens). That's outside of my resources. As you note, there are sites
like http://feedbooks.com that are doing this automatically, converting
EPUB into all kinds of formats. Indeed, http://feedbooks.com is
wonderful for us - not only do they publish classics in the Public
Domain, but they also require new authors to publish in the Public
Domain! And they are attracting authors!
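To illustrate the point that an EPUB is just zipped pages, here is a
minimal Python sketch (not something we have built) that opens an EPUB
as a zip archive and lists the HTML pages inside. The function name
list_epub_pages is hypothetical, and recognizing pages by file extension
alone is an assumption; a real reader would consult the package manifest.

```python
import zipfile

PAGE_EXTENSIONS = (".xhtml", ".html", ".htm")  # assumed page extensions


def list_epub_pages(epub_file):
    """Return the names of the HTML pages packed inside an EPUB.

    epub_file may be a path or a file-like object.
    """
    with zipfile.ZipFile(epub_file) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith(PAGE_EXTENSIONS)]
```

Listing the pages is the easy part. Displaying them well, with their
stylesheets, on every kind of screen is the part that resembles writing
a browser.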
Similarly, creating a PDF browser or PDF parser is unrealistic and
unnecessary for us.
Instead, we already have a converter that works. And improving it a bit
will make it work quite well. What we're achieving is our own format, a
very simple way to organize text into paragraphs, without any line
breaks, and some assumptions as to what will happen when paragraphs are
just a line or two (they will be listed on the same screen), and when
they are longer than that (they will get their own screen). As far as I
know, we don't need italics, bold, font sizes, page display or anything
else for our purposes. Instead, we have a real need to intersperse (and
perhaps to overlay) photos, audio, video and diagrams.
Our texts will determine our needs. We need texts!
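The screen rule described above can be sketched in Python roughly as
follows. This is only an illustration under stated assumptions, not our
actual converter: the names to_paragraphs and to_screens are
hypothetical, and the line width, screen height and the cutoff of two
lines for a "short" paragraph are guesses that each device would tune.

```python
import re
import textwrap

LINE_WIDTH = 40      # assumed characters per line on the device
SCREEN_LINES = 10    # assumed lines that fit on one screen
SHORT_MAX_LINES = 2  # "a line or two" counts as a short paragraph


def to_paragraphs(text):
    """Split text on blank lines and remove line breaks inside paragraphs."""
    blocks = re.split(r"\n\s*\n", text.replace("\r\n", "\n"))
    return [" ".join(block.split()) for block in blocks if block.strip()]


def to_screens(paragraphs):
    """Group short paragraphs on shared screens; give long ones their own."""
    screens = []
    current = []  # lines of short paragraphs collected for one screen
    for paragraph in paragraphs:
        lines = textwrap.wrap(paragraph, LINE_WIDTH)
        if len(lines) <= SHORT_MAX_LINES:
            needed = len(lines) + (1 if current else 0)  # blank separator
            if len(current) + needed > SCREEN_LINES:
                screens.append(current)
                current = []
            if current:
                current.append("")
            current.extend(lines)
        else:
            if current:
                screens.append(current)
                current = []
            screens.append(lines)  # a long paragraph gets its own screen
    if current:
        screens.append(current)
    return screens
```

Each screen is a list of lines ready to be drawn. The sketch does not
split a paragraph that is longer than one screen, and it leaves out
photos, audio and video, which would need their own screens.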
This is like the difference between websites and emails. We use emails
(typically, plain text) for most of our activity and that is why we are
so inclusive, whereas much of the world's activity is oriented around
websites. Our strength is in activity where we can minimize the
importance of layout and focus on content. We're coming up with that
kind of ebook and others may not.
Some steps we could take:
* Choose texts that one or more of our participants want.
* Translate them to local languages as needed.
* Hand-craft some custom-made ebooks and improve our tools accordingly.
* Develop our own format for "Electric Books" (or whatever we call it).
* Ask http://feedbooks.com to convert ePub to our format (and from
there, possibly, to the formats of various output devices).
* Ask somebody else to convert PDF to our format.
Then, we can build and encourage custom converters that take assorted
text (like email archives) and extract the paragraphs, which is to
say, convert them to our format. We can also encourage people to write
ePub books or PDF books, though they will still get better results if
they simply output the text.
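As an example of such a custom converter, here is a rough Python sketch
that takes one plain-text email and returns its paragraphs. It is an
assumption-laden illustration: the name email_to_paragraphs is
hypothetical, the message is assumed to be plain text (not multipart),
quoted replies are assumed to start with ">", and the signature is
assumed to follow the conventional "-- " line.

```python
import email
import re


def email_to_paragraphs(raw_message):
    """Return the new (unquoted) paragraphs of a plain-text email."""
    message = email.message_from_string(raw_message)
    body = message.get_payload()  # a string for non-multipart messages
    kept = []
    for line in body.splitlines():
        if line.rstrip() == "--":          # signature delimiter: stop here
            break
        if line.lstrip().startswith(">"):  # quoted reply: skip it
            continue
        kept.append(line)
    blocks = re.split(r"\n\s*\n", "\n".join(kept))
    # Join the hard-wrapped lines so each paragraph has no line breaks.
    return [" ".join(block.split()) for block in blocks if block.strip()]
```

An archive converter would loop over the messages and apply the same
steps to each one.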
I think we're having very good success with our own "formatless" format
and that is something we can build on. The value of our format is that
it is maximally flexible, it can be satisfactory on all devices, and it
allows for maximally rapid publishing (especially relevant if we add
photos, audio and video).
Much to discuss tomorrow!
Thank you, Ricardo and all,
+370 699 30003
> Hi Andrius
> this is just to let you know I'll be at the Thursday 10th Sept 16:00
> EAT Worknets chat, where we can chat about the Text Converter, etc.
> If you haven't seen it already, could you look at...
> "Text Converter way forward - Using epub as the input format"
> Here's a summary...
> If we start with plain-text, then I think we will always be struggling
> to get the output looking right.
> I think we need to start with a format that identifies the various
> elements of a document (ordinary prose paragraphs or lists, etc). We
> can use a format based on XML, a Markup Language, with tags to
> identify the elements. The emerging standard for eBook distribution is
> epub, so that may be a good choice. Authors/publishers create their
> eBooks in epub format or convert a word-processor document to epub.
> They send the eBook in a single file format, epub, to eBook
> distributor websites, like Amazon or eBook.com, instead of the old way
> of sending many different formats. The distributor website then uses
> converters to allow downloads in many eBook Reader formats
> (lit/pdf/pdb/txt/etc and epub itself).
> Authors can submit eBooks to us in epub format. We don't care how
> different authors create their epub files. They may write and format
> an eBook in their favourite word-processor, then convert
> .doc/.odt/etc to .epub, using an existing converter program.
> Alternatively, they may use a word-processor/eBook authoring program
> that outputs epub directly.
> For math books, you could use LyX as your word-processor, because of
> its excellent formatting of math formulae, save the .TeX file and
> convert to epub using an existing converter program.
> We would produce a converter to convert epub to JPEG files, for
> viewing on DVD Player+TV.
> We wouldn't need to concern ourselves about other eBook Reader
> devices. People can use existing epub-to-<other eBook format>
> converters, for example epub-to-pdf/lit/pdb/txt/etc.
> This is WHAT I propose we do. I haven't gone into HOW yet, but we
> should try to just use existing free open-source library functions or
> programs, rather than writing and maintaining a lot of code for years
> to come, and dealing with feature-requests. I'd like you to look at
> the message, which points to new epub sections on our Worknets page,
> and let me know what you think about this way forward.
> An epub-to-JPEG converter could be generally useful, to other
> people/projects/devices, so we might publish it for anyone to use or
> adapt. For example, for owners or manufacturers of Digital Photo
> Frames or other devices that aren't intended to run programs.