Re: [Pali] Re: CSCD to text file conversion utility

  • Hans Van Slooten
    Message 1 of 8 , Jun 9, 2004
      Dear Bhante,

      Thanks for the information. It should be fairly trivial to modify my application to output a UTF-8 text file with the proper Unicode encodings. I may have something within the next couple of days for the list to examine.

      Thanks again for the information,
      Hans


      On Wednesday, June 09, 2004, at 08:25AM, Bhikkhu Pesala <pesala@...> wrote:

      >Unicode is not a font, but an international standard www.unicode.org for
      >the allocation of characters in most languages. At the moment, Pali
      >scholars use all kinds of different character mappings. Most are limited
      >to just the ANSI character set, which is not enough. Unicode fonts use
      >double-byte encoding to allow for more than 64,000 characters. The Pali
      >characters are in the Latin Extended-A and Latin Extended Additional
      >blocks. Windows Unicode fonts like TNR and Verdana have the Pali vowels but
      >not the consonants, which are all in Latin Extended Additional:
      >
      >http://homepage.ntlworld.com/bpesala/clipboard/LatinExtendedAdditional.png
      >
      >Not all applications support Unicode yet, but it will come in time. The
      >current mish mash of incompatible fonts makes life difficult. If they had
      >used Unicode for the CSCD Tipitaka, conversion utilities would not have
      >been necessary. Ideally, we need to persuade VRI to bring out a Unicode
      >version. Unicode supports Devanagari, Myanmar, Sinhalese, Thai, Khmer,
      >Mongolian, and Romanized Pali.
      >
      >You can find some ANSI and Unicode fonts on my website: The Titus Unicode
      >font is pretty comprehensive.
      >
      >http://www.aimwell.org/Fonts/fonts.html
      >
      >- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
      >[Homepage] http://www.tipitaka.net
      >[Send Message] pali@yahoogroups.com
      >Paaliga.na - a community for Pali students
    • Phra Noah Yuttadhammo
      Message 2 of 8 , Jun 10, 2004
        >
        > I have it, but don't use it. What might be useful is a utility that
        > converted the files to Unicode.
        >

        Dear Bhante Pesala,

        Greetings, venerable sir :) I just want to let you and others know that Mr. Snow's utility DOES convert the files to Unicode. I just downloaded the file cscdconv3.zip and successfully converted the files to UTF-8 Unicode.

        Best wishes,

        Yuttadhammo (Phra Noah)

        --------------------------------------------------------------------------------
        Chom Tong Insight Meditation Center
        Wat Phradhatu Sri Chom Tong Voravihara
        T. Ban Luang, A. Chom Tong
        Chiang Mai, Thailand, 50160
        Website: - www.sirimangalo.org
        Tel: (66 - int'l.) (0 - in Thailand) 53 342 184
        --------------------------------------------------------------------------------
      • Bhikkhu Pesala
        Message 3 of 8 , Jun 11, 2004
          Thanks for the info. The utility is very simple to use, and as you say, it
          does already convert to Unicode or other formats. I could not believe how
          fast it was. Converting the Patisambhidamagga took about 2 seconds.

          It requires the Indic Times Font, which can be downloaded from:

          http://jbe.gold.ac.uk/fonts/tcrUnicode.TTF

          It views OK in Internet Explorer, but one cannot enlarge the font. Display
          in Opera is poor, though one can then zoom in. One would need to modify
          the stylesheet I suspect, though I wouldn't know how to do this.
        • Bhikkhu Pesala
          Message 4 of 8 , Jun 12, 2004
            I figured out how to edit the style sheet. The problem lies with the Indic
            Times font. I also find 12pt rather small for body text, so I enlarged it
            to 16pt and changed the font to my own Unicode Optimist. This is the
            revised style sheet, which is suitable for Opera or Internet Explorer.
            Copy the text and save it as tipitaka4.css after backing up the original.
            It needs to be in the same directory as the converted HTML files. You
            could replace "Unicode Optimist" with the font of your choice such as
            "TITUS Cyberbit Basic."

            /* This is the stylesheet for Roman UTF-8 Unicode encoding.
            */

            body { font-family:"Unicode Optimist";
            background:white; }

            SPAN {}
            .variant {color: blue}

            p {
            border-top: 0in; border-bottom: 0in;
            padding-top: 0in; padding-bottom: 0in;
            margin-top: 0in; margin-bottom: 0.5cm;
            }
            /* Text paragraphs (classes 01, 02, 03; see the structure notes at the end) */
            .c01 { font-size: 16pt; text-indent: 2em; margin-bottom: 0em;}


            .c02 { font-size: 16pt;}


            .c03 { font-size: 16pt; text-indent: 2em;}

            /* Namo tassa, and nitthita -- no unique structural distinction */
            .c06 { font-size: 16pt; text-align:center;}

            /* Unindented text */
            .c07 { font-size: 16pt;}

            /* Book */
            .c10 { font-size: 21pt; text-align:center; font-weight: bold;}

            /* Sutta */
            .c11 { font-size: 18pt; text-align:center; font-weight: bold;}

            /* Nikaaya */
            .c12 { font-size: 24pt; text-align:center; font-weight: bold;}

            /* Section (above 14)*/
            .c13 { font-size: 16pt; text-align:center; font-weight: bold;}

            /* Section */
            .c14 { font-size: 16pt; text-align:center; font-weight: bold;}

            /* Gatha line */
            .c21 { font-size: 16pt; text-indent: 4em; margin-bottom: 0em;}

            /* Gatha line */
            .c22 { font-size: 16pt; text-indent: 7em; margin-bottom: 0.5cm;}

            /* Gatha line */
            .c26 { font-size: 16pt; text-indent: 4em; margin-bottom: 0em;}

            /* Gatha line */
            .c27 { font-size: 16pt; text-indent: 7em; margin-bottom: 0em;}


            /* DN Muula Structure
            {
              06 Namo tassa..
              12 Nikaaya
              10 Book
            }
            |
            |___11 Sutta
            |   |
            |   |___14 Section
            |   |
            |   |___ Text paras (01,02,03,07), gathas (21,22,26,27), various nitthitas (06)
            |
            |___11 Sutta
            (etc.)

            */
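
            A minimal sketch of the font substitution described above, assuming the
            named fonts are installed on the reader's machine: a font-family list
            lets the browser fall back to the next entry when one is missing. The
            ordering and the generic serif fallback are assumptions, not part of
            the posted stylesheet.

            ```css
            /* Hypothetical variant of the body rule: fonts are tried left to right,
               so the generic "serif" is used only if neither named font is found. */
            body {
              font-family: "Unicode Optimist", "TITUS Cyberbit Basic", serif;
              background: white;
            }
            ```

            Only the body rule would change; the .c01 to .c27 rules can stay as posted.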