Re: [Pali] Software: Basic Pali text processing subroutines?

  • Nina van Gorkom
    Message 1 of 18 , Jun 14, 2009
      Dear Jon F and friends,
      Op 14-jun-2009, om 13:57 heeft Jon Fernquest het volgende geschreven:

      > 1. Converting a string of Pali letters from Unicode to Velthuis, and
      > vice versa.
      -------
      N: I have Mac OS X and use PCharter for my Pali, but I noticed
      that the diacritical marks did not work when I sent a Word
      attachment to another computer. The marks did not come out.
      I do not easily understand technical instructions. I guess it must be
      PCharter, but it is hard for me to change it. I use it in all my writings.
      Nina.



    • Ong Yong Peng
      Message 2 of 18 , Jun 14, 2009
        Dear Jon and Nina,

        1. there are a few scripts around which you can refer to:
        (a) JS - http://www.library.websangha.org/earlybuddhism/convertpad.htm
        (b) PHP - http://www.tipitaka.net/forge/index.php?article=velthuis2unicode

        2&3. Programming languages such as Perl and Ruby have strong parsing and text-handling capabilities, which may lead one to think this is a simple task. From my understanding of Pali inflection, most verbs and nouns do follow simple rules to express usage (case, tense, number, person), but there is a long list of exceptions. Further, identifying the grammatical gender of nouns and the conjugational group of verbs is not an easy task. Sandhi is even more challenging, given that any two arbitrary words can be joined. I find this an interesting idea; please do let us know when you make progress.

        metta,
        Yong Peng.


        --- In Pali@yahoogroups.com, Jon Fernquest wrote:

        Just want to inquire whether anyone has written subroutines for doing basic processing with Pali language. Including perhaps:

        1. Converting a string of Pali letters from Unicode to Velthuis, and vice versa.
        2. Joining two Pali letter strings according to Sandhi rules.
        3. Taking a root form of a noun or verb and inflecting it, adding endings to it, etc etc...
      • Nina van Gorkom
        Message 3 of 18 , Jun 14, 2009
          Dear Yong Peng,
          thank you for your help. I sent it on to a friend and will discuss it
          with him, perhaps it will dawn on me.
          Nina.
          Op 14-jun-2009, om 16:04 heeft Ong Yong Peng het volgende geschreven:

          > 1. there are a few scripts around which you can refer to:
          > (a) JS - http://www.library.websangha.org/earlybuddhism/convertpad.htm
          > (b) PHP - http://www.tipitaka.net/forge/index.php?
          > article=velthuis2unicode



        • Ong Yong Peng
          Message 4 of 18 , Jun 15, 2009
            Dear Nina,

            allow me to offer an opinion. I believe your problem has to do with fonts. You have to ensure that both computers have the same fonts installed. Nowadays it is best to use Unicode fonts. Hope that helps.

            metta,
            Yong Peng.


            --- In Pali@yahoogroups.com, Nina van Gorkom wrote:

            I have a Mac OS X, and use for my Pali: PCharter, but I noticed that the diacritical signs did not work when sending a Word attachment to another computer. The signs did not come out. I do not easily understand technical instructions. I guess it must be PCharter but hard for me to change it. I use it in all my writings.
          • Nina van Gorkom
            Message 5 of 18 , Jun 15, 2009
              Dear Yong Peng,
              It is difficult to know which fonts are on the other side, in Thailand
              or in Malaysia. But Mike, to whom I sent it, will experiment and offer
              me more explanations.
              Many thanks,
              Nina.
              Op 15-jun-2009, om 13:49 heeft Ong Yong Peng het volgende geschreven:

              > I believe your problem has to do with the fonts. You have to ensure
              > that the two computers have the same fonts installed. The best is
              > to use Unicode fonts nowadays. Hope that helps.



            • kamleong_lai
              Message 6 of 18 , Jun 16, 2009
                To add on for Q#1, there is another PHP tool at http://www.tipitaka.net/forge/index.php?article=velthuis2unicode

                Those are tools for converting text from one transliteration scheme to another. This is rather easy to implement with a mapping table and some simple rules for the conversion. (In fact, I have also developed one in JavaScript for transliteration between Devanagari and romanized characters.)
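As a rough illustration of the mapping-table approach, here is a minimal Python sketch. It is not taken from any of the scripts linked in this thread; the function names are hypothetical, the table covers only lowercase letters, and it assumes the common Velthuis digraphs (`aa`, `.t`, `~n`, `"n` and so on).

```python
import re
import unicodedata

# Velthuis digraph -> Unicode letter. Lowercase only; capitals are not handled.
VELTHUIS_TO_UNICODE = {
    "aa": "ā", "ii": "ī", "uu": "ū",
    ".t": "ṭ", ".d": "ḍ", ".n": "ṇ", ".l": "ḷ", ".m": "ṃ",
    '"n': "ṅ", "~n": "ñ",
}
# Reverse table, with the Unicode letters normalized to precomposed (NFC) form.
UNICODE_TO_VELTHUIS = {
    unicodedata.normalize("NFC", u): v for v, u in VELTHUIS_TO_UNICODE.items()
}

# One regular expression per direction, so the text is scanned left to right once.
_VELTHUIS_RE = re.compile("|".join(re.escape(k) for k in VELTHUIS_TO_UNICODE))
_UNICODE_RE = re.compile("|".join(re.escape(k) for k in UNICODE_TO_VELTHUIS))


def velthuis_to_unicode(text):
    """Replace every Velthuis digraph in the table with its Unicode letter."""
    return _VELTHUIS_RE.sub(lambda m: VELTHUIS_TO_UNICODE[m.group(0)], text)


def unicode_to_velthuis(text):
    """Replace every Unicode letter in the table with its Velthuis digraph."""
    text = unicodedata.normalize("NFC", text)
    return _UNICODE_RE.sub(lambda m: UNICODE_TO_VELTHUIS[m.group(0)], text)
```

For example, `velthuis_to_unicode("pa.ticcasamuppaada")` gives `paṭiccasamuppāda`. Aspirates such as `.th` need no table entry of their own, because `.t` is converted and the following `h` is left alone.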

                However, for Q#2&3, as mentioned by Yong Peng, it indeed involves a more sophisticated algorithm, especially as there may be a number of exceptions. Nonetheless, it will definitely be a very useful tool, which I am looking forward to.

                best regards.

              • Patrick Hall
                Message 7 of 18 , Jun 17, 2009
                  Hello folks,

                  On Tue, Jun 16, 2009 at 9:54 AM, kamleong_lai<kamleong_lai@...> wrote:

                  > Those are tools for converting text from one transliteration scheme to
                  > another. It is rather easy to implement with a mapping table and some simple
                  > rules for the conversion. (In fact, I have also developed one for
                  > transliteration between devanagari to the romanized characters using
                  > javascript)

                  And here's one in Python that I wrote:

                  http://github.com/amundo/palihack/blob/eef9871f87fd336cbfa63044c6b6184f1c5f08cd/velthuis.py

                  No claims to efficiency, but it seems to work.

                  Metta,
                  -Pat
                • Jon Fernquest
                  Message 8 of 18 , Jun 18, 2009
                    Yong Peng wrote: "Programming languages such as Perl and Ruby have strong parsing and text-handling capabilities, which may lead one to think this is a simple task. From my understanding of Pali inflection, most verbs and nouns do follow simple rules to express usage (case, tense, number, person), but there is a long list of exceptions. Further, identifying the grammatical gender of nouns and the conjugational group of verbs is not an easy task. Sandhi is even more challenging, given that any two arbitrary words can be joined. I find this an interesting idea; please do let us know when you make progress."

                    Thank you for all the info everyone gave and sorry for the delay.

                    1. At first I am aiming to generate language rather than parse it; generation is a lot easier than parsing.

                    2. I am aiming to store words and pieces of words in Ruby objects rather than as raw strings. For each type of object (phoneme, morph, word) there is a set of methods given by the rules of phonology and grammar. Aspects such as sandhi that require complicated rules and exceptions should be table-driven, that is, driven from a table of rules.

                    3. The first goal is an online dictionary with a lot more coverage than Buddhadatta's, with a scrolling display that shows a word next to its alphabetical neighbors:

                    http://www.dicts.info/dictionary.php?k1=1&k2=442

                    4. How to extend existing dictionaries: starting with the entries of Buddhadatta's Pali-English dictionary, generate possible forms (decline nouns, conjugate verbs, etc.) and then find and check off the generated words in the "list of all Pali words found in the Tipitaka" file.

                    5. I hope to use the programming as a way of learning about aspects of Pali that I have been too afraid to look under the hood and investigate because of their complicated rules (I have ignored internal sandhi, phonology, and the derivation of basic forms from roots). Stepping through the rules with a computer program gives one an opportunity to grasp the complex nature of the rules and how they generate language.

                    6. Roderick Bucknell's Sanskrit manual reduces the generation of Sanskrit to operations on tables. It seems like a good approach to emulate, but it is a little short on sandhi (is there an exhaustive list or study of Pali sandhi somewhere?).

                    7. Search simplification: when searching the Tipitaka corpus, give the root of a word, generate its forms (e.g. decline the noun), then search with all forms.
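Items 4 and 7 above could be prototyped roughly as follows. This is only a sketch in Python (the plan in this message is Ruby), under stated assumptions: it handles just masculine nouns in -a, it lists one common ending per case and number (alternates such as -ebhi, -amhā and -amhi are left out), and the function names and the tiny word set are made up for illustration.

```python
# Endings that replace the final -a of a masculine a-stem such as "buddha".
# One common ending per slot; alternates are deliberately omitted.
MASC_A_ENDINGS = [
    "o", "ā",        # nominative singular, plural
    "aṃ", "e",       # accusative
    "ena", "ehi",    # instrumental
    "āya", "assa",   # dative and genitive singular
    "ānaṃ",          # dative and genitive plural
    "asmā",          # ablative singular
    "asmiṃ", "esu",  # locative
]


def generate_forms(stem):
    """Return the set of candidate forms for a dictionary stem ending in -a."""
    if not stem.endswith("a"):
        raise ValueError("this sketch only handles stems ending in -a")
    base = stem[:-1]
    forms = {base + ending for ending in MASC_A_ENDINGS}
    forms.add(stem)  # the bare stem doubles as the vocative singular
    return forms


def attested_forms(stem, corpus_words):
    """Item 4: keep only generated forms that occur in a set of corpus words."""
    return sorted(generate_forms(stem) & set(corpus_words))


def search_corpus(stem, text):
    """Item 7: find every word in the text that is some form of the stem."""
    forms = generate_forms(stem)
    return [word for word in text.split() if word in forms]
```

With the hypothetical word set `{"buddho", "buddhaṃ", "dhammo"}`, `attested_forms("buddha", ...)` keeps `buddho` and `buddhaṃ` and drops everything else. A real run would load the word set from the Tipitaka word-list file instead.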

                    Thanks,

                    Jon Fernquest
                  • Ong Yong Peng
                    Message 9 of 18 , Jun 19, 2009
                      Dear Jon and Pat,

                      thanks, Jon, for sharing your programming thoughts. Given your background in both linguistics and programming, I am sure you can come up with a good solution.

                      I have a copy of Rod Bucknell's book in Singapore, which I have not really gone through. Besides being a book collector, I bought the book, which is on Sanskrit, not Pali, also because I know Rod personally. ;-) A limited preview of this book is available at Google Books: http://books.google.com


                      metta,
                      Yong Peng.
                    • Jon Fernquest
                      Message 10 of 18 , Jun 19, 2009
                        Dear Yong Peng and Pat,

                        Today I am starting Pali programming in Ruby with baby steps, using the basic grammar and vocabulary of Buddhadatta's New Pali Course, volume I, and Narada to create a verb conjugator and noun decliner.

                        Gerard Huet has something like this: "[an] interface [that] gives the declension tables for Sanskrit substantives. Try out this declension engine by submitting Sanskrit stems with intended gender."
                        http://sanskrit.inria.fr/
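A first baby step toward a verb conjugator of this kind might look like the sketch below. It is written in Python rather than Ruby, it covers only the present tense, active voice, and the class and method names are hypothetical. The one rule it encodes is that a stem-final -a is lengthened before the first-person endings (pacāmi, pacāma).

```python
from dataclasses import dataclass

# Present-tense active endings, keyed by (person, number).
PRESENT_ENDINGS = {
    ("third", "singular"): "ti",
    ("third", "plural"): "nti",
    ("second", "singular"): "si",
    ("second", "plural"): "tha",
    ("first", "singular"): "mi",
    ("first", "plural"): "ma",
}


@dataclass(frozen=True)
class Verb:
    """A verb stored as an object rather than a raw string."""

    present_stem: str  # e.g. "paca" for pacati, "core" for coreti

    def present(self, person, number):
        """Return one present-tense form, e.g. present("third", "singular")."""
        stem = self.present_stem
        # Stem-final -a is lengthened before the first-person endings.
        if person == "first" and stem.endswith("a"):
            stem = stem[:-1] + "ā"
        return stem + PRESENT_ENDINGS[(person, number)]

    def paradigm(self):
        """Return all six present-tense forms keyed by (person, number)."""
        return {key: self.present(*key) for key in PRESENT_ENDINGS}
```

`Verb("paca").paradigm()` yields pacati, pacanti, pacasi, pacatha, pacāmi and pacāma. Other tenses and irregular verbs would need further tables and are not attempted here.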

                        Thanks, Pat, for the velthuis.py script.

                        With metta,
                        Jon Fernquest
                      • Ong Yong Peng
                        Message 11 of 18 , Jun 22, 2009
                          Dear Jon,

                          thank you for the link. I tried it, and it's interesting. A tool like this for Pali should be fun too. I just recall that at the back of Pali Workbook (published by VRI), there is a list of "suffixes" for declension, conjugation, etc. It may be worth a look.

                          http://www.pariyatti.org/Bookstore/productdetails.cfm?PC=769

                          I note that many websites have been producing interesting materials for Buddhist studies, but hardly any truly leverage the power of the available technology. It is my aim to develop tipitaka.net into a site which facilitates interactive learning and collaborative study of the Pali language and the Tipitaka, via innovative web design.

                          My target audience will, for a while, be self-study beginners, like myself 10 years ago. For the more advanced students, I believe the best mode of learning is a high degree of discussion and interaction, like what we are doing on this list, although a few advanced learning tools would be useful.

                          I will update the group when I make some progress on this. It is long overdue to provide enhanced features to existing pages on tipitaka.net. However, I wouldn't want to type pages of my ideas and end up not accomplishing any. ;-)

                          Btw, I was once thinking about automatically sorting Pali words by their own collation sequence: a aa i ii u uu e o k kh g gh `n c ch j jh ~n .t .th .d .dh .n t th d dh n p ph b bh m y r l v s h .l .m

                          I have been sitting on it for a while, knowing that things are improving in many areas, and the effort to develop such a subroutine in the future will be easier than now. Even with Unicode, the program will have to be able to handle the aspirates kh, gh, ch, jh, .th, .dh, th, dh, bh and ph as single entities not two separate characters. I do have a rough solution in mind, but haven't got the time to implement it in PHP. Let me know if you have any idea.
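One possible shape for such a sorting subroutine, sketched in Python rather than PHP: split each word into Pali letters, treating the aspirates as single letters, then compare the letters by their position in the collation sequence. The sketch works on Unicode text, assumes the traditional labial order p ph b bh m, places ṃ last, and assumes that a consonant followed by `h` is always an aspirate. The names `tokenize` and `sort_key` are hypothetical.

```python
import unicodedata

# Pali collation sequence with aspirates as single letters (ṃ placed last).
PALI_ORDER = (
    "a ā i ī u ū e o "
    "k kh g gh ṅ c ch j jh ñ ṭ ṭh ḍ ḍh ṇ t th d dh n p ph b bh m "
    "y r l v s h ḷ ṃ"
).split()
RANK = {unicodedata.normalize("NFC", letter): i for i, letter in enumerate(PALI_ORDER)}


def tokenize(word):
    """Split a word into Pali letters, preferring two-character aspirates."""
    word = unicodedata.normalize("NFC", word.lower())
    letters = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in RANK:
            letters.append(pair)
            i += 2
        else:
            letters.append(word[i])
            i += 1
    return letters


def sort_key(word):
    """Map a word to a list of ranks; unknown characters sort after all letters."""
    return [RANK.get(letter, len(PALI_ORDER)) for letter in tokenize(word)]
```

Usage is `sorted(words, key=sort_key)`. For instance, `kuṭi` sorts before `khanti`, because `kh` is a letter of its own that follows `k`, whereas a plain character-by-character sort would put `khanti` first.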

                          metta,
                          Yong Peng.


                          --- In Pali@yahoogroups.com, Jon Fernquest wrote:

                          Today I am starting Pali programming in Ruby with baby steps, using the basic grammar and vocabulary of Buddhadatta's New Pali Course, volume I, and Narada to create a verb conjugator and noun decliner.

                          Gerard Huet has something like this: "[an] interface [that] gives the declension tables for Sanskrit substantives. Try out this declension engine by submitting Sanskrit stems with intended gender."
                          http://sanskrit.inria.fr/
                        • kamleong_lai
                          Message 12 of 18 , Jun 24, 2009
                            Perhaps we can also take advantage of the "Digital Pali Reader" project at http://sourceforge.net/projects/digitalpali/
                          • Piya Tan
                            Message 13 of 18 , Jun 25, 2009
                              Thanks Kam Leong,

                              for this wonderful reminder regarding Yuttadhammo's DPR.

                              It would be really great if it could include the CPD which is also digital
                              now.

                              I'm happy he has also a Syamrattha Tipitaka version, something I need.

                              With metta,

                              Piya

                              On Thu, Jun 25, 2009 at 11:33 AM, kamleong_lai <kamleong_lai@...>wrote:

                              >
                              >
                              > Perhaps we can also take some advantages from the "Digital Pali Reader"
                              > project at http://sourceforge.net/projects/digitalpali/
                              >
                              >
                              >



                              --
                              The Minding Centre
                              Blk 644 Bukit Batok Central #01-68 (2nd flr)
                              Singapore 650644
                              Tel: 8211 0879
                              Meditation courses & therapy: http://themindingcentre.googlepages.com
                              Website: dharmafarer.googlepages.com


                            • Ong Yong Peng
                              Message 14 of 18 , Jun 26, 2009
                                Dear Kam Leong and Piya,

                                thank you. Yes, the CPD is now online, it is still a work in progress, with plans to complete Volume 3 by reaching the end of letter K.

                                http://pali.hum.ku.dk/cpd/intro/notice_about_development_vol3.html

                                The CPD is no doubt a great tribute from the international Pali scholarship.

                                Kam Leong, I have heard good feedback about DPR, but I have not explored the software myself. Currently, for our sutta translation exercises, I am still using CSCD3 (CD-ROM) [recently, I am moving towards ] and the online PTS PED. When I am in Singapore, I have a collection of resources in print too. However, I believe I should "catch up" with technology too, and try out DPR and CST4.

                                http://www.tipitaka.org/cst4

                                Thanks should go to Ven. Yuttadhammo for DPR, and Frank Snow for CST4.

                                For those who are new, the reference collection in DPR includes

                                PED: Pali English Dictionary
                                DPPN: Dictionary of Pali Proper Names

                                Both from the Pali Text Society (PTS), and

                                CPED: Concise Pali English Dictionary by Ven. A P Buddhadatta.

                                It would be great to learn about progress in other works too, such as, the DOP (Dictionary of Pali) by Margaret Cone. Volume 1 of DOP ends in the letter Kh, and there is no news about any upcoming volumes on PTS website. The other would be Bhikkhu Bodhi's new translation of Anguttara Nikaya.

                                For my recent comments on tipitaka.net, I am only building tools mainly for the site, on the browser platform. These tools fall into three main categories:

                                1. general applications: e.g. Velthuis-Unicode converter
                                2. learning enhancement: simple XML-based tools, which I am currently working on
                                3. data processing: e.g. Pali Scope!

                                The time I spend on these is very irregular, and I do not just focus on one particular category. At this time, I am dedicating more effort towards improving the learning experience for new Pali students, an area which I am more familiar with since I have only recently been through it. Again, I am not going to write pages on this, but when there is some progress, I will always seek feedback from the group. I am taking a break from the more "heavy" projects, like Pali Scope!, Sutta Spectra! and also an "in-the-pipeline" Pali Scribe project. These are heavily collaborative projects, which I have to delay to focus on the easier tasks first.

                                Also, the web platform is my preference at the moment. I would only consider moving to additional platforms when I have more time, and also when I am more established, both personally (in my personal life, that is) and financially.

                                Btw, Kam Leong, do you do much programming? How about a simple intro of yourself?

                                metta,
                                Yong Peng.


                                --- In Pali@yahoogroups.com, Piya Tan wrote:

                                It would be really great if it could include the CPD which is also digital now.

                                > Perhaps we can also take some advantages from the "Digital Pali Reader" project at http://sourceforge.net/projects/digitalpali/
                              • kamleong_lai
                                Message 15 of 18 , Jun 28, 2009
                                  Hi Yong Peng, All,

                                  Basically, I have been working in IT for 12 years now since I graduated. I do programming at work, mostly developing web-based solutions (Microsoft IIS, ASP/VBScript, HTML/JavaScript, Oracle ERP/PLSQL) for internal company use. Besides that, I do scripting in Perl and Unix/Linux shell, and some VB and C/C++ programming as well.

                                  I have not taken refuge in the Triple Gem, and I do not actually live as a "proper" Buddhist. Anyway, I am interested in academic/historical and comparative studies of cultures and religions, especially Buddhism. As a typical "Chinese educated" person in Malaysia, I can read, write and speak English, Chinese and Malay. I can also understand Classical Chinese to some extent.

                                  The above is my simple intro. I am here to learn more about the Pali language and the Buddhist canon. I am glad if I can offer some useful information or help to others in this group.

                                  regards.

                                  --- In Pali@yahoogroups.com, "Ong Yong Peng" <palismith@...> wrote:
                                  > Btw, Kam Leong, do you do much programming? How about a simple intro of yourself?
                                  >
                                  > metta,
                                  > Yong Peng.
                                  >
                                • Jon Fernquest
                                  Message 16 of 18 , Jun 28, 2009
                                    Hi kamleong lai;

                                    Thanks for the intro. (My intro below)

                                    If you could comment on the general design of classes and the way things are done in some Pali grammar programs that would be great.

                                    I am using Ruby on Rails mainly because the MVC design pattern it uses makes web apps simple. I've messed around with clunky PHP systems before and want to keep things simple this time. You've used Perl, which as a scripting language is pretty similar to Ruby.

                                    One web app I hope to write in Ruby on Rails is an automated grammar quiz system. For example, display four noun phrases, let's say a number plus a noun; since the number has to agree with the noun, one choice is right and the others don't agree. If Pali grammatical information is embedded in a program, this is the sort of thing one can do. One could also generate many simple interlinear translations like the ones at this site, the most useful Pali learning resource that I know of.
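The quiz idea could be tried out on a very small scale like this, here in plain Python rather than Rails. The sketch is an assumption-laden toy: it knows only the numeral "three" in its three nominative gender forms (tayo, tisso, tīṇi) and three plural nouns, so each question offers three choices rather than four, and all names are hypothetical.

```python
import random

# Nominative forms of "three" by gender: masculine, feminine, neuter.
THREE = {"m": "tayo", "f": "tisso", "n": "tīṇi"}

# (nominative plural, gender, English gloss)
NOUNS = [
    ("purisā", "m", "men"),
    ("itthiyo", "f", "women"),
    ("phalāni", "n", "fruits"),
]


def make_question(rng):
    """Build one question: pick the phrase where numeral and noun agree."""
    noun, gender, gloss = rng.choice(NOUNS)
    # Pair every gender form of the numeral with the same noun.
    choices = [f"{form} {noun}" for form in THREE.values()]
    rng.shuffle(choices)
    return {
        "prompt": f"three {gloss}",
        "choices": choices,
        "answer": f"{THREE[gender]} {noun}",
    }


def check(question, picked):
    """Return True if the picked phrase is the agreeing one."""
    return picked == question["answer"]
```

Calling `make_question(random.Random())` returns a prompt such as "three men" with the shuffled choices, exactly one of which shows gender agreement. Passing in the random generator keeps the function easy to test with a fixed seed.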

                                    Also, in the process of embedding Pali grammar in a program, one must attend to all the details one skips when studying the language from the top down. For example, sandhi seems to play a role in almost all of the grammar; it seems that even basic noun declension and its exceptions can be explained by it.
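To make the table-driven idea for sandhi concrete, here is a minimal Python sketch. The handful of rules is illustrative only and far from a description of Pali sandhi, which is largely optional and context dependent: niggahīta becomes m before a vowel (evaṃ + eva, evameva), a + e gives e (ca + eva, ceva), and a + a gives ā (tatra + ayaṃ, tatrāyaṃ). Whole-word exceptions are looked up first. All names are hypothetical.

```python
VOWELS = "aāiīuūeo"

# Whole-pair exceptions, consulted before the general rules.
EXCEPTIONS = {
    ("na", "atthi"): "natthi",
}

# General rules: (end of first word, start of second word, joined junction).
# The first matching rule wins, so more specific rules should come first.
RULES = [("ṃ", vowel, "m" + vowel) for vowel in VOWELS]
RULES += [
    ("a", "e", "e"),
    ("a", "a", "ā"),
]


def join(first, second):
    """Join two words using the exception table, then the rule table."""
    if (first, second) in EXCEPTIONS:
        return EXCEPTIONS[(first, second)]
    for final, initial, joined in RULES:
        if first.endswith(final) and second.startswith(initial):
            return first[:-len(final)] + joined + second[len(initial):]
    # No rule applies: leave the words separate.
    return first + " " + second
```

`join("tatra", "ayaṃ")` gives `tatrāyaṃ` through the general rule, while `join("na", "atthi")` gives `natthi` through the exception table. Keeping both rules and exceptions as data means new cases can be added without touching the `join` function.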

                                    I worked in IT for about 8 years after university, but on clunky, messy, high-priced business applications like accounting, payroll, and inventory systems written in, ugh, Cobol and some C. I also have knowledge of nicer programming languages like Lisp from grad school. I discovered scripting languages many years later, and in my work for a newspaper in Thailand I have written PHP customizations for the Movable Type web publishing system.

                                    I have taken refuge in the Three Jewels but I like to drink beer with my friends, a violation of the five precepts.

                                    With metta,
                                    Jon
                                  • Ong Yong Peng
                                    Message 17 of 18 , Jul 2, 2009
                                      Dear Kam Leong and Jon,

                                      thanks for your introductions. It's simply wonderful to have many computer programmers on a Buddhist group!

                                      Kam Leong, while I do expect an IT professional to know several languages and platforms, your list is still impressive. ;-)

                                      Thanks for your offer to help. Please do provide advice where and when you can to queries from members in regards to general questions on Pali computer applications.

                                      I foresee tipitaka.net rolling out programming projects in the future, mainly to bring the website to "the next level". I will make announcements to the group when we are ready. Detailed discussion of tipitaka.net programming projects will unfortunately have to take place on another list, so as to keep this list for its main objectives.

                                      metta,
                                      Yong Peng.


                                      --- In Pali@yahoogroups.com, kamleong_lai wrote:

                                      Basically, I have been working in IT for 12 years now since I graduated. I do programming at work, mostly developing web-based solutions (Microsoft IIS, ASP/VBScript, HTML/JavaScript, Oracle ERP/PLSQL) for internal company use. Besides that, I do scripting in Perl and Unix/Linux shell, and some VB and C/C++ programming as well.