Pali by Numbers - 1

  • Andy
    Message 1 of 7 , Apr 27, 2002
      Hi!

      Some good news for beginning Pali students and Pali teachers?

      Over the last few months, I have been doing some research into word
      frequency in the Pali Canon. Brother Jim at Aukana in England produced a
      list of unique word *forms* in the Pali Canon on the CSCD and the "count" or
      frequency for each of those unique word forms (i.e. dhamma has a count,
      dhammena has a count, etc.).

      I took that list, sorted it by frequency and discovered something amazing.
      The top 1000 word *forms* (exactly as you see them in the Pali Canon)
      account for 55% of all the words you will see in the Pali Canon.
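A minimal sketch of that ranking step in Python, assuming the list is available as word form / count pairs. The counts below are invented placeholders, not figures from the CSCD list, and `rank_by_frequency` and `coverage_of_top` are hypothetical helper names.

```python
from typing import Dict, List, Tuple


def rank_by_frequency(counts: Dict[str, int]) -> List[Tuple[str, int]]:
    """Return (form, count) pairs, most frequent first; ties sorted alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def coverage_of_top(counts: Dict[str, int], n: int) -> float:
    """Fraction of all running words accounted for by the n most frequent forms."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    ranked = rank_by_frequency(counts)
    return sum(count for _, count in ranked[:n]) / total


# Placeholder counts for illustration only (NOT real CSCD figures).
toy_counts = {"ca": 50, "na": 30, "dhamma": 12, "dhammena": 5, "dhammassa": 3}

print(rank_by_frequency(toy_counts)[:2])
print(f"{coverage_of_top(toy_counts, 2):.0%}")
```

With the real list, calling `coverage_of_top(counts, 1000)` would give the share of running words covered by the top 1000 forms.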

      Oddly enough, many of these words do not appear at all in the free Pali
      courses for beginners.

      The Pali Canon on the CSCD has a total of approx. 2,700,000 words.

      The total number of unique word forms in the Pali Canon is 152,922.

      Total entries in the Paliwords dictionary: 20,119

      Here is a summary of the frequency breakdown of word *forms*:

      001 - 100 843,592 occurrences (31%!)
      101 - 200 169,092
      201 - 300 109,124
      301 - 400 83,892
      401 - 500 66,760
      501 - 600 57,025
      601 - 700 48,312
      701 - 800 42,217
      801 - 900 37,467
      901 - 1000 33,361
      Top 1000 word forms total: 1,490,842, approx. 55% of all words.
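The arithmetic behind the breakdown can be checked with a short Python sketch. It uses only the band counts and the approximate corpus size of 2,700,000 words quoted in the post; the variable names are illustrative.

```python
# Figures copied from the breakdown above (occurrences per band of 100 forms).
band_counts = [843_592, 169_092, 109_124, 83_892, 66_760,
               57_025, 48_312, 42_217, 37_467, 33_361]
TOTAL_WORDS = 2_700_000  # approximate corpus size quoted in the post

running = 0
cumulative = []
for i, count in enumerate(band_counts):
    running += count
    cumulative.append(running)
    print(f"top {(i + 1) * 100:>4} forms: {running:>9,} occurrences "
          f"({running / TOTAL_WORDS:.1%} of the corpus)")
```

The bands sum to the stated 1,490,842, and each successive band contributes less: the first 100 forms cover about 31% of the corpus, while forms 901 to 1000 add only about 1.2%.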

      When I began my Pali studies, I became a little bit frustrated. I was
      studying lots of words and grammar, but I still had a lot of trouble
      actually reading the Pali texts. After a while, it occurred to me that the
      beginning Pali courses simply did not contain many of the "most common
      words" and "most common word forms".

      Obviously, if you study a word you wish to be certain that the word is used
      very frequently in the Pali Canon. By studying the word *forms* you can
      learn:
      a) important vocabulary
      b) important vocabulary exactly as you will see the word in the Pali Canon
      and
      c) the grammar of that word form in a "meaningful, useful and
      easy-to-remember" context.

      Personally, I am firmly convinced that this word study list is the "missing
      link" for beginning Pali students. After all, it's not hard to memorize 1000
      word *forms* and that is 55% of the words *as you see them* in the Pali
      Canon.

      Keep in mind: many of these words are *also* used in compound words (and the
      base words also occur in less common declensions). The 55% figure does *not*
      include occurrences of these words inside compounds or in those less
      common declensions!

      So that you can get a look at the top 1000 word forms in the word list (and
      so that anyone can use it!), I would like to upload the list to the "Files"
      section as a MS-Works spreadsheet using the "LeedsBit PaliTranslit" font.
      The list has the word form and the count for that word form. Hopefully both
      teachers and students can use this list to "optimize" their Pali course
      work.

      I will post it as a spreadsheet so that people can easily sort it by "total
      word form occurrences" or alphabetically. The file is 12K zipped and 30K
      unzipped.

      Would you like me to upload the file? How can I do this?

      peace from

      Andy


    • Robert Didham
      Message 2 of 7 , Apr 27, 2002
        Hi Andy

        This list of word frequencies would be a real help to learners, and I would
        love to get a copy of it. It also demonstrates that Pali is a pretty
        typical language.

        However, knowing the word is only a start: what it means and how it is
        used are important information too. What could also be helpful,
        beyond a simple word list (and you may have already done this anyway) is to
        have detailed information on the *form* (case, gender, etc) as well as the
        root(s) so it can be easily traced in dictionaries for meanings etc. One of
        the difficulties for beginners is finding the correct entry in dictionaries.
        This may not help with selecting the appropriate meaning, or indeed
        working out the actual meaning from the suggestions in the dictionaries, for
        the passage in question, but it would be a quick and useful tool. How many
        of us have been saved grief by a quick look in Whitney's Verb Forms, for
        example, in Sanskrit?
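One way such an annotated list might be structured, as a rough Python sketch. The `FormEntry` fields, the `FORM_INDEX` table and the `lookup` helper are all hypothetical, and the two entries are illustrative examples in Velthuis-style ASCII rather than part of any existing list.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FormEntry:
    headword: str   # dictionary form to look up
    gender: str     # grammatical gender of the headword
    case: str
    number: str


# Illustrative entries only; a real table would need to be compiled and
# checked by someone who knows the grammar.
FORM_INDEX: Dict[str, FormEntry] = {
    "dhammo": FormEntry("dhamma", "masculine", "nominative", "singular"),
    "dhammena": FormEntry("dhamma", "masculine", "instrumental", "singular"),
}


def lookup(form: str) -> Optional[FormEntry]:
    """Return the analysis for an inflected form, or None if it is not listed."""
    return FORM_INDEX.get(form.strip().lower())


entry = lookup("dhammena")
if entry is not None:
    print(f"dhammena -> look up '{entry.headword}' ({entry.case} {entry.number})")
```

Many inflected forms are ambiguous between several cases, so a fuller version would probably map each form to a list of possible analyses rather than a single entry.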

        On the subject of useful references - you will also be aware of the work of
        Yamazaki and Ousaka on Pada indices. These help heaps in finding that
        elusive passage.

        Robert Didham



      • bodhi2500
        Message 3 of 7 , Apr 27, 2002
          I have seen "upaadaanakkhandha" translated in many ways, e.g. aggregates
          affected by clinging, aggregates that are a condition for clinging,
          the clinging aggregates, aggregates subject to clinging, etc.


          "Aggregates that are a condition for clinging" seems to me to best
          convey the meaning, though it is a rather bulky translation.

          What do you think?

          Mettena
        • Piya Tan
          Message 4 of 7 , Apr 28, 2002
            Andy,

            This is good news. The most important reason we all learn Pali, I'm sure, is to
            understand the Pali Canon directly. I'm studying and teaching Pali at the same time,
            so your ideas are very welcome.

            Do send me your "word counts".

            Sukhi.

            P.

          • ypong001
            Message 5 of 7 , Apr 29, 2002
              Dear Andy, Piya and friends,

              I agree with the importance of knowing the most common words in the
              Pali canon. That will make the Tipitaka "readable" and comprehensible,
              without having to keep referring to a dictionary.

              The Files section is open to all members for uploading. However, you
              *need* a Yahoo! account to do that, in addition to being a member of this
              group.

              Please send me a copy, Andy. Thank you.

              metta,
              Yong Peng.

            • Дмитрий Ивахненко (Dimitry Ivakhnenko)
              Message 6 of 7 , Apr 30, 2002
                b> I have seen "upaadaanakkhandha" translated in many ways, e.g. aggregates
                b> affected by clinging, aggregates that are a condition for clinging,
                b> the clinging aggregates, aggregates subject to clinging, etc.

                b> "Aggregates that are a condition for clinging" seems to me to best
                b> convey the meaning, though it is a rather bulky translation.

                I would say "aggregates of clinging". This would convey several
                meanings:
                - aggregates being material support or fuel;
                - aggregates connected with clinging (as cause or object).

                There's nobody who clings, so aggregates are both cause and object.
                When clinging ceases, aggregates cease.

                The condition (paccayo) for 'upaadaana' is 'ta.nhaa'.
                Rather, 'upaadaana' is a condition for 'khandhaa'.

                Mettena,
                Dimitry
              • bodhi2500
                Message 7 of 7 , Apr 30, 2002
                  Hi Dimitry
                  Thank you for the reply, Dimitry. "Aggregates of clinging"
                  sounds great.
                  Anumodanaa.

