Re: [Indo-Eurasia] Re: Supposed deep ancestry of common "ultra conserved" words

  • Piotr Gasiorowski
    Message 1 of 21, May 10, 2013
      On 2013-05-10 05:57, richardwsproat wrote:
      > Well that, plus the fact that when you have a collection of
      > reconstructed forms for each family to choose from AND you allow
      > yourself some rather liberal notion of what "sounds like" means (e.g.
      > PIE "we-" vs. Proto-Kartvelian "cwen-"), you almost guarantee you'll
      > find some cognates.


      The "cognates" are almost pure junk. For example, for 'mother',
      Starostin's database has two synonymous "reconstructions", *VjV and *VmV
      (vowels be damned); for 'thou' (their only 7/7 case), at least three,
      and maybe as many as five "matches" should be disqualified for obvious
      reasons, for 'spit' we have possibly iconic word-shapes (and the IE one
      doesn't really match the others), etc., etc.

      Piotr
    • richardwsproat
      Message 2 of 21, May 10, 2013
        --- In Indo-Eurasian_research@yahoogroups.com, "Richard Wordingham" <richard.wordingham@...> wrote:
        >

        >
        > > Of course you expect if you go back to 15K years BP, you won't find anything so convincing, so nobody can be faulted for not finding it. But that suggests to me that it probably isn't even worth trying, that there isn't much of a signal to be had, and no matter how fancy your instruments are, you aren't going to find much. You can make all sorts of pretty plots, and discuss fancy computations, but you are still shining a flashlight into thick fog.
        >
        > Try telling that to a radar man. The problem with long range comparison is that it is difficult to know if one has found cognates, or even to assign a reliability indicator to them. (I think a simple yes/no decision is unachievable.) What the Pagel paper demonstrates is multilateral comparison at work.
        >

        Yes, I understand the point here. But if at the end of the day all you have is 30 or 40 cognates then people could be forgiven for being skeptical. Of course in this case a lot of skepticism has already been voiced about the particular examples. But even if that weren't the case, and we had 30 to 40 cases that everyone could agree on, it seems to me that that would still not be enough to demonstrate any kind of real genetic relationship.

        I think the radar counterargument to the flashlight analogy does not work, because there is not going to be any radar. There is not going to be anything that can get around the fact that at those time depths too much muck is happening. Not that anyone has (AFAIK) raised it here, but the obvious analogy to radioactive decay would be a false analogy since in the latter case, there are rigid physical laws involved, and even contamination can in many cases be controlled for. Plus with physical phenomena there are often other *unrelated* physical phenomena one can use to calibrate.

        In rates of change in language there are only very rough guidelines, as a comparison as simple as Icelandic versus English shows. And what *independent* evidence is there? If there were archaeological/paleontological evidence that showed that the peoples involved lived together at one time; but how would you link up the bones and artifacts to the languages? The best example of such linkage I've seen is Anthony's "The Horse, the Wheel, and Language", and one of the realizations one gets out of that is just how complicated it is to link up archaeological evidence with specific language groups in preliterate cultures. And Anthony was not discussing time depths anything like as great as what we are dealing with here. Genetic evidence? We all know how unreliable that is when it comes to language relationships. Maybe time travel? Yeah that should do it.

        If there's any pattern here, it's that publications like Science and PNAS have become enamored of fancy computational techniques with many layers of assumptions and pretty looking plots. Granting that these kinds of things may make a publication *look* scientific, it's not the same thing as saying that we are really any closer to understanding things.

        --Richard
      • John Colarusso
        Message 3 of 21, May 10, 2013
          Nikolaev's and Starostin's work on "Caucasian" is like this too: absurdly abstract and sloppy.

          John Colarusso, Ph.D.,
          Professor
          Anthropology, and
          Linguistics and Languages
          colaruss@...
          1-905-529-7070/23902

          On 2013-05-10, at 5:01 AM, Piotr Gasiorowski wrote:

          > On 2013-05-10 05:57, richardwsproat wrote:
          > > Well that, plus the fact that when you have a collection of
          > > reconstructed forms for each family to choose from AND you allow
          > > yourself some rather liberal notion of what "sounds like" means (e.g.
          > > PIE "we-" vs. Proto-Kartvelian "cwen-"), you almost guarantee you'll
          > > find some cognates.
          >
          > The "cognates" are almost pure junk. For example, for 'mother',
          > Starostin's database has two synonymous "reconstructions", *VjV and *VmV
          > (vowels be damned); for 'thou' (their only 7/7 case), at least three,
          > and maybe as many as five "matches" should be disqualified for obvious
          > reasons, for 'spit' we have possibly iconic word-shapes (and the IE one
          > doesn't really match the others), etc., etc.
          >
          > Piotr
        • John Colarusso
          Message 4 of 21, May 10, 2013
            Some years back I came up with two simple models under conservative assumptions: one of word comparison and another of cognate loss.

            By combining the two it emerged that at a horizon of 15 kyr BPE one could no longer distinguish cognates from chance similarities. The same horizon has emerged, I believe, in more sophisticated analyses where a Poisson distribution was used. I had a simple cumulative loss of vocabulary at 5% per millennium.

            Such considerations are supported by the recent case made by Vajda for a distant link between Yeniseyan and Na-Dene, at around 15 kyr. Na-Dene (with the exception of Eskimo-Aleut) is one of the most recent immigrant groups to NA. It falls just around the boundary for the limit of the comparative method. Vajda himself, much like Goddard before him with Ritwan, used morphological idiosyncrasies to supplement his conventional cognates.

            This limitation is why we have not and probably cannot prove links between any other Old World and New World groups.

            Ironically, the nonsense in Pagel et alia is a sort of validation of what we might call "meta-work" on the comparative method. It is sort of like the singularities in general relativity that emerge as part and parcel of the field equations. The comparative method, if analyzed carefully, is a theory that predicts its own limitations.


            John Colarusso, Ph.D.,
            Professor
            Anthropology, and
            Linguistics and Languages
            colaruss@...
            1-905-529-7070/23902




            On 2013-05-10, at 4:11 AM, Richard Wordingham wrote:

            > --- In Indo-Eurasian_research@yahoogroups.com, "richardwsproat" <rws@...> wrote:
            >
            > > Well that, plus the fact that when you have a collection of reconstructed forms for each family to choose from AND you allow yourself some rather liberal notion of what "sounds like" means (e.g. PIE "we-" vs. Proto-Kartvelian "cwen-"), you almost guarantee you'll find some cognates.
            >
            > What the paper probes is whether the claimed cognates are in the right places. If a consistent process were applied to the hunt for cognates, the results would be very useful. Using very lax matching criteria might even compensate for the effects of word length - very common words are usually short, and I don't think that effect has been controlled for.
            >
            > > Of course you expect if you go back to 15K years BP, you won't find anything so convincing, so nobody can be faulted for not finding it. But that suggests to me that it probably isn't even worth trying, that there isn't much of a signal to be had, and no matter how fancy your instruments are, you aren't going to find much. You can make all sorts of pretty plots, and discuss fancy computations, but you are still shining a flashlight into thick fog.
            >
            > Try telling that to a radar man. The problem with long range comparison is that it is difficult to know if one has found cognates, or even to assign a reliability indicator to them. (I think a simple yes/no decision is unachievable.) What the Pagel paper demonstrates is multilateral comparison at work.
            >
            > > The weird thing is that in this case Atkinson might even be right. (Unlike the out-of-Africa serial founder effect paper, where he almost surely is wrong.) But how would you ever demonstrate that independently?
            >
            > If cognate recognition could be automated - and I'm not impressed by what I've seen - then one could randomise the vocabulary and compare the quality of the matches. That can leave the risk of false signals due to word length and phonosemantic effects. Word length can be dealt with by just comparing positions in the oral tract of initial consonants, but that might obscure the signal too much. Manaster Ramer has shown that this technique is good enough to pick up the relatedness of English and modern Indic languages, but the number of actual cognates rejected is distressingly high.
            >
            > Richard.
            >
          • Richard Wordingham
            Message 5 of 21, May 10, 2013
              --- In Indo-Eurasian_research@yahoogroups.com, Piotr Gasiorowski <gpiotr@...> wrote:

              > The "cognates" are almost pure junk. For example, for 'mother',
              > Starostin's database has two synonymous "reconstructions", *VjV and
              > *VmV (vowels be damned);

              *VjV is for one branch with mention of two possible cognates in other branches. The only intelligent selection is of *VmV (3 branches given). I think the odds are good that PIE *ma:ter- is related, but the database has overlooked Pokorny's PIE *amma:. *VmmV is a pretty good bet for the ancestral language(s) 14,000 years ago - but it would be a pretty good bet if all we knew was that the language was not monosyllabic, which is your real point.

              > for 'thou' (their only 7/7 case), at least three,
              > and maybe as many as five "matches" should be disqualified for
              > obvious reasons,

              The counting does look sloppy. I wonder if the 'cognates' were not given because they suspected they had miscounted some of them. Sloppy encoding of data is not unknown in the generation of palaeontological cladograms.

              > for 'spit' we have possibly iconic word-shapes (and the IE one
              > doesn't really match the others), etc., etc.

              Interestingly, this is one of the meanings which was not expected to score many cognates.

              Richard.
            • Piotr Gasiorowski
              Message 6 of 21, May 10, 2013
                On 2013-05-10 17:25, Richard Wordingham wrote:

                > *VjV is for one branch with mention of two possible cognates in other
                > branches. The only intelligent selection is of *VmV (3 branches given).
                > I think the odds are good that PIE *ma:ter- is related, but the database
                > has overlooked Pokorny's PIE *amma:. *VmmV is a pretty good bet for the
                > ancestral language(s) 14,000 years ago - but it would be a pretty good
                > bet if all we knew was that the language was not monosyllabic, which is
                > your real point.

                "AJA", "MAMA" and "ANA" (modulo vowels) for 'mother' can be found in
                numerous families on all continents. Hypothesis 1: they are nursery
                words borrowed into adult languages. Hypothesis 2: they reflect some
                real worldwide historical relationship(s). IMO only Hypothesis 1 beats
                the null hypothesis (completely random agreement).

                > The counting does look sloppy. I wonder if the 'cognates' were not given
                > because they suspected they had miscounted some of them. Sloppy encoding
                > of data is not unknown in the generation of palaeontological cladograms.

                Which of course does not justify sloppiness in preparing data matrices
                for linguistic cladograms. Especially if cognacy judgements are based
                not even on real words but on reconstructions whose quality varies from
                decent (IE, but hell, they are based on Walde-Pokorny in this case!) to
                questionable (Altaic). Palaeontologists at least derive their data from
                real bones, not from reconstructed hypothetical ancestors.

                > > for 'spit' we have possibly iconic word-shapes (and the IE one
                > > doesn't really match the others), etc., etc.
                >
                > Interestingly, this is one of the meanings which was not expected to
                > score many cognates.

                But you can expect lookalikes. For some meanings, word-shapes seem to be
                under some sort of selective pressure, favouring iconicity, and so
                convergence can be expected. I'd argue that there's also selection
                against length and phonological complexity in function words (e.g. most
                pronominal roots are uniconsonantal).

                I think the article is a complete fiasco. Even the map there is
                something that an intelligent child would have made a better job of.
                Suffice it to say that the languages of Estonia, Moldova, Macedonia, and
                of a swath of Russia in the vicinity of St. Petersburg do not belong to
                any of the major Eurasiatic families... ah, yes, and Basque is
                apparently IE. What has PNAS degenerated into if they don't check such
                stuff before publication?

                Piotr
              • Trudy Kawami
                Message 7 of 21, May 10, 2013
                  Just a question from a total outsider:
                  Where would they place the ANE isolates like Elamite, Kassite & Hurrian? Or don't these old written tongues have any cachet?

                  Trudy S. Kawami, PhD
                  Director of Research
                  Arthur M. Sackler Foundation
                  461 East 57th Street
                  New York, NY 10022
                  212-980-5400 X25
                  www.arthurmsacklerfdn.org


                • Richard Wordingham
                  Message 8 of 21, May 10, 2013
                    --- In Indo-Eurasian_research@yahoogroups.com, Piotr Gasiorowski <gpiotr@...> wrote:

                    > Even the map there is something that an intelligent child would have
                    > made a better job of. <snip> What has PNAS degenerated into if
                    > they don't check such stuff before publication?

                    Perhaps they gave up as soon as they saw Alaska (85% English monoglot, at least in the home) marked as non-Indo-European. Mind you, at least Azerbaijan is now marked as Eurasiatic (though as IE when it is predominantly Altaic); 'Archaeology & Language' has it as 'Caucasian'. Also, compared to the book, Bhutan is now marked as not Indo-European.

                    Richard.
                  • Piotr Gasiorowski
                    Message 9 of 21, May 10, 2013
                      On 2013-05-11 00:02, Richard Wordingham wrote:

                      >
                      > 'Archaeology & Language' has it as 'Caucasian'.

                      To compensate for which, they make all of the Caucasus Kartvelian. Etc.,
                      etc., etc.

                      Piotr
                    • Richard Wordingham
                      Message 10 of 21, May 10, 2013
                        --- In Indo-Eurasian_research@yahoogroups.com, John Colarusso <colaruss@...> wrote:
                        >
                        > Some years back I came up with two simple models under conservative assumptions: one of word comparison and another of cognate loss.

                        > By combining the two it emerged that at a horizon of 15 kyr BPE one could no longer distinguish cognates from chance similarities. The same horizon has emerged, I believe, in more sophisticated analyses where a Poisson distribution was used. I had a simple cumulative loss of vocabulary at 5% per millennium.

                        That gives a(n optimistic) retention rate of 0.95**(2*15) * 100% = 21% of cognate pairs, so I think I'm missing something. Are you saying that you wouldn't know whether an individual pair was a cognate pair or a chance similarity?
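
                        The arithmetic above can be checked with a short Python sketch. It assumes, as the expression 0.95**(2*15) implies, that each of the two lineages independently keeps a word with probability 0.95 per millennium; the function and variable names are hypothetical.

```python
# Sketch of the retention arithmetic, assuming a constant, independent
# loss rate in each of the two lineages being compared.
LOSS_PER_MILLENNIUM = 0.05  # figure quoted in the thread
MILLENNIA = 15              # time depth quoted in the thread


def pair_retention(loss_per_millennium, millennia):
    """Probability that BOTH languages of a pair still keep the inherited word."""
    single_lineage = (1.0 - loss_per_millennium) ** millennia
    return single_lineage ** 2


single = (1.0 - LOSS_PER_MILLENNIUM) ** MILLENNIA
both = pair_retention(LOSS_PER_MILLENNIUM, MILLENNIA)

print(f"retained in one lineage:   {single:.1%}")
print(f"retained in both lineages: {both:.1%}")
```

                        Under these assumptions the pairwise figure comes out at about 21%, matching the number quoted above.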

                        Statistical analyses can demonstrate the occurrence of non-chance correspondences, but won't say which correspondences are the non-chance ones. (Piotr has pointed out that common ancestry is not the only source of matches that are not truly random - the form of morphemes is not totally independent of meaning.) The existence of cognates can be detected more readily when there are multiple independent witnesses. Making a few minor approximations, the logarithm of the ratio of the probability of getting r matches (out of n languages) by cognate retention to the probability of getting r matches if the languages were unrelated is

                        r * log(probability of retention of individual item) minus
                        (r-1) * log(probability of pairwise false match)

                        So for probability of retaining a cognate pair = probability of pairwise false match = 0.125, the ratio of the probabilities for r = 4 comes out as 8 - a significant improvement on what can be done by pairwise comparison.
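
                        A minimal Python sketch of this worked figure follows. It reads the second term of the formula as the probability of a pairwise false match, which is the reading under which the quoted ratio of 8 comes out, and it takes the per-item retention as the square root of the pair retention (an assumption: the two members of a pair are retained independently). All names are hypothetical.

```python
import math


def log_likelihood_ratio(r, item_retention, false_match):
    """log of P(r matches | related) / P(r matches | unrelated), approximate form."""
    return r * math.log(item_retention) - (r - 1) * math.log(false_match)


PAIR_RETENTION = 0.125  # probability that a cognate pair survives
FALSE_MATCH = 0.125     # probability that two unrelated words match

# Assumption: a pair survives only if both items survive independently.
item_retention = math.sqrt(PAIR_RETENTION)

ratio_r4 = math.exp(log_likelihood_ratio(4, item_retention, FALSE_MATCH))
ratio_r2 = math.exp(log_likelihood_ratio(2, item_retention, FALSE_MATCH))

print(f"r = 4: ratio = {ratio_r4:.2f}")
print(f"r = 2: ratio = {ratio_r2:.2f}")
```

                        With these inputs the ratio for r = 4 is 8, while for r = 2 (plain pairwise comparison) it is 1, i.e. no evidence either way.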

                        I've made a considerable number of simplifications and assumptions in deriving the formula above:

                        1) I've assumed that word forms can be grouped in M equally probable buckets such that words in the same bucket will be adjudged cognate and words in different buckets will not.

                        2) Cognate words always appear in the same bucket.

                        3) The probability of getting (r+1) matches is insignificant compared to the probability of getting r matches.

                        4) The probability of getting two different sets of r matches is negligible.

                        5) I've neglected probabilities of not retaining or not matching, treating them as 1.

                        6) I've not considered the possibility of the r matches being a mixture of cognates and independent random innovations. As the retention probability drops, these combinations may help detect the presence of cognates.
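
                        As a rough check on how much simplifications 5 and 6 matter, the bucket model can be evaluated exactly for the case where all languages match. This is only a sketch under added assumptions that are not in the message: n = r = 4 languages, M = 8 buckets (so the chance of a pairwise false match is 1/8), and a language that loses the inherited word picks a replacement uniformly from the M buckets. The function names are hypothetical.

```python
import math


def p_all_match_unrelated(r, buckets):
    """All r unrelated languages fall in the same one of M equally likely buckets."""
    return buckets * (1.0 / buckets) ** r


def p_all_match_related(r, buckets, item_retention):
    """All r related languages fall in one bucket, counting mixed cases.

    Each language keeps the inherited form with probability item_retention;
    otherwise its replacement lands in a uniformly random bucket (assumption).
    """
    in_ancestral = item_retention + (1.0 - item_retention) / buckets
    in_other = (1.0 - item_retention) / buckets
    return in_ancestral ** r + (buckets - 1) * in_other ** r


R = 4
M = 8
item_retention = math.sqrt(0.125)  # pair retention of 0.125, as in the example

exact_ratio = p_all_match_related(R, M, item_retention) / p_all_match_unrelated(R, M)
approx_ratio = item_retention ** R / (1.0 / M) ** (R - 1)

print(f"approximate ratio: {approx_ratio:.2f}")
print(f"exact ratio under the added assumptions: {exact_ratio:.2f}")
```

                        Under these added assumptions the exact ratio is larger than the approximate one, which is consistent with the remark in point 6 that mixtures of retained cognates and chance innovations may help detection.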

                        > Such considerations are supported by the recent case made by Vajda for a distant link between Yeniseyan and Na-Dene, at around 15 kyr. Na-Dene (with the exception of Eskimo-Aleut) is one of the most recent immigrant groups to NA. It falls just around the boundary for the limit of the comparative method. Vajda himself, much like Goddard before him with Ritwan, used morphological idiosyncrasies to supplement his conventional cognates.

                        Comparison in this case probably cannot be assisted much by multilateral comparison - Yeniseian and Na-Dene lack early branches that can provide independent testimony of Proto-Na-Dene-Yeniseian.

                        Richard.
                      • Richard Wordingham
                        Message 11 of 21, May 10, 2013
                          --- In Indo-Eurasian_research@yahoogroups.com, Trudy Kawami <tkawami@...> wrote:
                          >
                          > Just a question from a total outsider:
                          > Where would they place the ANE isolates like Elamite, Kassite & Hurrian? Or don't these old written tongues have any cachet?

                          The hypothesis that Elamite is related to Dravidian seems to have been rejected. I've seen an analysis that concluded it may be related to Nostratic in the wider sense, i.e. including Afroasiatic.

                          There's a hypothesis that Hurrian (and Urartian and Hattic) is related to NE Caucasian. There are suggestions that Kassite is related to Hurrian. Arnaud Fournet has been pushing the idea that Hurrian is related to Indo-European, but he hasn't persuaded me.

                          Richard.
                        • Trudy Kawami
                          Message 12 of 21, May 11, 2013
                            My epigraphical colleagues are very iffy on these ids as well. I find it of interest that there is not a clear record in the material remains of the ANE that can be correlated with changes in language. We cannot id Hurrian art (or even pots); the same goes for Hattic. Even the Elamites, though we can id a good deal of their art, are hard to differentiate from Sumerian & Akkadian speakers in Mesopotamia in many cases. So many different fish swimming in the human sea!
                            Trudy Kawami