Re: Testing the 3ST

  • Dave Gentile
    Message 1 of 24 , Dec 15, 2007
      --- In Synoptic@yahoogroups.com, "Dave Gentile" <gentile_dave@...>
      wrote:
      >
      > O.K. - a back-of-the-envelope calculation (or really some quick
      > cutting and pasting with a spreadsheet) -
      >

      A correction to the quick calculation: I had the spreadsheet set for
      a 90th percentile confidence range, not the 95th. I also needed to double
      the number I gave, for another reason. As a result, there is more like
      a 10% chance that these numbers are just random chance (not 2.5% as
      previously stated). Apologies for the error.

      So the result seems significant at the 90th percentile, but just
      barely. However, this (combined with Ron's other observations) still
      suggests to me that sQ and xQ, by and large, are the result of two
      different processes.

      Dave Gentile
      Riverside, IL
    • Ron Price
      Message 2 of 24 , Dec 16, 2007
        Dave Gentile wrote:

        > O.K. - a back-of-the-envelope calculation

        Dave,

        Thanks for your efforts, but you may need to find another envelope - should
        be plenty around at this time of year :-)

        > (or really some quick cutting and pasting with a spreadsheet) -

        Or another spreadsheet.

        > xQ:
        >
        > 18 blocks
        > 1770 words
        > average length 98 words
        > 1602 possible 10 word agreements
        > .......
        > sQ:
        > 57 blocks
        > 2381 words
        > average length 42 words
        > 1881 possible 10 word agreements
        > 12 actual agreements

        Firstly, what I found was the set of strings common to Matthew and Luke
        having *more than* ten contiguous words, i.e. 11+.
        Thus 1602 should be replaced by 1584 and 1881 by 1824.

        Secondly you appear to be comparing apples and pears in the agreements. The
        numbers 1584 and 1824 represent counts of the number of possible 11-word
        strings (some of which will be overlapping). What I had counted were the
        numbers and lengths of all the strings having more than ten words (none of
        which overlap with each other by definition). The total numbers of words in
        the xQ and sQ strings were 364 and 205 respectively. Therefore my actual
        numbers of 11-word strings (some of which will overlap) are 364 - 10*23 =
        134 and 205 - 10*12 = 85 respectively. So in xQ there are 134 contiguous
        11-word strings out of a possible 1584, and in sQ there are 85 contiguous
        11-word strings out of a possible 1824. (All this neglects the fact that the
        blocks have different lengths, but I agree that the approximation that they
        have equal lengths is unlikely to make much difference to the results.)
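
        The counting arithmetic above can be checked with a minimal sketch. The
        function names are illustrative, not part of anyone's actual procedure,
        and every block is assumed to have the (rounded) average length, as in
        the quoted figures.

```python
# Sketch of the counting arithmetic discussed above.
# Function names are hypothetical; block lengths are approximated by the average.

def possible_strings(blocks, avg_block_len, n):
    """Count n-word windows, treating every block as having the average length."""
    return blocks * (avg_block_len - (n - 1))

def overlapping_strings(total_words, num_strings, n):
    """Count (possibly overlapping) n-word windows inside non-overlapping
    agreement strings, each of which is at least n words long."""
    return total_words - (n - 1) * num_strings

N = 11  # "more than ten contiguous words"

xq_possible = possible_strings(18, 98, N)    # 18 blocks, average 98 words
sq_possible = possible_strings(57, 42, N)    # 57 blocks, average 42 words
xq_actual = overlapping_strings(364, 23, N)  # 23 strings totalling 364 words
sq_actual = overlapping_strings(205, 12, N)  # 12 strings totalling 205 words

print("xQ:", xq_actual, "of", xq_possible)
print("sQ:", sq_actual, "of", sq_possible)
```

        Calling possible_strings with n=10 instead gives the earlier 10-word
        figures of 1602 and 1881.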

        Ron Price

        Derbyshire, UK

        Web site: http://homepage.virgin.net/ron.price/index.htm
      • Dave Gentile
        Message 3 of 24 , Dec 17, 2007
          >
          > > xQ:
          > >
          > > 18 blocks
          > > 1770 words
          > > average length 98 words
          > > 1602 possible 10 word agreements
          23 actual agreements

          > > .......
          > > sQ:
          > > 57 blocks
          > > 2381 words
          > > average length 42 words
          > > 1881 possible 10 word agreements
          > > 12 actual agreements

          Ron:
          >
          > Firstly, what I found was the set of strings common to Matthew and Luke
          > having *more than* ten contiguous words, i.e. 11+
          > Thus 1602 should be replaced by 1584 and 1881 by 1824.
          >

          Dave:
          O.K. I'll change the calculation from 10+ to 11+. I'd expect this is
          a small effect.

          Ron:
          > Secondly you appear to be comparing apples and pears in the agreements. The
          > numbers 1584 and 1824 represent counts of the number of possible 11-word
          > strings (some of which will be overlapping). What I had counted were the
          > numbers and lengths of all the strings having more than ten words (none of
          > which overlap with each other by definition).

          Dave:
          I had given that some thought. Counting that way seems to greatly
          inflate the significance, and I don't think it is correct, although
          granted I did not formulate a precise argument as to why it is
          correct or not. Done the way you suggest, you get something like
          99.999 percentile significance, which does not seem to be the right
          order of magnitude for the numbers we're dealing with. Plus,
          considering a few extreme cases leads to absurd looking conclusions.
          So, without precise argument, I conclude we should not count that
          way.

          Rather, I would put it this way - there are 1824 places a string
          could start, and 12 places one actually does start.

          Then using the revised numbers, the finding is significant at the
          89th percentile, just short of one typical arbitrary cut-off.
          Regardless, it still adds something when combined with your other
          arguments.

          Here I should also note that I used a Bayesian credibility interval,
          rather than a traditional confidence interval. They give nearly the
          same result, although they say something subtly different. But in
          this case if we are looking for that last 1%, the other method might
          give results more to our liking, or it might be slightly worse.
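
          For readers who want to experiment, here is one possible Bayesian
          comparison of the two start-position rates. It is only a sketch and
          is not the spreadsheet calculation described above. It assumes a
          uniform Beta(1, 1) prior and treats each possible start position as
          an independent trial, which is itself a simplification. The function
          name is hypothetical, and the output should not be expected to
          reproduce the percentile figures quoted in this thread.

```python
import random

def prob_rate_higher(k1, n1, k2, n2, draws=20000, seed=0):
    """Monte Carlo estimate of P(rate1 > rate2) when each rate has a
    Beta(k + 1, n - k + 1) posterior, i.e. a uniform prior updated with
    k successes in n trials. Hypothetical helper for illustration only."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        r1 = rng.betavariate(k1 + 1, n1 - k1 + 1)
        r2 = rng.betavariate(k2 + 1, n2 - k2 + 1)
        if r1 > r2:
            wins += 1
    return wins / draws

# xQ: 23 string starts among 1584 possible; sQ: 12 among 1824 possible.
p = prob_rate_higher(23, 1584, 12, 1824)
print("Estimated P(xQ rate > sQ rate):", p)
```

          Different priors, or a different definition of a "trial", can shift
          the answer noticeably, which is part of why the two counting methods
          discussed above disagree so sharply.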

          Finally, one other potential problem - How was the "11+" criterion
          selected? Was that the first number you tried, or did you try other
          string length cutoffs first?

          Dave Gentile
          Riverside, IL
        • Ron Price
          Message 4 of 24 , Dec 18, 2007
            Dave Gentile wrote:

            > Then using the revised numbers, the finding is significant at the
            > 89th percentile, just short of one typical arbitrary cut-off.
            > Regardless, it still adds something when combined with your other
            > arguments.

            Dave,

            Thanks for carrying out this investigation.

            > Finally, one other potential problem - How was the "11+" criterion
            > selected? Was that the first number you tried, or did you try other
            > string length cutoffs first?

            Good question. I first tried 18+ and realized there were so few strings that
            the result was going to be too sensitive to the choice of cut-off. I wanted
            to choose a cut-off which was significantly lower than 18+, yet not so low
            as to necessitate too much effort (my procedure being part computerized and
            part manual). It also had to be not too near 14 as I had already observed an
            apparently more-than-average number of strings of this length with known
            assignment, and didn't want the result to be biased. I had also by this
            stage determined to use a single computer run, for which (as it happens) an
            odd number cut-off was more 'efficient'. Hence the 11+.

            Ron Price

            Derbyshire, UK

            Web site: http://homepage.virgin.net/ron.price/index.htm