Re: Statistical patterns

• Sep 1, 1997
On Mon, 1 Sep 1997, Timothy John Finney wrote:

> Here are the figures which I alluded to before:
>
> NOS = number of states = number of readings in a variation unit.
> FRQ = frequency = how often a given number of states occurs in the sampled
> variation units.
> FIT = fitted value using the equation F(n) = C x exp[-(a + bn)^2/2], C =
> 399, a = 1.50, b = 0.23.
>
> Hebrews + Romans (for variation units listed in the UBS 4th edn apparatus)
>
> NOS 1 2 3 4 5 6 7+
>
> FRQ ? 58 36 21 11 5 3
> FIT 89 58 36 21 12 6 6
>
> As you can see, the fit is quite good for 2 to 6 states.

Tim,

You do have an excellent fitting equation there. But with four parameters
at your disposal (C, a, b, and the choice of ^2 rather than some other
power) and only 5 or 6 data points, it had better fit pretty well!
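
For anyone who wants to check the arithmetic, here is a rough sketch in Python that recomputes the FIT row from the quoted equation and constants and compares it with FRQ for n = 2 to 6. The names `fitted`, `observed`, `fit_row`, `residuals` and `leftover` are only illustrative. The "7+" column is left out because it lumps several values of n together, and counting the three continuous constants as the free parameters is my own simplification.

```python
import math

# Constants quoted in Tim's message: F(n) = C * exp(-(a + b*n)^2 / 2)
C, a, b = 399.0, 1.50, 0.23

# FRQ values from the Hebrews + Romans table, n = 2..6 only
# (n = 1 is listed as "?" and "7+" is a lumped tail, so both are skipped).
observed = {2: 58, 3: 36, 4: 21, 5: 11, 6: 5}

def fitted(n):
    """Evaluate the quoted fitting equation at n states."""
    return C * math.exp(-((a + b * n) ** 2) / 2.0)

# Rounded fitted values for n = 1..6, for comparison with the FIT row.
fit_row = {n: round(fitted(n)) for n in range(1, 7)}

# Residuals (observed minus fitted) where an observed count exists.
residuals = {n: observed[n] - fitted(n) for n in observed}

# Data points left over after spending three continuous parameters
# (C, a, b); the choice of exponent is not counted here (an assumption).
leftover = len(observed) - 3

for n in sorted(observed):
    print(f"n={n}  FRQ={observed[n]:3d}  FIT={fitted(n):6.2f}  "
          f"resid={residuals[n]:+.2f}")
print("data points minus continuous parameters:", leftover)
```

With only two data points to spare, small residuals are to be expected and say little by themselves about whether the functional form is the right one.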

The amount of data you have above is around 50% (though still less in one
case) of what the Gospel parallels provide for duplicate word strings. If
your sample were of comparable size, it's quite possible that, if you were
to require a fit by a two-parameter curve, the simple exponential or
geometric progression would work better than anything else, once you
decided on just what the rules are for counting variants, etc.
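
As a rough illustration of what a two-parameter fit would involve, the sketch below fits a geometric progression F(n) = A * r^n to the FRQ values for n = 2 to 6 by ordinary least squares on log(FRQ). This is only one way of doing such a fit: the log transform weights the small counts more heavily than a direct least-squares fit would, and the names `A`, `r` and `geometric` are illustrative. It makes no claim about which curve is better for this small sample.

```python
import math

# FRQ values from the Hebrews + Romans table, n = 2..6.
observed = {2: 58, 3: 36, 4: 21, 5: 11, 6: 5}

ns = sorted(observed)
ys = [math.log(observed[n]) for n in ns]  # log(FRQ) is linear in n if geometric

# Ordinary least-squares line through (n, log FRQ).
n_mean = sum(ns) / len(ns)
y_mean = sum(ys) / len(ys)
sxx = sum((n - n_mean) ** 2 for n in ns)
sxy = sum((n - n_mean) * (y - y_mean) for n, y in zip(ns, ys))
slope = sxy / sxx
intercept = y_mean - slope * n_mean

# Back-transform to the two parameters of F(n) = A * r**n.
A = math.exp(intercept)
r = math.exp(slope)

def geometric(n):
    """Two-parameter geometric progression fitted above."""
    return A * r ** n

for n in ns:
    print(f"n={n}  FRQ={observed[n]:3d}  geometric fit={geometric(n):6.2f}")
print(f"A = {A:.1f}, common ratio r = {r:.3f}")
```

A common ratio below 1 simply restates the monotonic fall-off; the interesting question is whether the counts at large n stay on such a curve.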

But I don't see what gain in knowledge that would produce. Surely those
verses or sentences that exhibit an unusually large number of variants will
be relatively rare, if for no other reason than by the definition of
"unusual." So you're almost bound to find some monotonically decreasing
curve that will approximate the distribution. In one case the very few
rare values of FRQ=1 or 2 for large n may occur at n = 8,9, and 11, say,
and in another at n = 8,10 & 13, say -- this I have been referring to as
"sampling error."

Although a similar statement could be said of the Gospels' duplicate
word-string parallels, there we run into the peculiarity that what occurs
in the region you labelled "7+" exhibits far too many occurrences to be at
all consonant with the monotonic fall-off shown for n=3,4,5,6.

After finding zero occurrences for n = 8 or 10 or 12, do you then notice
one or two occurrences at n=13 and others at n=16 and 17? If so, I think
you'd be interested in knowing why -- was the sentence
so difficult to understand, relative to almost all other sentences in the
gospel/book, that it caused a huge number of variants? Or was the grammar
of the sentence so bad in the earliest ms that it caused later dependent
mss to correct it in many various ways? If so, why were these anomalous
sentences so much more anomalous than others that it caused a disruption
in the monotonic decrease of FRQ with n, with an upturn after many zeroes?
Or can such anomalies be simply explained as inevitable sampling error?

With the Gospels' duplicate word-string parallels, one can also examine
the anomalies, with the importance being that one can see if they fit into
any proposed solutions of the Synoptic Problem -- bolster one and rule
others out. Although this latter problem may not be of particular
interest to TC, I still wonder if TC can hazard any guesses as to whether
or not harmonistic corruption could have caused the anomalies in the
frequency distribution at large values of n or "I," as I called it.

Jim Deardorff