 Hi all,
I've been following Parallelmania (as I think it has very cleverly been dubbed) with some interest, and I have a quick question, probably mostly for Mike, but certainly not exclusively: what does the existence of a gap in the "Markan" (and I use scare quotes because I have no interest in giving Mark any priority here) material in Thomas signify? Or perhaps, more realistically: what are we trying to argue in identifying such a gap? For those who argue that Thomas is dependent on the synoptic Gospels I suppose this might be of some interest, but for those like myself who agree with Patterson, DeConick and the like that Thomas is independent of the New Testament, what are we trying to say? Does the "Markan" material show significant ideological discrepancies from the "Q" material and the distinctly Thomas material? If so, this would certainly be of interest, but otherwise I'm not quite sure what our end is. I don't want to discourage this kind of comparison, but it seems to me that unless we're arguing for literary dependence it might be helpful to identify ideological differences in the "Markan" material that may have caused it to be absent from a significant chunk of Thomas.
I hope this serves to open discussion rather than close it off, and I hope this functions as an occasion for thought rather than a dismissal of what I think could be a very interesting project.
Ian

Hi Mike,
Thanks for the spreadsheet, but it's not quite the same thing.
http://groups.yahoo.com/group/gthomas/files/T5G.pdf
I didn't list Q, since it's such a matter of opinion, and some don't even think it exists. (I don't think so, as I think the dead even distribution of Matthew/Luke in Thomas demonstrates.) In discussions I have tried to carefully define it objectively as Matthew and Luke, but not Mark.
And I didn't list their opinion of the source of the parallel, such as #9 coming from Mark. It assumes what you are trying to prove. I could mention it as a footnote that T5G thinks it comes from Mark, certainly valuable information, but my intention is to stay as objective as possible, and show all the parallels and let the reader decide for themselves.

In the past, I did the calculation of the probability of that Mark gap for T5G parallels, T5G parallels and Cf's, and the even more loose, "Funk" parallels, and the math came out about the same, like it did by saying vs. by subsaying. It's really there, and it's really a one in a hundred chance at Las Vegas that it would be there by random luck.
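For anyone who wants to sanity-check the one-in-a-hundred figure, here is a minimal Python sketch. It is not the equation referred to in this post. It is a standard inclusion-exclusion count, and it assumes 114 sayings, 21 Mark parallels at the sayings level, and a gap of 32 sayings (67 through 98), with every placement of the Mark parallels equally likely. The function name `prob_longest_gap_at_least` is made up for illustration.

```python
from math import comb

def prob_longest_gap_at_least(total, marks, gap):
    """Exact chance that `marks` hits scattered uniformly at random over
    `total` sayings leave at least one unbroken run of `gap` or more misses."""
    slots = marks + 1                  # stretches before, between and after the hits
    all_layouts = comb(total, marks)   # equally likely placements of the hits
    favourable = 0
    for j in range(1, slots + 1):      # inclusion-exclusion over j long stretches
        remaining = total - j * gap
        if remaining < marks:
            break                      # no room left for j stretches that long
        favourable += (-1) ** (j + 1) * comb(slots, j) * comb(remaining, marks)
    return favourable / all_layouts

# Assumed inputs: 114 sayings, 21 Mark parallels, a 32-saying gap (67-98).
p_gap = prob_longest_gap_at_least(114, 21, 32)
print(f"P(longest non-Mark run >= 32) = {p_gap:.4f}")
```

With those assumed inputs the result lands close to one in a hundred. Changing the Mark count to 23, or the gap length, is a one-line edit, which makes it easy to see how sensitive the figure is to the disputed counts.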
I have an improved version of the equation, an approximation not an inequality; maybe I'll test it with your chart.

"He who has ears, let them hear" is tough. I don't think it should be included, so I've just pretended it doesn't exist, but I worry about it, because it's a special rule applying to nothing else. It's not written in stone that it doesn't come from Jesus, or that the pattern in the Synoptics doesn't tell us something. I don't know if there is a right or wrong.
Yes, I need to update that chart, but I'm thinking more in terms of Flash, where the parallel pops up when you hover over it or something. And I need to input the data first.
In the meantime, it has utility, such as showing 75% of Funk's Jewish/Hebrew parallels being between 5 and 25. And all but one of the 17 Dialog of the Savior parallels are also in Matthew and usually Luke. (Except #37 about getting naked.) Where else can you see that at a glance?
Richard Van Vliet

Hi Bob,
Thanks for the suggestion. Very interesting. I don't see why it wouldn't work with subsayings too. But as a quick check, using the chart at http://www.kingdomofthefather.com/About.html
I get the following data:
N1 = 23 Marks
N2 = 91 non-Marks
Runs = 33
And from their calculator, I get:
Expected Number of Runs: 37.7; sd: 3.4069
zvalue= -1.3852; approx. probability: 0.08299
zval= -1.238(continuity correct.); p: 0.10777
exact probability of 33 or fewer runs= 0.11837
Between one and two standard deviations. About one in ten.
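The calculator's first line of output can be reproduced by hand. What follows is a minimal Python sketch of the usual normal approximation to the Wald-Wolfowitz runs test, using the three counts above. The function name `runs_test` is hypothetical, and the sketch does not attempt the calculator's continuity-corrected or exact figures.

```python
from math import sqrt, erf

def runs_test(n1, n2, runs):
    """Normal approximation to the Wald-Wolfowitz runs test.
    Returns (expected runs, sd, z, lower-tail p)."""
    n = n1 + n2
    expected = 2 * n1 * n2 / n + 1
    variance = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n * n * (n - 1))
    sd = sqrt(variance)
    z = (runs - expected) / sd
    p_lower = 0.5 * (1 + erf(z / sqrt(2)))   # P(Z <= z): this few runs or fewer
    return expected, sd, z, p_lower

# Counts quoted above: 23 Marks, 91 non-Marks, 33 runs.
expected, sd, z, p = runs_test(23, 91, 33)
print(f"expected {expected:.1f}, sd {sd:.4f}, z {z:.4f}, p {p:.5f}")
```

A negative z means fewer runs than expected, that is, more clumping than a random shuffle would typically give.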

But hardly a gospel, just another test. In this case it is the probability of the data changing runs that many times, just as my calculation is the probability of the longest data run. It's easy to demonstrate cases where either would say things are perfectly normal, and they would obviously be bizarre and nonrandom algorithmically. Both are good tests. I take the difference between the two, around 90% vs. around 95%, as telling us that the big gap dominates the nonrandom part. For instance, doing the same calculation for the first 66 sayings...
N1 = 19 Mark
N2 = 47 non-Mark
Runs = 26
And from their calculator we get:
Expected Number of Runs: 28.1; sd: 3.2939
zvalue= -0.6255; approx. probability: 0.26579
zval= -0.473(continuity correct.); p: 0.31782
exact probability of 26 or fewer runs= 0.30649
Which tells us it tends to clump together a bit, as seems obvious by looking at it, but far less than a standard deviation, so it definitely could be random.
I'm confident the Matthew, Luke and No parallels will be about the same. Clustered a little more than expected, but could be random.
Seems to me, something very strange is going on between 67 and 98 inclusive, yet has no impact on the Matthew, Luke and No parallels. Any theory of where Thomas comes from needs to address it.
Richard Van Vliet

Hi Ian,
I see it as demonstrating the opposite. That Thomas is the source of the parallels.
One model that explains all the nonrandom data I'm aware of, the most Occam's Razor that I know of, is that there was a Christianized version of the Gospel of Thomas, in essentially the same order, that Mark, Matthew, Luke, Dialog of the Savior, (Etc.?) used. And Mark didn't use sayings 67-98 for whatever reason. That simple.
Matthew and Luke are random from beginning to end, because they used all of that Christianized Thomas.
There is a Matthew in all Mark's except 1 and a Luke in all except 2, because if it was in Mark and Christianized Thomas they took it for sure with two sources saying so.
The gaps are random, because Matthew and Luke are random, and Mark always sits on top of a Matthew/Luke, not a gap.
Dialog of the Savior has a Matthew 16 out of 17 times, almost never a gap, because it was using that same Christianized Thomas, perhaps a bit more evolved by then. It picked #37 about getting naked, and Matthew/Luke didn't, because they were prudes.
The Hebrew/Jewish parallels are overwhelmingly in the first part Mark read, perhaps because it was less secret. The clear offenders of the 10 Commandments are in the second part, not honoring one's father and mother for instance.
All patterns are true for the Greek version too.
Richard Van Vliet

Here's a simple example of how the "Wald-Wolfowitz Runs Test" can get randomness flat wrong...
http://www.kingdomofthefather.com/RandomnessTest.html
(A couple could be clumped together to make it perfectly random on the "Wald-Wolfowitz Runs Test" if desired.)
But the one with all the black squares in the left half obviously fails the "Van Vliet Poppycock Test For Probability of Longest Run".
I suppose there may be equal or greater examples to the contrary, but they elude me at the moment.
Richard Van Vliet

At 08:27 PM 5/11/2010, kurt31416 wrote:
> Here's a simple example of how the "Wald-Wolfowitz Runs Test" can get randomness flat wrong...
> http://www.kingdomofthefather.com/RandomnessTest.html
> (A couple could be clumped together to make it perfectly random on the "Wald-Wolfowitz Runs Test" if desired.)
I fail to see how this is "flat wrong". You don't run the numbers and you don't cite any probabilities of anything.
Please clarify.
Bob Schacht
> But the one with all the black squares in the left half, obviously fails the "Van Vliet Poppycock Test For Probability of Longest Run".
> I suppose there may be equal or greater examples to the contrary, but they elude me at the moment.
> Richard Van Vliet

 Hi Bob,
Sorry for the weak explanation. Perhaps an example would be better. A parable so to speak...
Some cases, where all Mark parallels are in the first half of Thomas are considered perfectly random. Example available upon request.
Seems reasonable that if one found all parallels to Mark in the first half of Thomas, that's not a random distribution.
All it measures is the probability of having that many "Runs" of all blacks or all whites, it doesn't care if they're all in the first half.
I was going to derive something like it, since I confess I was ignorant of it before you brought it up. I was going to base it on the probability of the next saying being what it turned out to be, and whether there was a disproportionate probability of it being the same thing instead (the probability of small-scale clumping, I suppose), and run down the list. It may be logically the same thing. Close enough. A fine tool, thanks. If you know of someone who used the longest run, instead of the number of runs, before I did, please let me know.
Best Regards,
Richard Van Vliet

Rick,
I'm going to respond here to both your note to me and your note
to Bob. I'll try to be as civil as possible, but I have to admit being
steamed, because I think your purported "response" to me was
anything but. You rambled on about several subjects, but nothing
about the major concern of my note, which was to reconcile our
T5G numbers. Mind you, I'm talking about _basic_ data, not the
calculations the numbers are put through. I'm trying to make it
so that the data that goes into those calculations isn't garbage,
because you know what'll come out if it is. Furthermore, no one
is going to trust any result of your calculations if they don't trust
you to get the basic input data right to begin with.
Let's take an example from your note to Bob. You wrote:
> ... as a quick check, using the chart at
> http://www.kingdomofthefather.com/About.html
> I get the following data: N1 = 23 Marks ...
You then proceed to plunk the number 23 into a calculation.
But the number 23 is wrong. There aren't 23 Markan hits at the
sayings level, there's 21. I pointed out that that chart had errors in it,
and that it and your other sayings-level chart "need to be fixed if
you're going to refer to them again." And yet that is exactly what
you didn't do. Instead, you used the erroneous numbers exactly
as originally shown. The chart in question shows L6 to be a Markan
hit, but it isn't. It shows L24 to be a Markan hit, but it isn't. It shows
L63 to be a Markan hit, but it isn't. AND it shows L33 NOT to be a
Markan hit, but it is. I'm not going to let you get away with using
either of your sayings-level charts, or any numbers based on them,
until they're fixed.
Your response would be, I take it, that you're not sure what to
do with the "ears" subsayings that figure in L6, L24, and L63.
You wrote a rambling paragraph about that that I couldn't make
much sense of, but nothing you said alters the fact that these
subsayings simply cannot be regarded as Markan parallels,
hence the sayings within which they occur cannot be regarded
as Markan hits (sans other Markan content). So stop doing it.
Look, either you're using T5G or you're not. If you're applying
your own modifications to their data, then your data is worthless
unless you specify exactly how your data differs from theirs and
why. If you're interested in other data bases as well, fine, but
if someone challenges you on your handling of a particular
data base, such as I've done with respect to T5G, you need to
meet that challenge head-on. So far, you haven't done so.
Mike G.

Mike, in a nutshell, the data is the parallels and Cf's listed in the margin of the Five Gospels, by subsayings. I believe you used the "sources". We both recorded different things from the margin.
No one is modifying the data as far as I know, certainly not me.
And the math comes out about the same including the "he who has ears" or not, as I previously demonstrated and can again, sayings level or subsayings level. As I said, it was a quick check.
Richard Van Vliet

At 09:38 PM 5/11/2010, kurt31416 wrote:
> Hi Bob,
> Sorry for the weak explanation. Perhaps an example would be better. A parable so to speak...
> Some cases, where all Mark parallels are in the first half of Thomas are considered perfectly random. Example available upon request.
> Seems reasonable that if one found all parallels to Mark in the first half of Thomas, that's not a random distribution. ...
You are leaving a confusion of issues here. Best to keep straight exactly what the hypothesis is for any given test.
For the Wald-Wolfowitz runs test for GTh as a whole, the hypothesis is that the Markan parallels are distributed at random throughout the document as a whole (and that the "gap" in the middle is a fluke).
But now you want to chop up the document into 3? sections, and you appear to have special hypotheses for each section. It is quite possible that the Markan parallels could be randomly distributed in the first segment, utterly absent in a nonrandom way from the second segment, and randomly distributed in the third segment. Thus segmentized, you would have to run the statistics separately on each segment, or else use a Chi-Square test for the three segments by presence or absence of Markan parallels. The hypothesis involved there is that GTh actually consists of three contiguous documents that the scribes treated differently for some reason (e.g., they drew randomly from Mark for segments 1 & 3, and not at all from Mark in segment 2; or Mark drew at random from segments 1 & 3, but for some reason ignored segment 2 entirely).
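As a rough illustration of the three-segment Chi-Square check described here, the sketch below uses counts pieced together from earlier posts in this thread, and they are assumptions: 19 Marks in sayings 1-66, none in 67-98, and the remaining 4 of the 23-Mark chart in 99-114. Two caveats apply. The segment boundaries were picked after looking at the data, and the expected Mark count in the last segment is below 5, so the p-value is only indicative. The function name `chi_square_segments` is made up.

```python
from math import exp

def chi_square_segments(segments):
    """segments: list of (marks, non_marks) per contiguous block of sayings.
    Returns (chi-square statistic, degrees of freedom)."""
    total_marks = sum(m for m, _ in segments)
    total_all = sum(m + n for m, n in segments)
    chi2 = 0.0
    for marks, non_marks in segments:
        size = marks + non_marks
        exp_marks = size * total_marks / total_all   # expected if Marks were spread evenly
        exp_non = size - exp_marks
        chi2 += (marks - exp_marks) ** 2 / exp_marks
        chi2 += (non_marks - exp_non) ** 2 / exp_non
    return chi2, len(segments) - 1

# Assumed counts for sayings 1-66, 67-98 and 99-114 (23-Mark chart).
chi2, df = chi_square_segments([(19, 47), (0, 32), (4, 12)])
p_value = exp(-chi2 / 2)   # upper-tail probability; this shortcut is exact only for df == 2
print(f"chi-square {chi2:.2f}, df {df}, p {p_value:.4f}")
```

Swapping in the corrected 21-Mark counts only requires changing the list of tuples.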
However, there is a flaw in using the Wald-Wolfowitz runs test at all, which I alluded to in previous correspondence: our situation with GTh and Markan parallels seems to be a "sample without replacement," if it is the case that once a Markan parallel is used, it is no longer available to be used again elsewhere in GTh (is that true?). I believe that the WW runs test assumes sampling *with* replacement, IIRC.
Bob Schacht

Hi Bob,
I've updated that page to give a specific example, where the Wald-Wolfowitz runs test considers the "21" Mark parallels perfectly random, with them all in the first 39 sayings of Thomas. Just do the first three and then every other one after that.
N1 = 21
N2 = 93
Expected runs = 35.3
It's actually a little over spread out according to Wald-Wolfowitz. Ending it with #37 is equally random.
You may need to hit refresh.
http://www.kingdomofthefather.com/RandomnessTest.html
Surely if all 21 Mark parallels were in the first 39 sayings it's probably not random.
Richard Van Vliet

Hi Bob,
I'm not chopping up Thomas when I say all the Mark parallels could be in the first third and it would calculate out as being random. I'm considering all 114 sayings and pointing out some of the logical possibilities, where the test fails miserably.
Richard Van Vliet

Richard V. wrote:
> Mike, in a nutshell, the data [I used] is the parallels and Cf's
> listed in the margin of the Five Gospels, by subsayings. I
> believe you used the "sources". We both recorded diffe[re]nt
> things from the margin.
That may account for some of the differences, but also since
you haven't provided any detail behind your summary counts,
I have to assume that you get your numbers by manually
counting your color-coded items. Such a process is prone to
error, both in the counting and in the original color-coding.
Indeed, even if the original color-coding is accurate, you might
come up with different numbers every time you count. Needless
to say, that doesn't instill confidence in your numbers. What's
needed is to transform your color-coding into little bitty numbers
that the computer can add up, and which you can share with
others for verification. And don't say, "Be my guest". You should
have done it yourself long ago. It'd be much more helpful than
the splashy graphics.
> No one is modifying the data as far as I know, certainly not me.
Well maybe not intentionally, but that was the effect of what you
did with the ears subsayings. Not taking the "Source" line in T5G
(which attributes these as "common lore") at face value, you read
the line above it (which says "Mk 4.9, etc.") as indicating a Markan
parallel, unfortunately ignoring the import of the "etc". In fact, ears
subsayings occur all over the NT, notably in Rev (7 times for the
7 churches, plus one more for good measure later on).
Aside from that, you've implied that your numbers represent
T5G parallels, when in fact the inclusion of cf's makes that claim
questionable, to say the least. While this may not count as
modifying the T5G data, it may count as misrepresenting it.
Mike

At 10:44 PM 5/11/2010, kurt31416 wrote:
> Hi Bob,
> I'm not chopping up Thomas when I say all the Mark parallels could be in the first third and it would calculate out as being random.
Have you done the calculations to demonstrate this? If so, show us your work.
Or is this just a thought experiment?
What do you mean by "calculate out as being random"? What p value do you get?
Of course there is a possibility that this could happen, but it's not likely.
Bob Schacht

Hi Bob,
Yes, I'll be happy to spell out the calculations explicitly. Might be some skimming this that don't realize how easy it is. The Wald-Wolfowitz Runs Test is straightforward. Referring to the bottom figure at...
http://www.kingdomofthefather.com/RandomnessTest.html
...Which refers to one of the many logical possibilities of all Mark parallels being in the first half. In this case, the first 39 sayings.
Only three things to count.
N1 = # of Mark Sayings = 21
N2 = # of non-Mark Sayings = 114 - 21 = 93
And the only one that might appear complicated, but is straightforward...
Runs = number of all-Mark and all-non-Mark blocks = 37 (The three black ones at the beginning is 1, and then each black and white after is 1, and the 114 - 39 sayings after 39 being 1.)
Plugging those three into your excellent online calculator at...
http://www.quantitativeskills.com/sisa/statistics/ordinal.htm
...we get...
Expected Number of Runs: 35.3; sd: 3.1758
zvalue= 0.54689; approx. probability: 0.70777
zval= 0.7043(continuity correct.); p: 0.75939
exact probability of 37 or fewer runs= 0.77331
Less than one standard deviation, and very average. I deliberately made it high like that, because I wanted to overestimate the number of sayings I could cram them into. For instance, in your mind, put that black #39 Mark in saying 4 where there's now a non-Mark. In other words, 5 Marks in a row at the beginning and the last Mark at saying 37. In that case...
N1 = 21 like before
N2 = 93 like before
Runs = 34 three less than before
Plugging those into the online calculator we get...
Expected Number of Runs: 35.3; sd: 3.1758
zvalue= -0.3977; approx. probability: 0.34541
zval= -0.240(continuity correct.); p: 0.40505
exact probability of 34 or fewer runs= 0.36981
Also very random. If I played with it, and made one with 35 Runs, half way between the two, it would be almost perfectly random as the "Expected Number of Runs: 35.3" told us all along.
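The second layout is easy to rebuild and count mechanically. The Python sketch below assumes the layout is exactly as described in words (Marks at sayings 1-5, then every other saying from 7 through 37, everything else non-Mark). It has not been checked against the figure on the linked page. The helper names `count_runs` and `longest_run` are made up for illustration.

```python
def count_runs(seq):
    """Number of maximal blocks of identical neighbours in a non-empty seq."""
    return 1 + sum(1 for a, b in zip(seq, seq[1:]) if a != b)

def longest_run(seq, value):
    """Length of the longest unbroken stretch of `value` in seq."""
    best = current = 0
    for item in seq:
        current = current + 1 if item == value else 0
        best = max(best, current)
    return best

# Hypothetical layout from the second example: Marks at sayings 1-5,
# then every other saying from 7 through 37; the rest are non-Marks.
mark_sayings = set(range(1, 6)) | set(range(7, 38, 2))
layout = ["M" if s in mark_sayings else "n" for s in range(1, 115)]

n_marks = layout.count("M")
n_runs = count_runs(layout)
gap = longest_run(layout, "n")
print(n_marks, n_runs, gap)   # Marks, runs, longest non-Mark stretch
```

Under that assumption the count comes to 21 Marks and 34 runs, matching the numbers fed to the calculator, while the longest non-Mark stretch is 77 sayings. The run count alone never sees that stretch, which is the contrast being drawn between the two tests.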

Therefore, the Wald-Wolfowitz Runs Test fails miserably. It says it's perfectly random to have all 21 Mark sayings located in the first 35 sayings. And we don't need math to know that's false.

I can generate thousands, maybe millions of similar examples of all Mark sayings in the first half, and the Wald-Wolfowitz Runs Test saying it's perfectly random.
Can anyone state one single case where the Van Vliet Longest Run Test fails even remotely as miserably? I confess, I can't think of one.
The bottom line is that the Wald-Wolfowitz Runs Test is worthless for evaluating large gaps. It only checks the "graininess", not the big chunks.
Richard Van Vliet

Mike,
1. They don't include Cf's. I included the Cf's on the site, but not in the calculations. it's pure parallels.
2. The "He who has ears" wasn't counted as Mark on the site. It's gray, meaning none. White would be all three, Red would be Mark. (And fine, I agree if one had to pick one, excluding it is probably best, but it's a special exemption, and it's far from clear it didn't come from Jesus, as far as I know, and far from clear the location of its parallels is of no merit. And as I've demonstrated, it doesn't dramatically alter the result.)
3. I'm gonna stick with fancy graphics. I think it's key to making progress in Biblical Scholarship. It's the graphics that tipped me off about that unusual Mark gap, (and the other nonrandom sortings.) I've reported the results of my experiment, and given the tools where others can do the experiment themselves. Counting those colors is far faster than any other method.
4. Is there a saying where it appears it's not the parallels? (Or actually isn't)?
Richard Van Vliet

I poorly said:
"1. They don't include Cf's. I included the Cf's on the site, but not in the calculations. it's pure parallels."
The Cf's are only on the individual sayings pages, and have nothing to do with the Synoptic Rainbow, the color coding by T5G subparallel. The Cf's have never been involved in any math I've done here.
