## Fw: [Synoptic-L] Some numerical results

> Hello again,
> More comments in-line. (with *****)
> > Now, if we do this for every word, and we start to see that words that
> > are more frequent than expected in "222" are also more frequent than
> > expected in "221" then we are seeing that "222" and "221" have a
> > similar preference for words.
> If I am understanding the experiment correctly, I wonder here (and
> elsewhere) about the problem of circularity. Surely by definition we
> will expect 222 and 221 to have a similar preference for words? The
> Matthew // Mark direct parallel common to both of these categories
> will ensure that the results are at least pretty similar, won't they?
> That's not necessarily because of common "authorship" or traditions;
> it's just that the way the experiment is set up makes it inevitable
> that 222 and 221 will come out similarly, as also 122 and 222. Or is
> that not right?
> *****Well, remember that we are already adjusting for general frequency,
> so we're looking for different tendencies.
> Let's look at some examples to see if they conform with expectations.
> 112 and 211 and 121 should all look different. They do.
> 002 and 200 and 020 should all look different. They do.
> 112 looks like 002
> 211 looks like 200
> The fact that it can tell us that much of what we already know should give
> us some confidence. So when it fails to say that 020 looks like 121, we
> have to ask why.
> Let's look at 121 221 and 211.
> Let's say we have 3 different hypotheses.
> A) Mark copied Matthew
> B) Matthew copied Mark
> C) They both copied a common source.
> What would we expect to see?
>
> If B is true then 121 and 221 are just different samples of the same
> original text.
> Both were written by Mark. They should look the same.
> But 211 represents words that Matthew chose, so we would expect it to look
> different.
>
> If A were true, we'd expect the opposite.
>
> If C were true then 211 might look like 221 AND 121 might look like 221,
> since both reflect the original text.
>
> The results:
> 211 - 221 - uncorrelated
> 121 - 221 - correlated at 99.99% level.
> We can reject hypothesis A, and favor B. (We can't completely eliminate C)
>
>
> > Now I'm using alpha-delta, so there is more data. I'll add more, with
> > time. I think you have my intent correct. If "200" has a similar
> > style to "211" we can view Matthew as one author's style. If they look
> > different then we suspect Matthew was copying another source in one
> > place, and creating on his own in another.
> I don't know; would that be the only legitimate conclusion? If 200
> and 211 look different, surely it might be that Matthew's attitude to
> triple tradition material differed from his attitude to Sondergut.
>
> **** Doesn't that imply at least one source?
> That was my point.
>
> And there are the questions Stephen raised about genre and the like.
> Q theorists have long known, for example, that Matthew and Luke are
> much closer together in Q material than they are in Markan material,
> something that does not cause them to question their hypothesis.
> **** I don't think this is directly related to the question.
> On the two source hypothesis:
> 201 should look like 202
> 102 should look like 202
> They do.
>
> 221 should look like 121.
> 221 should not look like 211.
>
> Again true.
>
> 122 should look like 121.
> It does.
>
> 122 should not look like 112.
> Oops, it does.
>
>
>
> Rather it suggests to them that Matthew's and Luke's mutual attitudes
> to Q overall differed from their attitudes to Mark overall, perhaps
> influenced further by the fact that the double tradition is
> proportionally more sayings-rich than is the triple tradition.
> ****Again, I think you're slightly off track here. Although, I may have
> just missed your point. That's true, but not directly relevant. At least as
> far as I can see.
> > 121 and 120 we would expect to be correlated on almost any
> > hypothesis.
> > For example, we might notice the word "immediately" appears in both
> > 121 and 120 a lot more than in say "002" or in all categories in
> > general. 121 and 120 seem to have a similar choice of words. They are
> > likely both written by the same author, who liked the word
> > "immediately".
>
> Does it necessarily imply that? Would one not simply expect a rough
> correlation between 121 and 120 by definition, because of the common
> 12- element? Isn't this much the same as with 222 and 221 (etc.)
> above? Likewise, we wouldn't expect 121 and 120 to show a rough
> correlation with 002, again by definition, because the latter is
> constituted by those words without direct or indirect parallel in
> Mark and Matthew, the very thing that is key in 121 and 120.
> ****Would we expect 002 to show a rough correlation with 121?
> Absolutely not, once we've adjusted for general frequency.
>
> They are anti-correlated in fact.
>
> Let me go through the steps, using the "rock" analogy.
> Say I'm trying to tell the difference between some rock samples, and
> I have a count of different types of atoms.
> If I feed these raw numbers into a pattern recognizer, it
> first tells me some rocks have more atoms than others.
> (Some rocks are big.)
> This would be the same as saying that 200 has more words than 212.
> So I convert to frequencies.
> Now I ask the pattern recognition tool again, and it says
> these things all have a lot more iron than gold, in short
> they are all rocks. Great.
> This would be like discovering all the texts use "the" a lot
> more than "donkey". So we adjust by comparing the rocks to average rocks,
> or the text to the average text. Now we can find out that rock A and B
> both have more iron than rock C. Or text A has more "immediately"s than
> text B.
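The three steps of the rock analogy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy counts are invented here, the "average text" is taken to be the pooled counts of all categories, and the adjustment is a simple difference from the pooled frequency. The adjustment actually used in the study may differ.

```python
from collections import Counter


def relative_frequencies(counts):
    """Convert raw counts to frequencies so that category size drops out."""
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def adjusted_profiles(category_counts):
    """Subtract the pooled ("average text") frequency from each category.

    Assumption: the average text is the pooled counts of all categories,
    and the adjustment is a plain difference of frequencies.
    """
    pooled = Counter()
    for counts in category_counts.values():
        pooled.update(counts)
    pooled_freq = relative_frequencies(pooled)

    profiles = {}
    for name, counts in category_counts.items():
        freq = relative_frequencies(counts)
        profiles[name] = {
            word: freq.get(word, 0.0) - pooled_freq[word] for word in pooled_freq
        }
    return profiles


# Toy counts, invented for illustration only (not data from the study).
toy = {
    "A": {"the": 50, "immediately": 10, "donkey": 0},
    "B": {"the": 100, "immediately": 2, "donkey": 2},
}
profiles = adjusted_profiles(toy)

# Raw counts say B is bigger; frequencies say both use "the" a lot;
# only the adjusted profile shows that A over-uses "immediately".
for name, profile in profiles.items():
    print(name, {word: round(value, 3) for word, value in profile.items()})
```

A positive entry means the category uses the word more than the average text does, and a negative entry means it uses the word less.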
>
>
> > The fact that 121 and 120 correlate does not say
> > anything surprising. It just confirms that the procedure is working.
> > (If 121 and 120 did not look related, we might question the test) But
> > when 122 and 112 look similar, we are saying something more
> > significant. We are saying that Mark/Luke agreement has a style that
> > matches Luke alone to some extent. This could be explained by Luke and
> > Mark copying a source and Mark altering it sometimes. The fact that
> > 122 looks like 112 and that 122 looks like 121, strongly suggests that
> > Mark and Luke independently copy and alter a source, so that both
> > look somewhat like the source.
> Again, I'm not yet convinced of this. By definition, surely we would
> expect 122 to show a rough correlation with 112, and 122 with 121,
> and so on, because there is commonality in respective definitions of
> these categories. But even if this were not the case, I don't think
> your conclusion would follow from the data. It could be that when
> Luke reads Mark, he tends to take over the most congenial ("Luke-
> pleasing") words, the same words he tends to add himself where Mark
> is not directly parallel, and in this way 112's correlation with 122
> would be exactly what Luke's use of Mark would lead us to expect.
>
> ****Let's say "hat" is a Luke-pleasing word. Luke would use it in text he
> wrote himself. (002) He likes to use it 10% of the time.
> But Mark's text only has "hat" 1% of the time.
>
>
> If he alters Mark's text he can make 112 look like 002.
> He can't make 122 look like 002.
>
> Could we construct specific examples with specific words to fool it? Yes.
> Is it realistic to think that Luke, by copying Mark, could cause
> 112 to look like Mark for all words? Not very likely. He'd have to edit
> Mark in Mark's style.
>
> I think we're asking the right questions, in any case.
> :o)
> Dave Gentile
> Riverside Illinois
> M.S. Physics
> PhD Management Science candidate
Thanks for the further clarifications and explanations. If I may
come back with further perhaps misguided questions as I try to get my
mind around this:

I wrote:

> > If I am understanding the experiment correctly, I wonder here (and
> > elsewhere) about the problem of circularity. Surely by definition
> > we will expect 222 and 221 to have a similar preference for words?
> > The Matthew // Mark direct parallel common to both of these
> > categories will ensure that the results are at least pretty similar,
> > won't they? That's not necessarily because of common "authorship" or
> > traditions; it's just that the way the experiment is set up makes it inevitable
> > that 222 and 221 will come out similarly, as also 122 and 222. Or
> > is that not right?

Dave replied:

> > *****Well, remember that we are already adjusting for general
> > frequency, so we're looking for different tendencies.

Sure, I understand that, but what I am trying to get my head round is
the fact that 221 and 222 (to stick to this example) by definition
are very similar types of material and so, by definition, will end up
producing similar results. We are talking here about words that are
Matthew // Mark // Luke and words that are Matthew // Mark, diff.
Luke. The common element in both is Matthew // Mark. Before we have
even asked the question about authorship, traditions, etc., we would
expect the similar profile of the section to produce a similar
profile of results, wouldn't we? If not, can you explain to me why
not?

Dave wrote:

> > Let's look at 121 221 and 211.
> > Let's say we have 3 different hypotheses.
> > A) Mark copied Matthew
> > B) Matthew copied Mark
> > C) They both copied a common source.
> >
> > What would we expect to see?
> >
> > If B is true then 121 and 221 are just different samples of the same
> > original text. Both were written by Mark. They should look the same.
> > but 211 represents words that Matthew choose, we would expect it to
> > look different.

This gets to my main concern with the project as you are currently
expressing it. The move from correlations / anti-correlations
between material to source-critical inferences is a bit too quick for
me -- I'd like to see the assumptions behind the source critical
inferences clearly spelled out. For example here, for the sake of
argument, would we necessarily expect 121 and 221 to "look the same"?
They are clearly not "just different samples of the same original
text" in the sense of random samples. 221 is constituted by words
that are Matthew // Mark, diff. Luke and 121 are words that are
constituted by Matthew, diff. Mark, diff. Luke. The 221 words, on
the assumption of Markan Priority, are the words Matthew finds
congenial in Mark; the 121 words are the words both Matthew and Luke
find uncongenial. In other words, it wouldn't necessarily be
surprising to see a difference there. As it happens, they come out
similarly, but I don't find that particularly striking because it's
not necessarily what I would have expected. What I am interested to
know is, how can you be so confident in your expectation of what
Matthew's use of Mark would produce? I don't know that I would have
the same expectation. [This example has the advantage of being
uncontroversial to the extent that I accept Markan Priority so
there's no worry I'm grinding an axe in disputing your expectation.]

> > If A were true, we'd expect the opposite.

Again, could you explain why we'd expect that?

Dave had written:

> > > Now I'm using alpha-delta, so there is more data. I'll add more,
> > > with time. I think you have my intent correct. If "200" has a
> > > similar style to "211" we can view Matthew as one author's style.
> > > If they look different then we suspect Matthew was copying another
> > > source in one place, and creating on his own in another.

I asked:

> > I don't know; would that be the only legitimate conclusion? If 200
> > and 211 look different, surely it might be that Matthew's attitude
> > to triple tradition material differed from his attitude to
> > Sondergut.

Dave replied:

> > **** Doesn't that imply at least one source?
> > That was my point.

It comes down again to the question of expectations. The point I am
trying to make is: how do we know what the expectation should be
when comparing 211 (Matthew, diff. Mark, diff. Luke) to 200 (Matthean
Sondergut)? Let's say they differ: would that necessarily mean that
"Matthew was copying another source in one place, and creating on his
own in another"? No, not at all: this is one possible explanation
of the difference, but another would be that he treated different
source material in different ways (e.g. one might assume that M was
largely oral so was treated differently from Mark, or one might
assume that he liked M more than Mark, vice versa, etc. etc.). Once
again, I am struck by your confidence in inferring source-critical
conclusions from results that might be explained in other ways.

I had written:

> > And there are the questions Stephen raised about genre and the like.
> > Q theorists have long known, for example, that Matthew and Luke are
> > much closer together in Q material than they are in Markan material,
> > something that does not cause them to question their hypothesis.

Dave replied:

> > **** I don't think this is directly related to the question.

I was attempting again to draw attention to possible different
explanations for anti-correlations in the different types of
material. The anti-correlations will not necessarily be because of
different authorship but may be because of different attitudes to
source material depending on that material's origin, genre,
theological and literary character etc.

Dave had written:

> > The fact that 121 and 120 correlate does not say
> > anything surprising. It just confirms that the procedure is working.
> > (If 121 and 120 did not look related, we might question the test) But
> > when 122 and 112 look similar, we are saying something more
> > significant. We are saying that Mark/Luke agreement has a style that
> > matches Luke alone to some extent. This could be explained by Luke and
> > Mark copying a source and Mark altering it sometimes. The fact that
> > 122 looks like 112 and that 122 looks like 121, strongly suggests that
> > Mark and Luke independently copy and alter a source, so that both
> > look somewhat like the source.

I commented:

> Again, I'm not yet convinced of this. By definition, surely we would
> expect 122 to show a rough correlation with 112, and 122 with 121,
> and so on, because there is commonality in respective definitions of
> these categories. But even if this were not the case, I don't think
> your conclusion would follow from the data. It could be that when
> Luke reads Mark, he tends to take over the most congenial ("Luke-
> pleasing") words, the same words he tends to add himself where Mark
> is not directly parallel, and in this way 112's correlation with 122
> would be exactly what Luke's use of Mark would lead us to expect.

Dave replied:

> ****Let's say "hat" is a Luke-pleasing word. Luke would use it in text he
> wrote himself. (002) He likes to use it 10% of the time.
> But Mark's text only has "hat" 1% of the time.
> If he alters Mark's text he can make 112 look like 002.
> He can't make 122 look like 002.

On sentence one, without wanting to be pernickety, we don't know that
002 are texts that Luke wrote himself; so in this hypothetical
example, L might have liked the word "hat" very much and Luke's 10
per cent usage might be because of Luke's conservative treatment of L
(cf., for example Paffenroth's or Schurmann's views on L). But that
aside, whenever Luke takes over a word from Mark, i.e. in 122 and 222
material, Luke influences the way that the 122 and the 222 material
is constituted. Thus Mark overall (122, 221, 222, 020, 121 etc.)
might have "hat" one per cent of the time but when Luke takes over
all those usages of "hat", it forms a much higher proportion of the
122 and 222 material, perhaps 10 per cent of that material. And that
proportion of Luke-pleasing references to "hat" might be similar to
the proportion Luke actually introduces into other material he
authored, 112, 002 and the like, especially if Luke felt that there
were proportionally a good number of hats already present in 122 and
222.

As I say, some of these questions may be off the mark -- this
statistical stuff is far from my area of expertise but it captures my
interest so please forgive any dumbness here.

Thanks
Mark
-----------------------------
Dr Mark Goodacre mailto:M.S.Goodacre@...
Dept of Theology tel: +44 121 414 7512
University of Birmingham fax: +44 121 414 4381
Birmingham B15 2TT
United Kingdom

http://www.bham.ac.uk/theology/goodacre
Homepage
http://NTGateway.com
The New Testament Gateway

Synoptic-L Homepage: http://www.bham.ac.uk/theology/synoptic-l
List Owner: Synoptic-L-Owner@...
Message 3 of 3 , Dec 1, 2001
> Sure, I understand that, but what I am trying to get my head round is
> the fact that 221 and 222 (to stick to this example) by definition
> are very similar types of material and so, by definition, will end up
> producing similar results. We are talking here about words that are
> Matthew // Mark // Luke and words that are Matthew // Mark, diff.
> Luke. The common element in both is Matthew // Mark. Before we have
> even asked the question about authorship, traditions, etc., we would
> expect the similar profile of the section to produce a similar
> profile of results, wouldn't we? If not, can you explain to me why
> not?
Would I have expected 222 and 221 to look a lot alike? Yes.
That 222 looks nothing like expected is the one result I found very
surprising.
I sort of expected evidence of a proto-Mt and a proto-Mk/Lk, but I did not
expect 222 to behave as it does. On a Markan or Matthean priority
hypothesis 222 and 221 should be almost the same. But if Luke wrote first,
then 221 could be text altered by Mark and then copied by Matthew from Mark.
In that case I would probably see an anti-correlation between 222 and 221.
112 would look like 222.

I think the real structure must be more complicated, to produce the result
seen.

> This gets to my main concern with the project as you are currently
> expressing it. The move from correlations / anti-correlations
> between material to source-critical inferences is a bit too quick for
> me -- I'd like to see the assumptions behind the source critical
> inferences clearly spelled out. For example here, for the sake of
> argument, would we necessarily expect 121 and 221 to "look the same"?
> They are clearly not "just different samples of the same original
> text" in the sense of random samples. 221 is constituted by words
> that are Matthew // Mark, diff. Luke and 121 are words that are
> constituted by Matthew, diff. Mark, diff. Luke. The 221 words, on
> the assumption of Markan Priority, are the words Matthew finds
> congenial in Mark; the 121 words are the words both Matthew and Luke
> find uncongenial. In other words, it wouldn't necessarily be
> surprising to see a difference there. As it happens, they come out
> similarly, but I don't find that particularly striking because it's
> not necessarily what I would have expected. What I am interested to
> know is, how can you be so confident in your expectation of what
> Matthew's use of Mark would produce? I don't know that I would have
> the same expectation. [This example has the advantage of being
> uncontroversial to the extent that I accept Markan Priority so
> there's no worry I'm grinding an axe in disputing your expectation.]

Matthew could certainly cause 221 to move away from 121, by the method you
describe. In fact, since 121 looks a lot more like 120 than 121 looks like
221, I suspect something like what you describe. However, let's say Mark never
uses the word "above" (he likes "over", "overhead", or something else).
Then 221 and 121 will never have the word "above", and nothing Matthew can do
can change that. If "above" is a common word, the lack of it in 221 and 121
will make them tend to correlate, and Matthew cannot change that.
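
The "above" argument can be made concrete with a toy calculation. Everything in this sketch is invented for illustration: the four-word vocabulary, the frequencies, and the use of a plain Pearson correlation on frequencies adjusted by the unweighted average of the categories. It only shows the direction of the effect, not the procedure or the numbers of the actual study.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


WORDS = ["above", "over", "the", "and"]

# Hypothetical word frequencies per category (each row sums to 1).
# The two Markan categories never use "above"; the others do.
FREQ = {
    "221": [0.00, 0.06, 0.50, 0.44],
    "121": [0.00, 0.04, 0.52, 0.44],
    "200": [0.05, 0.01, 0.49, 0.45],
    "002": [0.05, 0.01, 0.51, 0.43],
}

# Adjust for general frequency: subtract the average over all categories.
average = [sum(column) / len(FREQ) for column in zip(*FREQ.values())]
adjusted = {
    name: [f - a for f, a in zip(freqs, average)] for name, freqs in FREQ.items()
}

r_221_121 = pearson(adjusted["221"], adjusted["121"])
r_221_200 = pearson(adjusted["221"], adjusted["200"])
print("221 vs 121:", round(r_221_121, 2))
print("221 vs 200:", round(r_221_200, 2))
```

In this toy setup the shared absence of "above" pulls 221 and 121 together (positive correlation) and pushes 221 away from 200 (negative correlation), whatever happens to the other words within reason.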

> It comes down again to the question of expectations. The point I am
> trying to make is: how do we know what the expectation should be
> when comparing 211 (Matthew, diff. Mark, diff. Luke) to 200 (Matthean
> Sondergut)? Let's say they differ: would that necessarily mean that
> "Matthew was copying another source in one place, and creating on his
> own in another"? No, not at all: this is one possible explanation
> of the difference, but another would be that he treated different
> source material in different ways (e.g. one might assume that M was
> largely oral so was treated differently from Mark, or one might
> assume that he liked M more than Mark, vice versa, etc. etc.). Once
> again, I am struck by your confidence in inferring source-critical
> conclusions from results that might be explained in other ways.

Yes, 211 being the result of one Matthean source and 200 being the result of
another is certainly possible. My only point here was that a Matthean
priority hypothesis has a problem with this.

> Dave had written:
>
> > > The fact that 121 and 120 correlate does not say
> > > anything surprising. It just confirms that the procedure is working.
> > > (If 121 and 120 did not look related, we might question the test) But
> > > when 122 and 112 look similar, we are saying something more
> > > significant. We are saying that Mark/Luke agreement has a style that
> > > matches Luke alone to some extent. This could be explained by Luke and
> > > Mark copying a source and Mark altering it sometimes. The fact that
> > > 122 looks like 112 and that 122 looks like 121, strongly suggests that
> > > Mark and Luke independently copy and alter a source, so that both
> > > look somewhat like the source.
>
> > Again, I'm not yet convinced of this. By definition, surely we would
> > expect 122 to show a rough correlation with 112, and 122 with 121,
> > and so on, because there is commonality in respective definitions of
> > these categories. But even if this were not the case, I don't think
> > your conclusion would follow from the data. It could be that when
> > Luke reads Mark, he tends to take over the most congenial ("Luke-
> > pleasing") words, the same words he tends to add himself where Mark
> > is not directly parallel, and in this way 112's correlation with 122
> > would be exactly what Luke's use of Mark would lead us to expect.
>
> > ****Let's say "hat" is a Luke-pleasing word. Luke would use it in text
> > he wrote himself. (002) He likes to use it 10% of the time.
> > But Mark's text only has "hat" 1% of the time.
> >
> > If he alters Mark's text he can make 112 look like 002.
> > He can't make 122 look like 002.
> On sentence one, without wanting to be pernickety, we don't know that
> 002 are texts that Luke wrote himself; so in this hypothetical
> example, L might have liked the word "hat" very much and Luke's 10
> per cent usage might be because of Luke's conservative treatment of L
> (cf., for example Paffenroth's or Schurmann's views on L). But that
> aside, whenever Luke takes over a word from Mark, i.e. in 122 and 222
> material, Luke influences the way that the 122 and the 222 material
> is constituted. Thus Mark overall (122, 221, 222, 020, 121 etc.)
> might have "hat" one per cent of the time but when Luke takes over
> all those usages of "hat", it forms a much higher proportion of the
> 122 and 222 material, perhaps 10 per cent of that material. And that
> proportion of Luke-pleasing references to "hat" might be similar to
> the proportion Luke actually introduces into other material he
> authored, 112, 002 and the like, especially if Luke felt that there
> were proportionally a good number of hats already present in 122 and
> 222.
If Luke uses Mark about half the time, and 122 and 121 have about
the same number of words, then if Mark had 1% "hat", and Luke kept
every "hat", then 122 will be 2% "hat", and 121 will be 0% "hat".
If Luke has 10% "hat", 122 still looks a lot more like 121 than 122 looks
like 112 or 002.

Let's go the other direction. Say Mark is 10% "cow". Luke hates cows. 122 is
0% cow, 002 is 0% cow, and 121 is now 20% cow. So, by removal of displeasing
words, Luke seems to be able to make 122 more Luke-like.
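
The "hat" and "cow" percentages follow from simple bookkeeping, sketched here in Python. The function name and parameters are made up for this illustration, and the sketch keeps the simplification used above: Mark's words are split evenly between 122 (kept by Luke) and 121 (changed by Luke), and the other Markan categories are ignored.

```python
def shares(mark_rate, kept_fraction, luke_use=0.5):
    """Return (share of the word in 122, share of the word in 121).

    mark_rate     -- the word's share of all of Mark's words
    kept_fraction -- fraction of Mark's occurrences of the word Luke keeps
    luke_use      -- fraction of Mark's words that end up in 122
                     (assumed 0.5, i.e. 122 and 121 are the same size)
    """
    in_122 = mark_rate * kept_fraction / luke_use
    in_121 = mark_rate * (1 - kept_fraction) / (1 - luke_use)
    return in_122, in_121


# "hat": 1% of Mark, and Luke keeps every occurrence.
hat_122, hat_121 = shares(mark_rate=0.01, kept_fraction=1.0)

# "cow": 10% of Mark, and Luke removes every occurrence.
cow_122, cow_121 = shares(mark_rate=0.10, kept_fraction=0.0)

# Luke's own rate for "hat" in 112/002, taken from the example above.
LUKE_HAT_RATE = 0.10
gap_to_121 = abs(hat_122 - hat_121)        # how far 122 is from 121
gap_to_luke = abs(hat_122 - LUKE_HAT_RATE) # how far 122 is from 112/002

print(f"hat: 122 = {hat_122:.0%}, 121 = {hat_121:.0%}")
print(f"cow: 122 = {cow_122:.0%}, 121 = {cow_121:.0%}")
print(f"hat gap to 121 = {gap_to_121:.2f}, gap to Luke = {gap_to_luke:.2f}")
```

For "hat", 122 ends up two points from 121 but eight points from Luke's own rate, which is the sense in which 122 still looks more like 121. For "cow", removal alone is enough to make 122 match Luke.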

But these are not representative of the majority of words. Some words with a
theological implication, we can understand Luke removing. But Luke is not
going to be removing basic units of speech, with real determination.

We know that Mark uses a smaller vocabulary. Luke cannot make 122 look like
it is sampled from a large vocabulary text.

Still, you're right, Luke can exert influence. I just doubt it is enough to
make 122 look so very much like 112 and 002. It seems much simpler to think
that 112 may sometimes reflect the same original source text 122 does.
If 112 and 122 looking similar does not indicate a common source,
then it is hard to see what evidence could be used to indicate an
older source.

But perhaps the real question is, why does
122 look like both 121 and 112, but 221 look only like 121 and not 211?

> As I say, some of these questions may be off the mark -- this
> statistical stuff is far from my area of expertise but it captures my
> interest so please forgive any dumbness here.

Well, I guess we're in the same boat, since we're both outside our area
here. :o)
But these are good questions. All the study told us is that some things
look alike and some look different. It's now our job to figure out why.

Dave Gentile
Riverside, Illinois
M.S. Physics
PhD Management Science candidate

