Concerning your question whether Byz is given undue weight, my answer is no. PAM analysis identifies six varieties of text, and once the Vulgate variety is dropped as manifestly secondary, each of the remaining five is given equal weight. All of the witnesses in (1) the Byzantine group are represented by one entity (Byz), all in (2) the Fam. 1 group are represented by one entity (minuscule 205), and so on for (3) the variety represented by arm, (4) that represented by B, and (5) that represented by it-ff-2. Representing a group of witnesses by a single text (the medoid) avoids the problem of giving too much weight to groups which have more members (e.g. Byz).
Some will say it is an error to give these five representatives equal weight when reconstructing the initial text. However, I prefer equal weights, treating each medoid as a primary rather than secondary witness to the initial text. If one of these varieties could be proven to be secondary (i.e. a synthesis of some combination of the others) then it would have to be dropped in the same way as the Vulgate variety. Many would say that the B (formerly called "Alexandrian") and it-ff-2 (formerly called "Western") varieties should be given more weight than the other three; however, as an initial hypothesis, I presume that all five have approximately equal merit as witnesses to the initial text.
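To make the equal-weight idea concrete, here is a minimal sketch of taking the most frequent reading across the five medoids, with one vote per medoid regardless of group size. The function names, the toy readings, and the rule of leaving ties unresolved are assumptions for illustration only; they are not the actual data or necessarily the procedure used to produce Rec-6-1.

```python
from collections import Counter

def most_frequent_reading(readings):
    """Return the most frequent defined reading, or None if there is a tie or no data.

    Each entry in `readings` is one medoid's reading at a variation unit;
    None marks a unit where that medoid is undefined (e.g. lacunose).
    Leaving ties unresolved is an assumption of this sketch.
    """
    counts = Counter(r for r in readings if r is not None)
    if not counts:
        return None
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]

def recover_text(medoids):
    """Apply the vote at every variation unit; each medoid counts once."""
    texts = list(medoids.values())
    n_units = len(texts[0])
    return [most_frequent_reading([t[i] for t in texts]) for i in range(n_units)]

# Hypothetical readings at four variation units (not real UBS4 data).
medoids = {
    "B":       [1, 2, 1, None],
    "Byz":     [2, 2, 1, 1],
    "it-ff-2": [1, 1, 2, 2],
    "arm":     [2, 2, 1, 1],
    "205":     [2, 1, 1, 1],
}

recovered = recover_text(medoids)
print(recovered)
```

Because Byz contributes exactly one row here, it carries the same weight as B or it-ff-2 no matter how many manuscripts belong to the Byzantine group.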
Concerning your question whether "better" texts are far from Rec-6-1 (i.e. the text established by taking the most frequent reading across group representatives), you have read the rankings correctly. The figures in parentheses are distances, so a smaller figure means a witness is closer to Rec-6-1: 205 is closest at a distance of 0.120 and it-k (Old Latin k) is farthest at a distance of 0.625. (The distances have no unit as they are ratios of pure numbers.)
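As a rough illustration of how a unitless distance of this kind can arise, the sketch below computes the proportion of variation units at which two texts disagree, counting only units where both are defined, and then ranks witnesses by that figure. I am assuming a simple-matching style of distance here; the function names and toy readings are hypothetical and are not the real data matrix.

```python
def simple_matching_distance(a, b):
    """Proportion of disagreements over units where both texts are defined.

    Returns None when the two texts share no defined units.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return None
    return sum(x != y for x, y in pairs) / len(pairs)

def rank_by_distance(reference, witnesses):
    """Return (name, distance) pairs sorted from nearest to farthest."""
    scored = [(name, simple_matching_distance(reference, text))
              for name, text in witnesses.items()]
    return sorted([(n, d) for n, d in scored if d is not None],
                  key=lambda pair: pair[1])

# Hypothetical reference text and witnesses (not real UBS4 data).
reference = [2, 2, 1, 1]
witnesses = {
    "X": [2, 2, 1, 2],     # disagrees at 1 of 4 units
    "Y": [1, 1, 1, 1],     # disagrees at 2 of 4 units
    "Z": [2, None, 1, 1],  # agrees at all 3 shared units
}

ranking = rank_by_distance(reference, witnesses)
print(ranking)
```

Each distance is a count of disagreements divided by a count of compared units, which is why it has no unit and always falls between 0 and 1.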
--- In email@example.com, Jake <bikerxtrash@...> wrote:
> I think your results are intriguing. I have one question for you though.
> It is my understanding that most of the manuscripts are Byz in nature.
> So by including all the variants, doesn't your methodology weight the
> results in favor of the Byz text? It struck me as particularly
> interesting that the conventional 'better' manuscripts are farther away
> in your rankings from the hypothetical original Rec-6-1 than Byz and the
> early versions. Or is one to interpret the rankings so that a greater
> distance indicates a better text?
> Jake Horner
> MDiv candidate, Pittsburgh Theological Seminary
> On 8/1/2013 6:13 AM, yennifmit wrote:
> > Hi Everyone,
> > It occurred to me that I should drop the Vulgate from the group
> > representatives used to recover the initial text because it is known
> > to be secondary.
> > Using PAM to divide the UBS4 data set for Mark into six groups
> > identifies these medoids: B, Byz, it-ff-2, arm, vg, 205. Dropping vg
> > and finding the most frequent reading across the five remaining
> > medoids recovers the text labelled Rec-6-1 in this table:
> > http://www.tfinney.net/Views/data/Mark-UBS4-Rec-6-1.csv
> > MVA of the UBS4-Mark data matrix with Rec-6-1 added produces these
> > CMDS and DC results:
> > http://www.tfinney.net/Views/cmds/Mark-UBS4-Rec-1.15.SMD.gif
> > http://www.tfinney.net/Views/dc/Mark-UBS4-Rec-1.15.SMD.png
> > The recovered text is located near Families 1 and 13 in the CMDS map
> > and is in the same branch as Families 1 and 13 in the DC dendrogram.
> > The MSW plot suggests 5- and 10-way partitions. A 5-way partition puts
> > Rec-6-1 in this group: W, Theta, f-1, f-13, 28, 205, 565, 700, syr-s,
> > arm, geo. However, it does not fit well as indicated by a negative
> > value of a statistic called the silhouette width. A 10-way partition
> > places Rec-6-1 in this group: f-1, f-13, 28, 205, 700, syr-s, arm,
> > geo. In this case the sil. width for Rec-6-1 is positive, indicating a
> > better fit.
> > Ranking witnesses by distance from Rec-6-1 produces this list:
> > 205 (0.120); f-1 (0.149); Lect (0.208); 597 (0.226); Byz (0.227); G
> > (0.232); 1292 (0.235); 180 (0.239); 1243 (0.239); 1010 (0.246); 1006
> > (0.248); slav (0.248); 28 (0.256); 700 (0.256); E (0.256); 1505
> > (0.259); 157 (0.268); 1424 (0.270); H (0.270); 1241 (0.274); F
> > (0.274); Sigma (0.286); vg (0.287); arm (0.290); A (0.291); geo
> > (0.291); f-13 (0.299); 1071 (0.299); syr-h (0.302); it-q (0.318); 33
> > (0.322); it-l (0.324); syr-p (0.330); 565 (0.353); it-aur (0.360);
> > Augustine (0.364*); N (0.370*); 1342 (0.371); syr-s (0.373*); 579
> > (0.379); Theta (0.398*); eth (0.408*); syr-pal (0.409*); it-f
> > (0.411*); 892 (0.414*); cop-bo (0.434*); it-ff-2 (0.435*); C (0.438*);
> > L (0.438*); it-i (0.447*); Delta (0.448*); W (0.452*); cop-sa
> > (0.453*); 2427 (0.477*); it-c (0.491*); it-b (0.495*); UBS (0.496*);
> > Psi (0.500*); it-r-1 (0.526*); B (0.558); Aleph (0.564); it-a (0.586);
> > it-d (0.591); D (0.602); it-k (0.625)
> > So, when the Vulgate is dropped from medoids identified by a 6-way
> > partition of the UBS4-Mark data set and a text is recovered by taking
> > the most frequent reading of the remaining five medoids, CMDS, DC,
> > PAM, and ranking results indicate that the recovered text is in the
> > vicinity of Family 1.
> > Best,
> > Tim Finney