I have just posted this note to the genealogy-dna@... forum:

I have had an interesting insight that I am working on. I can derive a dated Y-DNA phylogenetic tree from a set of haplotypes of arbitrary length. For 37-marker haplotypes, a time scale derived from a large number of pedigrees gives the result that 10 RCC is about 433 years. The insight follows:

In our dated Y-DNA phylogenetic tree, if you count (along a line of constant RCC on the tree) the number of times that a descendant line is crossed, that number N is related to RCC by an exponential of the form N = K * e^(a*x), where:

• N is the number of times a descendant line is crossed at each value of RCC on the tree,

• K is the number of testees in the sample of haplotypes we use to form the tree,

• x is RCC (a time scale derived from over 100 testee pedigrees),

• e is 2.71828...., the base of the natural logarithm,

• and 'a' is a constant of the set. Let's call 'a' the "tree factor".

Call this relation, "the tree equation".

For our phylogenetic trees, 'a' is a negative number and is probably composed of factors that include:

• the average number of sons along the descendant lines,

• the average rate at which descendant lines die out,

• the average mutation rate in the set of testees,

• characteristics of the testee set chosen for the tree, etc.

The tree equation is not a perfect exponential. It has glitches in it, but the quality of the exponential relationship can be quite high, with values of R-squared (the coefficient of determination, i.e. the fraction of the variance explained by the fit) exceeding 0.9.
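As a rough illustration of how K, 'a' and R-squared could be estimated, here is a minimal sketch that fits a straight line to ln(N) against RCC by ordinary least squares. This is an assumption about the fitting method, not necessarily how the fits described above were made. The function name fit_tree_equation and the sample counts are hypothetical, and the counts are made up for illustration rather than taken from any real tree. Note that R-squared here is computed on ln(N), not on N itself.

```python
import math

def fit_tree_equation(rcc, counts):
    """Fit ln(N) = ln(K) + a*x by least squares; return (K, a, r_squared)."""
    logs = [math.log(n) for n in counts]
    n = len(rcc)
    mean_x = sum(rcc) / n
    mean_y = sum(logs) / n
    # Sums of squares and cross-products about the means
    sxx = sum((x - mean_x) ** 2 for x in rcc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(rcc, logs))
    syy = sum((y - mean_y) ** 2 for y in logs)
    a = sxy / sxx                    # slope = tree factor
    ln_k = mean_y - a * mean_x       # intercept = ln(K)
    r_squared = (sxy * sxy) / (sxx * syy)
    return math.exp(ln_k), a, r_squared

# Made-up counts for illustration only, NOT real tree data
rcc_values = [0, 10, 20, 30, 40, 50]
line_counts = [100, 62, 35, 23, 13, 8]

K, a, r2 = fit_tree_equation(rcc_values, line_counts)
print(f"K = {K:.1f}, a = {a:.4f}, R^2 = {r2:.3f}")
```

With a real tree, rcc_values would be the RCC levels at which the lines are counted and line_counts the number of descendant lines crossed at each level.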

An exponential relationship is not unexpected, since the growth of the world's population has also been roughly exponential.

This insight gives additional impetus to understanding what a well-defined set of inputs to the phylogenetic tree can tell us about the evolution of haplotypes and about the dates of origin of family surname clusters, SNPs, and subhaplogroups. That date of origin is where N=1 in the tree equation, and it can be found either from the equation or from an extrapolation of the graph of N vs. RCC from which the tree was derived!
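Setting N = 1 in the tree equation and solving for x gives x = -ln(K) / a, which is positive because 'a' is negative. A minimal sketch of that calculation follows. The function name origin_rcc and the example values of K and 'a' are hypothetical, chosen only to show the arithmetic. The conversion to years uses the figure of about 433 years per 10 RCC quoted above for 37-marker haplotypes.

```python
import math

# From the post: 10 RCC is about 433 years for 37-marker haplotypes
YEARS_PER_RCC = 43.3

def origin_rcc(K, a):
    """RCC at which N = K * e^(a*x) falls to 1, i.e. x = -ln(K) / a."""
    if K < 1 or a >= 0:
        raise ValueError("expected K >= 1 and a negative tree factor")
    return -math.log(K) / a

# Hypothetical values for illustration only, NOT from a real tree
K_example = 100
a_example = -0.05

x0 = origin_rcc(K_example, a_example)
print(f"origin at about {x0:.1f} RCC, roughly {x0 * YEARS_PER_RCC:.0f} years")
```

The same value should be reachable graphically by extending the fitted line on a plot of ln(N) against RCC until it reaches ln(N) = 0.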

===============================

Sincerely, Bill Howard