## [ai-geostats] Re: Why degree of freedom is n-1

Message 1 of 1 , Aug 31, 2005
Hi Eric

What complications! You should find, in any basic text on statistical inference, that the sum in the correlation is divided by (n-1) and that the correlation has (n-2) degrees of freedom.

The logic behind this is that the correlation is actually calculated as the covariance divided by the two standard deviations.

The covariance is calculated from n PAIRS of samples, not 2n individual observations, and has (n-1) degrees of freedom because it uses the pair of means (m1,m2) as its centroid.

Dividing by the pair (s1,s2) loses you the other degree of freedom. Tests on the correlation have (n-2) degrees of freedom.
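
As a rough illustration of the points above, here is a minimal Python sketch that computes the correlation as the covariance divided by the two standard deviations, and forms the usual t statistic for a zero-correlation test on (n-2) degrees of freedom. The function names, the `ddof` argument and the data values are made up for this example; they are not taken from any particular package.

```python
import math

def pearson_r(xs, ys, ddof=1):
    """Correlation as covariance / (s1 * s2), every term using divisor n - ddof."""
    n = len(xs)
    m1 = sum(xs) / n          # mean of the first variable
    m2 = sum(ys) / n          # mean of the second variable
    div = n - ddof
    cov = sum((x - m1) * (y - m2) for x, y in zip(xs, ys)) / div
    s1 = math.sqrt(sum((x - m1) ** 2 for x in xs) / div)
    s2 = math.sqrt(sum((y - m2) ** 2 for y in ys) / div)
    return cov / (s1 * s2)

def t_statistic(r, n):
    """t statistic for testing zero correlation, referred to Student's t with n - 2 df."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r)), df

# Illustrative (invented) data: n = 6 pairs
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]

r = pearson_r(xs, ys)            # divisor n - 1 throughout
r_n = pearson_r(xs, ys, ddof=0)  # divisor n throughout
t, df = t_statistic(r, len(xs))
```

One thing the sketch makes visible: as long as the covariance and both standard deviations use the same divisor, that divisor cancels, so `r` and `r_n` agree up to rounding. The (n-2) only matters when the correlation is tested.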

If you use (say) a regression relationship with 'k' coefficients including the constraint of the means, you lose k degrees of freedom. Any book which deals with 'Analysis of variance' will explain this for you. We use exactly this approach for testing a trend surface (see free tutorial at http://geoecosse.bizland.com/softwares or download my SNARK (1977) paper from http://uk.geocities.com/drisobelclark/resume).
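
To make the 'lose k degrees of freedom' rule concrete, the following is a small sketch for the simplest case, a straight line with k = 2 coefficients (intercept and slope). The residual sum of squares is divided by (n-k) to estimate the residual variance. The helper names and the five data points are invented for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line y = a + b*x (k = 2 coefficients)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx   # the line passes through the pair of means
    return a, b

def residual_variance(xs, ys, k=2):
    """Residual sum of squares divided by the remaining n - k degrees of freedom."""
    a, b = fit_line(xs, ys)
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    df = len(xs) - k
    return sse / df, df

# Illustrative (invented) data: n = 5 points
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]

a, b = fit_line(xs, ys)
s2, df = residual_variance(xs, ys)
```

With these five points the fitted line is y = 1.4 + 0.8x, the residual sum of squares is 3.6, and it is spread over 5 - 2 = 3 degrees of freedom.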

Hope this helps.
Isobel

Eric.Lewin@... wrote:
This follow-up is slightly off the topic of this mailing list, but
as a geologist, this is the only statistically-flavoured list I am
subscribed to. Therefore:

Federico Pardo said:
> Having N samples, and then n degrees of freedom.
> One degree of freedom is used (or taken) by the mean calculation.
> Then when you calculate the variance or the standard deviation, you only
> have left n-1 degrees of freedom.
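
The rule quoted above can be checked exactly on a toy case. The sketch below assumes independent draws (sampling with replacement) from a two-value population {0, 1}, lists every possible sample of size n = 3, and averages the sum of squared deviations from the sample mean divided by either n-1 or n. The population, sample size and function name are chosen purely for illustration.

```python
from fractions import Fraction
from itertools import product

def mean_sample_variance(population, n, divisor):
    """Average, over all equally likely samples of size n (with replacement),
    of the sum of squared deviations from the sample mean divided by `divisor`."""
    total = Fraction(0)
    count = 0
    for sample in product(population, repeat=n):
        m = Fraction(sum(sample), n)                      # sample mean
        ss = sum((Fraction(x) - m) ** 2 for x in sample)  # squared deviations
        total += ss / divisor
        count += 1
    return total / count

population = [0, 1]   # true variance is 1/4
n = 3

unbiased = mean_sample_variance(population, n, n - 1)  # Fraction(1, 4)
biased = mean_sample_variance(population, n, n)        # Fraction(1, 6)
```

Dividing by n-1 recovers the population variance of 1/4 on average, while dividing by n gives 1/6, too small by the factor (n-1)/n.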

Apart from the rigorous calculation I am aware of, which in this very case
(cf. Peter Bossew's contribution to the same thread, which details it) gives a
proof of this rule of thumb, what more or less rigorous statistical
developments give consistency to it?

I mean, for the empirical correlation coefficient,
rhoXiYi = SUM_i=1..N( (x_i - mx).(y_i - my) / sx / sy ) / WHAT_NUMBER
For a kind of unbiased estimate ("a kind of" meaning "possibly with some
Fisher z-transform"...), must WHAT_NUMBER be:
* N for simplicity,
* N-2 as I have most frequently seen in books that dare give this formula
(N points, minus 1 for position and 1 for dispersion ?),
* or 2N-4 -- 2N for the (x_i,y_i), minus 4 for {mx,my,sx,sy} -- as a
strict application of the rule-of-thumb seems to suggest ?

And what about reducing the number of degrees of freedom to N-3 when
fitting, for instance, a 3-parameter non-linear function (number of
points, minus one for each function parameter)? I have never read any kind
of explanation to support it, though it seems widely

Thanks in advance for any enlightenment, or simply for pointers to other
sources of explanation.
-- Éric L.
