Reza,

If you are just asking why n-1 appears in the formula commonly found in statistics books for computing the sample variance s^2, it is there so that s^2 is an unbiased estimate of the population variance; any good calculus-based probability and statistics book covers this.
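For readers without such a book at hand, the standard argument can be sketched as follows. The notation is an assumption of this sketch, not taken from the message: x_1, ..., x_n are independent observations with mean \mu and variance \sigma^2, and \bar{x} is their sample mean, so that Var(\bar{x}) = \sigma^2 / n.

```latex
\begin{align*}
\sum_{i=1}^{n}(x_i-\bar{x})^2
  &= \sum_{i=1}^{n}(x_i-\mu)^2 - n(\bar{x}-\mu)^2, \\
E\!\left[\sum_{i=1}^{n}(x_i-\bar{x})^2\right]
  &= n\sigma^2 - n\cdot\frac{\sigma^2}{n} = (n-1)\sigma^2, \\
E[s^2]
  &= E\!\left[\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2\right] = \sigma^2 .
\end{align*}
```

Measuring deviations from \bar{x} instead of the unknown \mu removes, on average, one \sigma^2 from the sum of squares, which is what dividing by n-1 compensates for.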

Other estimation methods (e.g., maximum likelihood) divide by n instead of n-1.

Oh, while the n-1 does make the sample variance s^2 an unbiased estimate of the population variance sigma^2, taking the square root to get the sample standard deviation s does not give an unbiased estimate of the population standard deviation sigma. That is another reason some prefer the m.l.e.
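The three statements above can be checked with a small simulation. This is only a minimal sketch: the function name `simulate`, the normal population, and the values n = 5 and sigma = 2 are illustrative assumptions, not part of the original message.

```python
import random

def simulate(n=5, sigma=2.0, trials=40_000, seed=0):
    """Average three estimators over many samples of size n from N(0, sigma^2)."""
    rng = random.Random(seed)
    sum_var_n1 = sum_var_n = sum_sd_n1 = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, sigma) for _ in range(n)]
        m = sum(x) / n
        ss = sum((xi - m) ** 2 for xi in x)   # sum of squared deviations
        sum_var_n1 += ss / (n - 1)            # s^2, divisor n-1
        sum_var_n += ss / n                   # divisor n (m.l.e. under normality)
        sum_sd_n1 += (ss / (n - 1)) ** 0.5    # s, the square root of s^2
    return sum_var_n1 / trials, sum_var_n / trials, sum_sd_n1 / trials

mean_s2, mean_mle, mean_s = simulate()
print(mean_s2, mean_mle, mean_s)
```

With sigma^2 = 4, the average of s^2 should land close to 4, the divide-by-n average close to 4(n-1)/n = 3.2, and the average of s somewhat below sigma = 2, illustrating that s underestimates sigma on average even though s^2 is unbiased.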

Best,

Bill

--

William V Harper, Mathematical Sciences

Otterbein College, Towers Hall 139, 1 Otterbein College

Westerville OH 43081-2006 USA

Office phone: 614-823-1417 Office Fax 614-823-3201

Faculty page: http://www.otterbein.edu/home/fac/WLLVHRPR

For the best in geostatistics: http://geoecosse.hypermart.net/

**From:** Reza Nazarian [mailto:rnazarian@...]

**Sent:** Thursday, August 25, 2005 3:23 PM

**To:** ai-geostats@...

**Subject:** [ai-geostats] Why degree of freedom is n-1

Dear Experts

Sorry, maybe the question is too basic. After searching my statistics books for an answer with no great success, could you please explain to me why we take the degrees of freedom to be **n-1** when calculating the variance? Thanks for your kind advice.

Very Best Regards

Reza Nazarian

Schlumberger Information Solutions

SONILS Oil Services Centre, Porto de Luanda , Angola

(Via UK : +44 (0)207 576 6306

* rnazarian@...

http://www.sis.slb.com

This follow-up is slightly aside from the subject line of the mailing list, but

as a geologist, this is the only statistically-flavoured one I am

subscribed to. Therefore :

Federico Pardo <federico.pardo@...> said:

> Having N samples, and then n degrees of freedom.

> One degree of freedom is used (or taken) by the mean calculation.

> Then when you calculate the variance or the standard deviation, you only

> have left n-1 degrees of freedom.

Apart from the rigorous calculation that, as far as I am aware, in this very case (cf.

Peter Bossew's contribution on the same thread, which details it) gives a

proof for this rule of thumb, what more or less rigorous statistical

developments give consistency to it?

I mean, for the empirical correlation coefficient,

rhoXiYi = SUM_i=1..N( (x_i - mx).(y_i - my) / sx / sy ) / WHAT_NUMBER

Must WHAT_NUMBER be, for a kind of unbiased estimate ("a kind of" meaning

"possibly with some Fisher z-transform"...):

* N for simplicity,

* N-2 as I have most frequently seen in books that dare give this formula

(N points, minus 1 for position and 1 for dispersion ?),

* or 2N-4 -- 2N for the (x_i,y_i), minus 4 for {mx,my,sx,sy} -- as a

strict application of the rule-of-thumb seems to suggest ?
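One point that bears on the list above: in the usual Pearson coefficient the divisor cancels, provided the same divisor is used for the covariance and for sx and sy. So WHAT_NUMBER is N when sx and sy are computed with N, and N-1 when they are computed with N-1. N-2 normally appears as the degrees of freedom of the t-test on r, not as a divisor inside r. A minimal sketch, where the helper `pearson_r`, its `ddof` argument, and the synthetic data are all assumptions for illustration:

```python
import random

def pearson_r(x, y, ddof):
    """Correlation using the same divisor (n - ddof) for covariance and variances."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    d = n - ddof
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / d
    sx = (sum((a - mx) ** 2 for a in x) / d) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / d) ** 0.5
    return cov / (sx * sy)

rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(20)]           # synthetic x values
y = [0.5 * a + rng.gauss(0.0, 1.0) for a in x]         # y correlated with x

r_n = pearson_r(x, y, ddof=0)    # everything divided by N
r_n1 = pearson_r(x, y, ddof=1)   # everything divided by N-1
print(r_n, r_n1)                 # the two values agree up to rounding
```

Either way, r is not an exactly unbiased estimate of the population correlation for finite N; the choice of divisor does not change that.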

And what about, when fitting for instance a 3-parameter non-linear

function, reducing the number of degrees of freedom, to N-3 (number of

points, minus one for each function parameter)? I have never read any kind

of explanation to support it, though it seems widely

Thanks in advance for any enlightenment, or simply for pointers to other sources of

explanation.
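On the N-3 question, a minimal sketch may help, under a loudly stated substitution: it uses a 3-parameter model that is linear in its parameters (a quadratic), not a non-linear one. For such a model with independent noise of variance sigma^2, the expected residual sum of squares is exactly (N-3) sigma^2; for a genuinely non-linear function the same divisor is usually justified only approximately, by linearising around the fitted parameters. The function name `mean_rss`, the grid, and the coefficients are illustrative assumptions.

```python
import random

def mean_rss(n=10, sigma=1.0, trials=20_000, seed=2):
    """Average residual sum of squares after a 3-parameter least-squares fit."""
    rng = random.Random(seed)
    x = [i - (n - 1) / 2 for i in range(n)]        # grid symmetric about 0
    c = sum(v * v for v in x) / n
    # On a symmetric grid, 1, x and x^2 - c are mutually orthogonal and
    # span all quadratics, so least squares reduces to three projections.
    basis = [[1.0] * n, x, [v * v - c for v in x]]
    norms = [sum(b * b for b in phi) for phi in basis]
    total = 0.0
    for _ in range(trials):
        # True quadratic (arbitrary coefficients) plus Gaussian noise.
        y = [1.0 + 0.5 * v - 0.2 * v * v + rng.gauss(0.0, sigma) for v in x]
        coef = [sum(yi * b for yi, b in zip(y, phi)) / nrm
                for phi, nrm in zip(basis, norms)]
        fit = [sum(cf * phi[i] for cf, phi in zip(coef, basis)) for i in range(n)]
        total += sum((yi - fi) ** 2 for yi, fi in zip(y, fit))
    return total / trials

avg_rss = mean_rss()
print(avg_rss)   # should be near (N - 3) * sigma^2 = 7, not N * sigma^2 = 10
```

Dividing the residual sum of squares by N-3 therefore gives an estimate of sigma^2 that is right on average in this linear case, while dividing by N gives one that is too small.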

-- Éric L.

Dear Reza

I was away from my office for quite a while. Going through my folder, I

came across your enquiry. I thought it might be helpful to share the following

thoughts with you and other colleagues on the list.

I prefer to approach your question from another angle.

First, one has to acknowledge that almost all measurements are

corrupted by noise in one way or another. Furthermore, the standard deviation is a

measure of uncertainty in measurement. Now, keeping these points in mind, look

at the formula for calculating the standard deviation (or, for that matter, the

variance) when you have only ONE measurement. If you use

the formula with n in the denominator, you get 0 for the standard

deviation, implying that your single measurement is exact and not corrupted by

noise, which is not true. On the other hand, the formula with n-1 in the

denominator gives 0/0, which is indeterminate and more compatible with

the preliminary propositions mentioned above.
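The single-measurement case can be seen directly with Python's standard `statistics` module, where `pvariance` divides by n and `variance` divides by n-1. This is a minimal sketch; the value 12.7 is a hypothetical measurement.

```python
import statistics

single = [12.7]  # one hypothetical measurement

# Divisor n: the only deviation from the mean is 0, so the result is 0/1 = 0.
var_n = statistics.pvariance(single)

# Divisor n-1: 0/0 is undefined, and the library refuses to return a number.
try:
    var_n1 = statistics.variance(single)
except statistics.StatisticsError:
    var_n1 = None  # no estimate is available from a single point

print(var_n, var_n1)
```

The divide-by-n formula reports zero spread, while the n-1 version declines to give an answer, which matches the view that one measurement carries no information about its own uncertainty.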

Another useful question might be the origin of that equation, which has

something to do with the Normal probability distribution. The first chapter of

"Nonlinear Parameter Estimation" by Bard (1974) might be useful to refer

to, as he resorts to entropy to derive the Normal distribution and its

associated parameters.

Hope this helps.

Thanks

Abedini
