AIGEOSTATS: Spatial statistics and GIS is a Public Group with 159 members.
RE: [aigeostats] F and T-test for samples drawn from the same p
On 03-Dec-04 Colin Badenhorst wrote:
> Hello everyone,
>
> I have two groups of several thousand samples analysed
> for various elements, and wish to determine if these
> samples are drawn from the same statistical population
> for later variography studies. I propose to test the two
> groups by using an F-test to test the sample variances,
> and a t-test to test the group means, at a given confidence limit.
>
> Before I do this, I wonder how I would interpret the results
> of the test if, for example:
>
> 1. The F-test suggests no significant statistical difference
> between the variances at a 90% confidence limit, BUT
> 2. The t-test suggests a significant statistical difference
> between the means at the same, or lower confidence limit.
>
> Has anyone come across this scenario before and how are they
> interpreted?
On the face of it, the scenario you describe corresponds to
a standard t-test (which involves an assumption that the
variances of the two populations do not differ), though I'm
not sure what you mean in (2) by significant "at the same,
or lower confidence limit." (Do I take it that in (1) you
mean that the P-value for the F-test is 0.1 or less?)
However, if you get a significant difference between the variances
in (1), then it may be unwise to use the standard
t-test (depending on how different they are). A modified
version, such as the Welch test, should be used instead.
There is an issue with interpreting the results where the
samples have initially been screened by one test, before
another one is applied, since the sampling distribution
of the second test, conditional on the outcome of the
first, may not be the same as the sampling distribution of
the second test on its own. However, I feel inclined to
guess that this may not make any important difference
in your case.
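A minimal sketch of the statistics discussed above, in Python with only the standard library. The function names and the two small samples are hypothetical, chosen for illustration; they are not Colin's data. It computes the variance ratio used by the F-test, the pooled (equal-variance) t statistic, and the Welch statistic with its Welch-Satterthwaite degrees of freedom.

```python
import math
from statistics import fmean, variance


def variance_ratio(x, y):
    """Ratio of sample variances, the statistic behind the F-test."""
    return variance(x) / variance(y)


def pooled_t(x, y):
    """Standard two-sample t statistic, assuming equal population variances."""
    nx, ny = len(x), len(y)
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    t = (fmean(x) - fmean(y)) / math.sqrt(sp2 * (1.0 / nx + 1.0 / ny))
    return t, nx + ny - 2


def welch_t(x, y):
    """Welch t statistic with Welch-Satterthwaite degrees of freedom."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (fmean(x) - fmean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


# Hypothetical grades for two small groups (illustration only).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]

f_ratio = variance_ratio(x, y)
t_pooled, df_pooled = pooled_t(x, y)
t_welch, df_welch = welch_t(x, y)
print(f_ratio, t_pooled, df_pooled, t_welch, df_welch)
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ. Turning either statistic into a P-value needs the t distribution, which the standard library does not provide. Both statistics still assume independent samples.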
Hoping this helps,
Ted.

E-Mail: (Ted Harding) <Ted.Harding@...>
Fax-to-email: +44 (0)870 094 0861 [NB: New number!]
Date: 03-Dec-04 Time: 14:15:09
XFMail
Hi Ted,
Thanks for your reply. I suspect my original query was too vague, so I will
illustrate it with a practical example here.
I have an ore horizon that splits into two separate horizons. One of these
split horizons has a lower average grade, and the other has a higher average
grade. I need to determine whether I should treat these two horizons as
separate entities during grade estimation. My geological observations tell
me that these two horizons derive from the same source, and on the face of
it are not different from one another in terms of mineral content and
genesis. I aim to back it up by proving, or attempting to prove, that
statistically these two horizons are the same, and can be treated as such as
far as grade estimation goes. Because the mean grades vary between the two,
I suspect that the t-test might fail, but I also suspect that the variance
in grade between the two might be very similar, and thus the F-test will
pass. Now I have a problem: a t-test tells me the populations differ
statistically, but the F-test tells me they don't.
The confidence limit I refer to in (2), by the way, is the alpha value used to
determine the confidence level for the test - I am using Excel to do the
test.
Thanks,
Colin
Standard t-tests make two assumptions: 1. both data sets are normally
distributed; 2. they have approximately equal variance. Test these
assumptions before applying a t-test. Violate these assumptions at your
own risk. If you fail either assumption, you need to consider your
options, but probably should not use a plain-vanilla t-test. You could
possibly use a data transform to "fix" the first assumption. You might
have to use a modified t-test (such as Satterthwaite's modification). Or
you might consider a nonparametric approach, such as the Mann-Whitney
U-test.
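As an illustration of the nonparametric option, here is a small standard-library Python sketch of the Mann-Whitney U statistic with a large-sample normal approximation. The function name and the sample values are hypothetical, and the approximation omits the tie correction to the variance, so treat it as a sketch rather than a reference implementation.

```python
import math
from statistics import NormalDist


def mann_whitney_u(x, y):
    """U statistic for x (ties count one half), plus normal-approximation z and p."""
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    nx, ny = len(x), len(y)
    u = rank_sum_x - nx * (nx + 1) / 2.0
    mean_u = nx * ny / 2.0
    sd_u = math.sqrt(nx * ny * (nx + ny + 1) / 12.0)  # assumption: no tie correction
    z = (u - mean_u) / sd_u
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return u, z, p


# Hypothetical grades (illustration only).
low = [1.2, 0.8, 1.5, 1.1, 0.9, 1.3]
high = [1.9, 2.4, 1.6, 2.1, 2.8, 1.7]
u_stat, z_stat, p_value = mann_whitney_u(low, high)
print(u_stat, z_stat, p_value)
```

Like the t-test, this test assumes the observations are independent.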
Tim Glover
Senior Environmental Scientist - Geochemistry
Geoenvironmental Department
MACTEC Engineering and Consulting, Inc.
Kennesaw, Georgia, USA
Office 7704213310
Fax 7704213486
Email ntglover@...
Web www.mactec.com
There is one other very important assumption about these standard statistical tests - namely that the samples are independent. This typically removes a large part of the usability of basic tests unless corrected for spatial variables. It is most likely the case that your samples within each horizon are not independent (unless the variogram has got zero range) so your typical tests cannot be used. They will tend to give pessimistic results - in other words you will tend to find differences in means when none exists. So, these types of tests don't apply directly.
I don't know if there has been much work on trying to provide 'rigorous' methods (but given that it is impossible to give a statistical test that shows if a random function is stationary or not (Matheron, 'Estimating and Choosing') then I guess the results would not be completely rigorous). You may be able to get an intuitive feel for the likely difference in means by trying to see how many quasi-independent points you have got. You could guesstimate this by assuming that points separated by more than a variogram range are independent, seeing how many such 'range units' you have got, and using this as the number of 'samples' (actually, you may be better off working with an integral range). But if you have any trends in the data then you will not get reliable estimates of the two means and so cannot 'prove' that the samples come from the same random function - even if they do.
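The 'range units' guesstimate can be sketched in a few lines of Python. This is a rough one-dimensional illustration only: the function names, the sampled extent, the variogram range and the standard deviation are all hypothetical numbers, and the integral range mentioned above is not used.

```python
import math


def effective_count(n_samples, extent, variogram_range):
    """Rough number of quasi-independent samples along a line of given extent."""
    if variogram_range <= 0:
        return n_samples  # zero range: every sample counts as independent
    units = max(1, int(extent // variogram_range))
    return min(n_samples, units)


def standard_error(sd, n):
    """Standard error of a mean computed from n independent values."""
    return sd / math.sqrt(n)


# Hypothetical case: 3000 samples along 1500 m with a 100 m variogram range.
n_eff = effective_count(3000, 1500.0, 100.0)
se_naive = standard_error(1.0, 3000)
se_effective = standard_error(1.0, n_eff)
print(n_eff, se_naive, se_effective)
```

The point of the comparison is only that the standard error based on the range units is much larger than the one based on the raw sample count, so an apparent difference in means is far less convincing than the naive test suggests.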
Regards
Colin Daly
Colin (Daly) is exactly correct. The spatial dependence is the main issue here when you use the t-test for spatial data. You might be able to transform your data for normality or even homogeneity, but the dependence is still there.
In this case, you need to incorporate the spatial dependence (described by the variogram) into the t-test. Try generalized least squares for a likelihood approach.
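A minimal sketch of the generalized least squares idea, in standard-library Python, for the simplest case of estimating one mean along a line. Everything here is an assumption for illustration: the exponential covariance model with a known sill and practical range, the placeholder data, and the helper names. In practice the covariance would come from the fitted variogram.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def exp_cov(coords, sill, practical_range):
    """Assumed exponential covariance C(h) = sill * exp(-3 h / practical_range)."""
    return [[sill * math.exp(-3.0 * abs(xi - xj) / practical_range)
             for xj in coords] for xi in coords]


def gls_mean(values, cov):
    """GLS mean (1' C^-1 z) / (1' C^-1 1) and its standard error."""
    w = solve(cov, [1.0] * len(values))  # w = C^-1 1 (C is symmetric)
    total = sum(w)
    mean = sum(wi * zi for wi, zi in zip(w, values)) / total
    return mean, math.sqrt(1.0 / total)


# Placeholder data: 50 samples at unit spacing (illustration only).
coords = [float(i) for i in range(50)]
values = [math.sin(0.3 * c) for c in coords]
cov = exp_cov(coords, sill=1.0, practical_range=10.0)
mean_gls, se_gls = gls_mean(values, cov)
se_naive = math.sqrt(1.0 / len(values))  # sqrt(sill / n), valid only under independence
print(mean_gls, se_gls, se_naive)
```

Comparing two horizons would use the same machinery with one indicator column per horizon in the design matrix; the sketch stops at a single mean to show how the spatial covariance inflates the standard error relative to s/sqrt(n).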
Din Chen
Colin
You need to bear in mind that statistical tests such
as t and F are only testing a very simple hypothesis -
they do not test whether the samples are from the same
population.
The F test is to check whether the standard deviations
differ. If the ore is from the same genesis, it is
likely that the variability will be constant and your
F test will not be significant.
The t test is of the hypothesis that the average
values are the same, against the alternative that one
population has a higher average grade than the other. You can have the
same variability around the mean, but have a zone
where the minerals tend to concentrate at a higher
average.
Even if both tests are not significant, this does not
'prove' that the two populations are the same. You
could have two sets of data with the same mean and
standard deviation and completely different shapes,
for example.
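A small constructed example of that last point, in standard-library Python. The two data sets are artificial (not grades) and the helper name is hypothetical; they are built to share the same mean and standard deviation while having different shapes, which shows up here in the kurtosis.

```python
import math
from statistics import fmean, pstdev


def excess_kurtosis(data):
    """Fourth standardized moment minus 3 (zero for a Normal distribution)."""
    m, s = fmean(data), pstdev(data)
    return sum(((v - m) / s) ** 4 for v in data) / len(data) - 3.0


r = math.sqrt(2.0)
set_a = [-1.0, 1.0] * 500          # all mass at two points
set_b = [-r, 0.0, 0.0, r] * 250    # half the mass at zero, the rest further out

for name, data in (("a", set_a), ("b", set_b)):
    print(name, fmean(data), pstdev(data), excess_kurtosis(data))
```

Neither a t test nor an F test can separate these two sets, since the means and standard deviations agree, yet the distributions are clearly not the same.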
To include the spatial element, you could try a cross
validation approach where one set of samples is the
'actual' values and you try to estimate those from the
other set. This will show up consistent differences in
average between the two as well as differences in
variability.
Strictly, all of the above requires a Normal
distribution but with your not-too-skewed data and
thousands of samples, the Central Limit Theorem should
take care of those problems.
Isobel
http://uk.geocities.com/drisobelclark
Dear all,
I'm wondering if sample size (number of samples, n) is playing a role here.
Since Colin is using Excel to analyse several thousand samples, I have checked the functions of t-tests in Excel. In the Data Analysis Tools help, a function is provided for the "t-Test: Two-Sample Assuming Unequal Variances" analysis. This function is the same as those from many text books (there are other forms of the function). Unfortunately, I cannot find the function for "assuming equal variances" in Excel, but I assume they are similar, and should be the same as those from some text books.
From the function, you can find that when the sample size is large you tend to get a large t value. When the sample size is large enough, even slight differences between the mean values of two data sets (x bar and y bar) can be detected, and this will result in rejection of the null hypothesis. This is in fact quite reasonable. When the sample size is large, you are confident in the mean values (Central Limit Theorem), with a very small standard error (s/sqrt(n)). Therefore, you are confident in detecting the differences between the two data sets. Even though there is only a slight difference, you can still say, yes, they are "significantly" different.
If you still remember, some time ago we had a discussion on the large sample size problem for tests for normality. When the sample size is large enough, the result can always be expected (for real data sets), that is, rejection of the null hypothesis.
Cheers,
Chaosheng
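The effect of n on the test statistic can be worked through with a few lines of standard-library Python. The numbers are hypothetical (a fixed mean difference of 0.02 and a common standard deviation of 1), the function name is illustrative, a normal approximation stands in for the t distribution, and the samples are assumed independent.

```python
import math
from statistics import NormalDist


def two_sample_z(diff, sd, n):
    """Statistic and two-sided p-value for two groups of size n with a common sd."""
    se = sd * math.sqrt(2.0 / n)  # standard error of the difference in means
    z = diff / se
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


results = {}
for n in (100, 10_000, 1_000_000):
    results[n] = two_sample_z(diff=0.02, sd=1.0, n=n)
    print(n, results[n])
```

The same small difference moves from clearly non-significant to overwhelmingly significant purely because the standard error shrinks as 1/sqrt(n).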
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +35391524411 x 2375
Direct Tel: +3539149 2375
Fax: +35391525700
Email: Chaosheng.Zhang@...
Web 1: www.nuigalway.ie/geography/zhang.html
Web 2: www.nuigalway.ie/geography/gis/index.htm
Original Message
From: "Isobel Clark" <drisobelclark@...>
To: "Donald E. Myers" <myers@...>
Cc: "Colin Badenhorst" <CBadenhorst@...>; <aigeostats@...>
Sent: Saturday, December 04, 2004 11:49 AM
Subject: [aigeostats] F and T-test for samples drawn from the same p
> Don
>
> Thank you for the extended clarification of F and t
> hypothesis test. For those unfamiliar with the
> concept, it is worth noting that the F test for
> multiple means may be more familiar under the title
> "Analysis of variance".
>
> My own brief answer was in the context of Colin's
> question, where it was quite clear that he was talking
> about the simplest F variance-ratio and t comparison of
> means test.
>
> Isobel
Hi
Sorry to repeat myself - but the samples are not independent. Independence is a fundamental assumption of these types of tests - and you cannot interpret the tests if this assumption is violated. In the situation where spatial correlation exists, the true standard error is nothing like as small as the (s/sqrt(n)) that Chaosheng discusses - because the sqrt(n) depends on independence.
Again, as I said before, if the data has any type of trend in it, then it is completely meaningless to try and use these tests - and with no trend but some 'ordinary' correlation, you must find a means of taking the data redundancy into account or risk getting hopelessly pessimistic results (in the sense of rejecting the null hypothesis of equal means far too often).
Consider a trivial example. A one-dimensional random function which takes constant values over intervals of length one - so, it takes the value a_0 in the interval [0,1[ then the value a_1 in the interval [1,2[ and so on (let us suppose that each a_n term is drawn at random from a Gaussian distribution with the same mean and variance, for example). Next suppose you are given samples on the interval [0,2]. You spot that there seems to be a jump between [0,1[ and [1,2[ - so you test for the difference in the means. If you apply an F test (for two groups, equivalent to a t test) you will easily find that the mean differs (and more convincingly the more samples you have drawn!). However, by construction of the random function, the mean is not different. We have been lulled into the false conclusion of differing means by assuming that all our data are independent.
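The trivial example can be simulated in standard-library Python. This is a sketch under stated assumptions: a small independent measurement noise is added to each sample (not part of the example as described) so that the within-interval variance is not exactly zero, a normal critical value of 1.96 is used in place of the t distribution, and the function names, noise level and trial counts are arbitrary.

```python
import math
import random
from statistics import fmean, variance


def reject_equal_means(n_per_interval, noise_sd, rng, z_crit=1.96):
    """One realization: draw a_0 and a_1 from the same Gaussian, then test the means."""
    a0, a1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    x = [a0 + rng.gauss(0.0, noise_sd) for _ in range(n_per_interval)]  # samples in [0,1[
    y = [a1 + rng.gauss(0.0, noise_sd) for _ in range(n_per_interval)]  # samples in [1,2[
    se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
    return abs(fmean(x) - fmean(y)) / se > z_crit


def rejection_rate(n_per_interval, trials=500, noise_sd=0.1, seed=0):
    """Fraction of realizations in which 'equal means' is rejected."""
    rng = random.Random(seed)
    hits = sum(reject_equal_means(n_per_interval, noise_sd, rng) for _ in range(trials))
    return hits / trials


rate_small = rejection_rate(5)
rate_large = rejection_rate(200)
print(rate_small, rate_large)
```

Under the null hypothesis a valid test at this level should reject in roughly 5% of realizations. Here the mean of the random function is the same everywhere by construction, yet treating the samples as independent rejects in the large majority of realizations.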
Regards
Colin Daly
Hello,
I am currently principal investigator on a major NIH grant
that aims to develop software for tests of hypothesis
using alternate hypotheses that are specified by the user and
differ from the omnibus "spatial independence";
we called them "spatial neutral models".
For example, you can test for clusters of cancer rates
"above and beyond" a regional background in exposure.
The p-values are computed using randomization: I applied
geostatistical simulation to generate multiple realizations
that are then used to derive the empirical distribution of
the test statistic.
I presented an example during the last GeoEnv conference
and I put a PDF copy of the paper, which is currently in press,
on my website.
Cheers,
Pierre
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
Dr. Pierre Goovaerts
President of PGeostat, LLC
Chief Scientist with Biomedware Inc.
710 Ridgemont Lane
Ann Arbor, Michigan, 481031535, U.S.A.
Email: goovaert@...
Phone: (734) 6689900
Fax: (734) 6687788
http://alumni.engin.umich.edu/~goovaert/
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
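The randomization step described above can be sketched in a few lines. This is only an illustration, not the software mentioned in the message: it assumes the test statistic has already been computed on each simulated realization, and the names `empirical_p_value`, `simulated_stats` and `observed_stat` are hypothetical. The Gaussian draws merely stand in for the statistics that geostatistical simulation under a neutral model would supply.

```python
import random


def empirical_p_value(observed, simulated):
    """One-sided Monte Carlo p-value.

    Share of realizations whose statistic is at least as large as the
    observed one. The +1 terms count the observed value as one more
    realization, so the p-value can never be exactly zero.
    """
    exceed = sum(1 for s in simulated if s >= observed)
    return (exceed + 1) / (len(simulated) + 1)


# Stand-in for the simulation step (assumption): in practice each value
# would be the test statistic computed on one simulated realization.
rng = random.Random(0)
simulated_stats = [rng.gauss(0.0, 1.0) for _ in range(999)]
observed_stat = 2.5  # hypothetical observed statistic

p = empirical_p_value(observed_stat, simulated_stats)
print(f"empirical p-value: {p:.3f}")
```

The choice of neutral model enters only through how the realizations are generated; the p-value calculation itself stays the same.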
On Sun, 5 Dec 2004, Colin Daly wrote:
>
>
> Hi
>
> Sorry to repeat myself, but the samples are not independent. Independence is a fundamental assumption of these types of tests, and you cannot interpret the tests if this assumption is violated. Where spatial correlation exists, the true standard error is nothing like as small as the s/sqrt(n) that Chaosheng discusses, because the sqrt(n) depends on independence.
>
> Again, as I said before, if the data have any type of trend, then it is completely meaningless to try to use these tests. With no trend but some 'ordinary' correlation, you must find a means of taking the data redundancy into account, or risk getting hopelessly pessimistic results (in the sense of rejecting the null hypothesis of equal means far too often).
>
> Consider a trivial example: a one-dimensional random function which takes constant values over intervals of length one. So it takes the value a_0 in the interval [0,1[, then the value a_1 in the interval [1,2[, and so on (let us suppose, for example, that each a_n term is drawn at random from a Gaussian distribution with the same mean and variance). Next suppose you are given samples on the interval [0,2]. You spot that there seems to be a jump between [0,1[ and [1,2[, so you test for the difference in the means. If you apply an F test you will easily find that the mean differs (and more convincingly the more samples you have drawn!). However, by construction of the random function, the mean is not different. We have been lulled into the false conclusion of differing means by assuming that all our data are independent.
>
> Regards
>
> Colin Daly
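The trivial example above can be tried numerically. The sketch below is a hedged illustration rather than Colin Daly's own calculation: it uses a two-sample t statistic with a rough 1.96 cut-off instead of an F test, and it adds a small measurement noise (`noise_sd`, an assumption not in the original example) so that the within-interval variance is not exactly zero. The function names are hypothetical.

```python
import math
import random
import statistics


def naive_t(x, y):
    """Two-sample t statistic that treats every sample as independent."""
    se = math.sqrt(statistics.variance(x) / len(x)
                   + statistics.variance(y) / len(y))
    return (statistics.fmean(x) - statistics.fmean(y)) / se


def rejection_rate(n_per_interval, trials=100, noise_sd=0.1, seed=1):
    """Share of trials in which 'equal means' is rejected at roughly 5%."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        a0 = rng.gauss(0.0, 1.0)  # constant level on [0,1[
        a1 = rng.gauss(0.0, 1.0)  # level on [1,2[, drawn from the same law
        # noise_sd is an added assumption: without it every sample in an
        # interval is identical and the sample variance is zero.
        x = [a0 + rng.gauss(0.0, noise_sd) for _ in range(n_per_interval)]
        y = [a1 + rng.gauss(0.0, noise_sd) for _ in range(n_per_interval)]
        if abs(naive_t(x, y)) > 1.96:
            rejections += 1
    return rejections / trials


for n in (10, 100, 1000):
    print(f"n per interval = {n:>4}: rejection rate = {rejection_rate(n):.2f}")
```

Both levels come from the same distribution, so a well-behaved test would reject in about 5% of trials; the rate printed here should be far higher, because the samples within an interval carry almost no independent information.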
>
>
> Original Message
> From: Chaosheng Zhang [mailto:Chaosheng.Zhang@...]
> Sent: Sun 12/5/2004 11:42 AM
> To: aigeostats@...
> Cc: Colin Badenhorst; Isobel Clark; Donald E. Myers
> Subject: Re: [aigeostats] F and Ttest for samples drawn from the same p
> Dear all,
>
> I'm wondering if sample size (number of samples, n) is playing a role here.
>
> Since Colin is using Excel to analyse several thousand samples, I have checked the t-test functions in Excel. In the Data Analysis Tools help, a function is provided for "t-Test: Two-Sample Assuming Unequal Variances". This function is the same as those in many text books (there are other forms of the function). Unfortunately, I cannot find the function for "assuming equal variances" in Excel, but I assume they are similar, and that it should be the same as those in some text books.
>
> From the function, you can see that when the sample size is large you always get a large t value. When the sample size is large enough, even slight differences between the mean values of two data sets (x bar and y bar) can be detected, and this will result in rejection of the null hypothesis. This is in fact quite reasonable. When the sample size is large, you are confident in the mean values (Central Limit Theorem), with a very small standard error (s/sqrt(n)). Therefore, you can confidently detect the differences between the two data sets. Even though there is only a slight difference, you can still say, yes, they are "significantly" different.
> Cheers,
> Chaosheng
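The sample-size effect discussed in the quoted message can be made concrete with a short calculation. This is a minimal sketch that assumes independent samples, two groups of equal size n and a common standard deviation; the difference of 0.05 and the standard deviation of 1.0 are hypothetical numbers chosen only for illustration.

```python
import math


def t_statistic(mean_diff, sd, n):
    """t for two groups of n independent samples each, common sd."""
    standard_error = sd * math.sqrt(2.0 / n)
    return mean_diff / standard_error


# The same small difference in means, tested with ever larger samples.
for n in (30, 300, 3000, 30000):
    print(f"n = {n:>6}: t = {t_statistic(0.05, 1.0, n):.2f}")
```

For a fixed difference the statistic grows in proportion to sqrt(n), so a difference that is invisible at n = 30 is flagged as "significant" once n is in the tens of thousands.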
Colin,
Isn't it a basic rule of geostatistics that all populations must follow the
intrinsic hypothesis, i.e. stationarity (constant mean and variance), so that
you should split any populations that do not have the same mean and variance?
This is introduced on pp. 33 of Mining Geostatistics, A.G. Journel & Ch. J. Huijbregts.
Regards Digby
----- Original Message -----
From: "Colin Badenhorst" <CBadenhorst@...>
To: <ted.harding@...>
Cc: <aigeostats@...>
Sent: Saturday, December 04, 2004 1:28 AM
Subject: RE: [aigeostats] F and Ttest for samples drawn from the same p
> Hi Ted,
>
> Thanks for your reply. I suspect my original query was too vague, so I
> will illustrate it with a practical example here.
>
> I have an ore horizon that splits into two separate horizons. One of these
> split horizons has a lower average grade, and the other has a higher
> average grade. I need to determine whether I should treat these two
> horizons as separate entities during grade estimation. My geological
> observations tell me that these two horizons derive from the same source,
> and on the face of it are not different from one another in terms of
> mineral content and genesis. I aim to back this up by proving, or
> attempting to prove, that statistically these two horizons are the same,
> and can be treated as such as far as grade estimation goes. Because the
> mean grades vary between the two, I suspect that the T-test might fail,
> but I also suspect that the variance in grade between the two might be
> very similar, and thus the F-test will pass. Now I have a problem: a
> T-test tells me the populations differ statistically, but the F-test
> tells me they don't.
>
> The confidence limit I refer to in (2), by the way, is the Alpha value
> used to determine the confidence level for the test. I am using Excel to
> do the test.
>
> Thanks,
> Colin
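The scenario described in the quoted message (similar variances, different means) can be reproduced with a short sketch. It is an illustration only: the grades are simulated with hypothetical means and standard deviations, the samples are treated as independent (the very assumption questioned elsewhere in this thread), and the p-value uses a normal approximation that is reasonable only because each group has thousands of samples. The function name `compare_groups` is hypothetical.

```python
import math
import random
import statistics
from statistics import NormalDist


def compare_groups(x, y):
    """Variance ratio plus pooled and Welch t statistics for two samples."""
    nx, ny = len(x), len(y)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    f_ratio = max(vx, vy) / min(vx, vy)  # larger variance on top
    pooled_var = ((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2)
    t_pooled = (mx - my) / math.sqrt(pooled_var * (1 / nx + 1 / ny))
    t_welch = (mx - my) / math.sqrt(vx / nx + vy / ny)
    # Normal approximation to the two-sided p-value (large samples only).
    p_approx = 2 * (1 - NormalDist().cdf(abs(t_welch)))
    return {"f_ratio": f_ratio, "t_pooled": t_pooled,
            "t_welch": t_welch, "p_approx": p_approx}


# Hypothetical grades for the two split horizons: same spread, shifted mean.
rng = random.Random(42)
lower = [rng.gauss(2.0, 1.0) for _ in range(2000)]
upper = [rng.gauss(2.3, 1.0) for _ in range(2000)]

for name, value in compare_groups(lower, upper).items():
    print(f"{name}: {value:.4f}")
```

A variance ratio close to 1 together with a large t statistic is not a contradiction: the two tests address different questions, one about spread and one about location.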
In every resource model I have done, I have always subdivided the populations into
those of equal mean and variance, so that stationarity is obeyed. Is this the
correct procedure? I haven't read Mining Geostatistics in detail yet, but understood
that this was a basic requirement for geostatistical modelling procedures.
http://www.users.on.net/~digbym/about_consulting.htm
Digby
Besides the discussions on the theory, I think we need a practical solution for Colin Badenhorst's initial problem (and this is not his problem only). He wants to compare two sets of spatial data with several thousand samples.

Spatial autocorrelation (or lack of independence) is a basic feature of spatial data, and thus we cannot ask spatial data to behave well enough to satisfy the statistical requirements. If your spatial data set lacks spatial autocorrelation, you may be asked to go back and take more samples. The ideal way is perhaps to develop a t-test (or whatever test) for spatial data, something like a "spatially weighted test". If such a test is not available, we have no choice but to use existing methods. They may not be exactly suitable for spatial data, but they are better than nothing.

For the time being, the best way to solve the problem is still to use statistical methods, but to explain the results carefully and appropriately. We have to acknowledge the discrepancies between the basic feature of spatial data and the statistical requirements. Meanwhile, when the sample size (going back to my initial concern) is large, you will always get the result of rejecting the null hypothesis for REAL data, whether there is spatial dependence or not. In this case, what does such a result mean? I would say this result is not very meaningful, as it just proves the power of statistical tests. Simple graphs (e.g., histogram, boxplot) and percentiles may be helpful for comparison.

Therefore, for Colin's initial problem, the solution is to explain the results properly, and maybe to try some other methods if available.

Cheers,
Chaosheng
Dr. Chaosheng Zhang
Lecturer in GIS
Department of Geography
National University of Ireland, Galway
IRELAND
Tel: +35391524411 x 2375
Direct Tel: +3539149 2375
Fax: +35391525700
Email: Chaosheng.Zhang@...
Web 1: www.nuigalway.ie/geography/zhang.html
Web 2: www.nuigalway.ie/geography/gis/index.htm
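The percentile comparison suggested above is easy to tabulate. The following is a minimal sketch using only the Python standard library; the lognormal grades and the group names are hypothetical stand-ins for the two sample sets, and `percentile_table` is an illustrative helper, not an established routine.

```python
import random
import statistics


def percentile_table(groups, cuts=(10, 25, 50, 75, 90)):
    """Selected percentiles for each named group of values."""
    rows = {}
    for name, values in groups.items():
        # quantiles(..., n=100) returns the 99 cut points; item k-1 is
        # the k-th percentile.
        q = statistics.quantiles(values, n=100)
        rows[name] = [q[c - 1] for c in cuts]
    return rows


# Hypothetical skewed grades for two groups of samples.
rng = random.Random(7)
groups = {
    "group A": [rng.lognormvariate(0.0, 0.5) for _ in range(3000)],
    "group B": [rng.lognormvariate(0.1, 0.5) for _ in range(3000)],
}

print("percentile:", (10, 25, 50, 75, 90))
for name, row in percentile_table(groups).items():
    print(name, [round(v, 3) for v in row])
```

Unlike a p-value, such a table shows how large the difference between the groups is at each part of the distribution, which does not shrink or grow merely because more samples were taken.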
Dear all
I am having difficulty understanding why none of you
want to try a spatial approach to statistics. Everyone
is trying to make the 'independent' statistical tests
work on spatial data. Try turning this around and look
at the spatial aspect first.
(1) Testing variances: the sill on the semivariogram
(total height of model) is theoretically a good
estimate for the sample variance when autocorrelation
or spatial dependence is present. Do your F test on
that. Yes, you still have degrees of freedom problems,
but with thousands of samples the 'infinity column'
should be sufficient.
(2) Testing means: the classic t-test in the presence
of 'equal variances' requires the 'standard error' of
each mean. For independent samples, this is s/sqrt(n).
For spatially dependent samples, this is the kriging
standard error for the global mean. Your only problem
then is getting a global standard error.
Isobel
http://geoecosse.bizland.com/whatsnew.htm
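To show how much spatial dependence can inflate the standard error of a mean, here is a rough sketch. It does not compute the kriging standard error suggested above; it uses the simpler variance of the plain arithmetic mean under an assumed covariance model, Var(mean) = (1/n^2) * sum_i sum_j C(|x_i - x_j|). The exponential model, the sill, the range and the 1-D sample positions are all hypothetical.

```python
import math


def exponential_covariance(h, sill, practical_range):
    """Exponential model: about 5% of the sill is left at the practical range."""
    return sill * math.exp(-3.0 * h / practical_range)


def se_of_mean(coords, sill, practical_range):
    """Standard error of the arithmetic mean of samples at 1-D positions."""
    n = len(coords)
    total = 0.0
    for xi in coords:
        for xj in coords:
            total += exponential_covariance(abs(xi - xj), sill, practical_range)
    return math.sqrt(total) / n


coords = [float(i) for i in range(200)]  # hypothetical sample positions
sill, practical_range = 1.0, 30.0        # hypothetical variogram parameters

se_spatial = se_of_mean(coords, sill, practical_range)
se_independent = math.sqrt(sill / len(coords))  # the usual s/sqrt(n)
n_effective = sill / se_spatial ** 2

print(f"s/sqrt(n), assuming independence: {se_independent:.4f}")
print(f"standard error with correlation:  {se_spatial:.4f}")
print(f"effective number of samples:      {n_effective:.1f}")
```

A t statistic built on the larger standard error is correspondingly smaller, which is the direction of the correction argued for in this thread. For the variances, the analogous step would be to form the F ratio from the two fitted sills.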
Isobel,
Good idea, and that's a step forward. Any references or is it still an idea?
Cheers,
Chaosheng
There was a pretty good paper on global standard errors
in the 1984 APCOM proceedings, so I am sure it should
be in the major textbooks by now.
Comparing the sills is very straightforward, I think.
Isobel
http://geecosse.bizland.com/books.htm
Comparisons of the sills of relative variograms may indicate whether the
proportional effect is present between the low and high grade zones, so a
test on the correlation coefficients could be relevant.
Digby
www.users.on.net/~digbym