
Re: Confidence level

  • HoopStudies
    Message 1 of 4, Mar 2, 2002
      --- In APBR_analysis@y..., "dlirag" <dlirag@h...> wrote:
      > Would it be okay to use 90% as a level of significance in our
      > research?

      It's not typical. 95% is more typical.

      That doesn't necessarily mean it's wrong; it just means results are
      less reliable at the 90% level. Results at the 95% level aren't fully
      reliable either.

      I once worked on a case where someone used 80% as a level of
      significance. They tested about 20 wells (they were looking for
      significant trends in concentration data). And, guess what? They
      found 4 of those wells to have statistically significant trends at
      80%. They insisted that something real was happening in those wells,
      based solely on a test that showed nothing: 20% of the wells flagging
      a trend is exactly what you would expect from chance alone at the 80%
      level.
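
      A quick simulation makes the point. This is only a sketch in Python
      (numpy/scipy assumed available); the 20 wells match the story, but
      the concentration numbers are invented pure noise with no trend in
      them at all:

      import numpy as np
      from scipy import stats

      rng = np.random.default_rng(0)
      n_wells, n_samples, alpha = 20, 10, 0.20   # 80% level -> alpha = 0.20

      flagged = 0
      for _ in range(n_wells):
          conc = rng.normal(loc=5.0, scale=1.0, size=n_samples)  # pure noise
          time = np.arange(n_samples)
          slope, intercept, r, p_value, se = stats.linregress(time, conc)
          if p_value < alpha:      # "significant trend" at the 80% level
              flagged += 1

      print(flagged, "of", n_wells, "wells flagged by chance alone")

      Run it with a few different seeds and you'll typically see around 4
      wells flagged, which is exactly the 20% you'd expect from chance.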

      I generally tell people: Use what you want. But the more you want
      people to listen to you, the higher the level you should use.

      If I personally say something is statistically significant, without
      mentioning the level of significance, I am using 95%. Just as a
      reference.

      DeanO
    • Michael K. Tamada
      Message 2 of 4, Mar 3, 2002
        On Sat, 2 Mar 2002, HoopStudies wrote:

        > --- In APBR_analysis@y..., "dlirag" <dlirag@h...> wrote:
        > > Would it be okay to use 90% as a level of significance in our
        > > research?
        >
        > It's not typical. 95% is more typical.
        >
        > [snip]

        Good advice; there's one thing that should be added, however: sometimes
        we're working with small samples here, and one has to be aware of the
        possibility of Type II errors, which are more likely with small sample
        sizes.

        Type I error: incorrectly rejecting the null hypothesis, even though
        it's true. The probability of a Type I error is also known as the
        "size" of the test (why statisticians call it that, I don't know, but
        that's the name).

        If we use a 90% significance level, there is a 10% probability of a Type I
        error (if the null hypothesis is indeed true).

        If we use a 95% significance level, there is a 5% probability of a Type I
        error. Etc.

        That is why DeanO is saying that the more rigorous tests (95% or even
        99%, i.e. 5% or 1% size) are more reliable.


        Type II error: incorrectly accepting the null hypothesis, even though
        it's false. The probability of avoiding a Type II error (i.e. of
        correctly rejecting a false null) is known as the "power" of the test.

        The actual power of a test is essentially impossible to know, because we'd
        need to know what the true values of the parameters are. But we can
        create power functions, which calculate the power of the test for a range
        of parameter values. More power is better; unfortunately, there is a
        direct tradeoff between size and power. To use a 99% (i.e. size=1%) test
        means a low probability of a Type I error, but a higher probability of a
        Type II error.
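
        As a rough illustration of what a power function looks like, here is
        a sketch only: the .75 free-throw baseline, the 100 attempts, and the
        normal-approximation test are all assumptions made up for the example.

        import numpy as np
        from scipy.stats import norm

        def power(p_true, p_null=0.75, n=100, alpha=0.05):
            """Probability of rejecting H0: p = p_null when the true
            percentage is p_true (two-sided normal-approximation test)."""
            se_null = np.sqrt(p_null * (1 - p_null) / n)
            se_true = np.sqrt(p_true * (1 - p_true) / n)
            z = norm.ppf(1 - alpha / 2)
            upper = p_null + z * se_null     # rejection region boundaries
            lower = p_null - z * se_null
            return (norm.sf((upper - p_true) / se_true)
                    + norm.cdf((lower - p_true) / se_true))

        for p in (0.75, 0.80, 0.85):
            print(f"true p={p:.2f}  power at 5%={power(p):.2f}"
                  f"  power at 1%={power(p, alpha=0.01):.2f}")

        When the true value equals the null (.75), the "power" is just the
        size of the test (.05 or .01); as the true value moves away from the
        null the power climbs, and the 1% column always sits below the 5%
        column. That is the size-versus-power tradeoff in action.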


        Example: if we're looking for evidence of a streak shooter, we might look
        at last night's games and see that Player X was 2-4 on free throws in the
        first half and 2-2 in the second half. Maybe we think he had a hot
        streak going in the second half, and that's why he was 100% on his FTs.

        We could try to see if his second half FT shooting was "significantly"
        better than his first half shooting, but with sample sizes of 4 and 2
        respectively, there is almost no chance of being able to reject the null
        hypothesis (of equal shooting) at the 5%, 10%, or any reasonable level.

        So we could run tests, and we would almost certainly not be able to
        reject the null hypothesis. But maybe hot streaks do exist, just with
        effects too small to be measured with such small sample sizes.
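
        To put numbers on it, here is Fisher's exact test on that made-up
        2-of-4 versus 2-of-2 free-throw line (a Python/scipy sketch):

        from scipy.stats import fisher_exact

        #                 made  missed
        first_half  = [2, 2]      # 2-4 on free throws
        second_half = [2, 0]      # 2-2 on free throws
        odds_ratio, p_value = fisher_exact([first_half, second_half])
        print(f"two-sided p-value = {p_value:.2f}")   # about 0.47

        A p-value around .47 is nowhere near .05 or .10, even though the
        second half was perfect; the samples are simply too small to say
        anything.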


        With huge sample sizes, the problem becomes the opposite. Most anything
        will show a statistically significant difference if the sample size
        becomes large enough. But the question then becomes whether these
        differences are meaningful: are they big enough to make a practical
        difference? Maybe during hot streaks players shoot .001 better
        than during non-hot streaks. It'd take a huge sample to be able to detect
        that difference, and we would probably say that the difference is small
        enough to ignore, and for practical purposes hot streaks do not exist.
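
        The arithmetic on that .001 difference backs it up. A standard
        two-proportion sample-size formula gives the following; the .450
        baseline, 95% confidence, and 80% power are assumptions made for
        illustration only.

        from scipy.stats import norm

        p1, p2 = 0.450, 0.451            # non-hot vs. hot shooting, .001 apart
        alpha, wanted_power = 0.05, 0.80
        z_a, z_b = norm.ppf(1 - alpha / 2), norm.ppf(wanted_power)

        n = (z_a + z_b) ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2
        print(f"about {n:,.0f} shots in each condition")   # roughly 3.9 million

        Several million shots per condition is more than any player will ever
        take, which is the sense in which a .001 effect is both undetectable
        and ignorable.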

        Conversely, with a small sample there may be a real effect, but it may
        simply be one that we haven't found, due to our small samples. So we
        must be cautious about drawing conclusions from small samples. Just
        because no significant effect has been found doesn't mean that the
        effect doesn't exist.


        --MKT
      • Dennis Keefe
        Message 3 of 4, Mar 3, 2002
          --- "Michael K. Tamada" <tamada@...> wrote:
          > ... there may be a real effect, but it may simply be one that we
          > haven't found, due to our small samples. So we must be cautious
          > about drawing conclusions from small samples. Just because no
          > significant effect has been found doesn't mean that the effect
          > doesn't exist.
          > --MKT

          As an intuitive tool, when reporting and comparing averages, I think
          reporting 95% confidence intervals for the averages would serve a
          useful function--they're easy to interpret (to me, at least) and I
          think they serve as a good reminder that we're often dealing with
          samples and trying to estimate parameters with some error involved.
