
Statistical Difference in HRs hit

  • btibert3
    Message 1 of 2 , Nov 19, 2007
      Hi guys,

      I apologize for the stat-based question. If this is not the
      appropriate place, I would greatly appreciate it if you could point
      me in the right direction.

      I have had some statistics courses a while back, but I am trying to
      figure out what statistical method should be used if you wanted to
      evaluate whether there is a difference in the number of home runs
      being hit. A basic example would be the number of home runs hit in
      2007 versus 2006. Extending it further, you could look at a team
      relative to the remainder of the league.

      From what I understand, the number of HRs hit could be modeled as a
      continuous variable (basic tests of means), but I also see it as a
      discrete variable. What tests should you use if you were to look at
      the difference between two counts? What about if you looked at it as
      a rate (HR/game)? Can you perform hypothesis tests about a rate?

      I have seen a lot of great work applying statistics to baseball data
      on a number of blogs, so I am wondering what method would be
      appropriate. I feel like I have just enough exposure to stats to
      know that you can test for significant differences, but I am not sure
      what methods are appropriate.

      Again, I apologize if this post is beyond the scope of this forum.

      Many thanks in advance,

      - Brock
    • Robert Ehrlich
      Message 2 of 2 , Nov 20, 2007
        Brock:
        There are a number of ways to skin your cat. One of the oldest and
        easiest to understand is the "chi-square contingency table".
        In your case you would construct a table with two columns: the number
        of "homers" and the number of "non-homers". Homers plus non-homers
        might equal the total number of hits, or maybe at-bats, etc. The rows
        in this table are labeled 2006 and 2007 respectively.

        You then sum across each row; the sums represent the total number of
        homers plus non-homers for each year. You then sum down each column;
        this yields the number of homers hit in both years and the number of
        non-homers hit in both years.
        These sums are known as "marginal totals" because they can be entered
        at the end of every row and every column. If we now sum all of the
        row totals (or column totals) we obtain the grand total of all homers
        plus non-homers. This total is usually entered in the lower right
        corner of your table.
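
To make the table layout concrete, here is a minimal Python sketch of the marginal totals. The counts are made up purely for illustration (they are not real 2006 or 2007 figures), and the names `table` and `marginal_totals` are my own, not from any library.

```python
# Hypothetical counts for illustration only; NOT real league or team totals.
table = {
    "2006": {"homers": 150, "non_homers": 1350},
    "2007": {"homers": 120, "non_homers": 1380},
}


def marginal_totals(table):
    """Return (row_totals, column_totals, grand_total) for a dict-of-dicts table."""
    # Sum across each row: homers plus non-homers for each year.
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}

    # Sum down each column: homers (and non-homers) over all years.
    column_totals = {}
    for cells in table.values():
        for column, count in cells.items():
            column_totals[column] = column_totals.get(column, 0) + count

    # The grand total goes in the lower right corner of the table.
    grand_total = sum(row_totals.values())
    return row_totals, column_totals, grand_total


row_totals, column_totals, grand_total = marginal_totals(table)
print("row totals:   ", row_totals)
print("column totals:", column_totals)
print("grand total:  ", grand_total)
```

Summing the row totals and summing the column totals must give the same grand total, which is a handy check that the table was filled in correctly.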

        From the marginal totals you can calculate the count you would expect
        in each cell if the years did not differ, compare those expected
        counts with the observed counts to get the chi-square statistic, and
        look up the value in a table to determine whether or not the homers
        differ significantly. You can add as many rows as you like (e.g.
        several years) and calculate chi-square the same way.
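
For anyone who would rather compute than look things up, the calculation can be sketched in plain Python as below. This is the standard Pearson chi-square, not necessarily the exact procedure any particular textbook lays out. The expected count in each cell is (row total * column total) / grand total, and the statistic is the sum of (observed - expected)^2 / expected over all cells. The counts are again hypothetical, the function name `chi_square` is my own, and the sketch assumes no row or column total is zero. The p-value shortcut shown only applies to a 2x2 table (1 degree of freedom); for bigger tables, compare the statistic against a printed chi-square table.

```python
import math


def chi_square(observed):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    `observed` is a list of rows, each row a list of counts.
    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Count expected in this cell if all rows shared the same homer rate.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected

    degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts for illustration only; NOT real 2006/2007 figures.
# Rows are years, columns are [homers, non-homers].
observed = [
    [150, 1350],  # 2006
    [120, 1380],  # 2007
]

statistic, degrees_of_freedom = chi_square(observed)

# With 1 degree of freedom the upper-tail probability has a closed form:
# P(chi-square > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(statistic / 2)) if degrees_of_freedom == 1 else None

print(f"chi-square = {statistic:.3f}, df = {degrees_of_freedom}, p = {p_value}")
```

With these made-up counts the statistic comes out just under 3.841, the usual 5% cutoff for 1 degree of freedom, so the difference would not quite count as significant at that level. Adding more years only means adding more rows to `observed`.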

        If you want to know the logic behind chi-square and the formula, send
        me an email. Otherwise you can look it up online or in a stats book.

        Bob Ehrlich


