Maximum standard deviations (math help needed)

  • monepeterson
    Message 1 of 5 , Nov 27, 2003
      First of all, please excuse the inarticulate nature with which I'm
      about to explain my problem:

      I'm tracking team performances from year to year in the NBA and NHL.

      The methods I'm using are simple but have some problems. The first is
      winning percentage minus the mean (.500) divided by the standard
      deviation for winning percentages that year. The second is point (or
      goal) differential divided by the standard deviation for point
      differential that year. Rob Neyer and Eddie Epstein did this for their
      "Baseball Dynasties" book, although they made the mistake of doing
      separate deviations for runs scored and runs allowed and then adding
      them together.

      I know that the maximum number of standard deviations for any team in
      any given year is the square root of (n-1), where n = the number of
      teams in the league that year, and I've factored that in.

      But there's another factor, and I don't know how to resolve it.
      There's also a maximum SD possible for a team depending on how high or
      low the league standard deviation is. Um, right?

      For winning percentage, I think it's one divided by the league SD. Is
      that right?

      But for linear numbers like point and goal differential, I have no
      clue where to start. In 1976, the standard deviation for point
      differential in the league was historically low, making the Golden
      State Warriors of that year look like one of the best teams of all
      time. While the Warriors were really good, they have no control over
      how even the rest of the league is, so I'm trying to account for that.
      Is there a way to account for that? Preferably in the form of a
      formula I can toss into Excel?

      Hope the question is clear.

      Moné
    • igor eduardo küpfer
      Message 2 of 5 , Nov 28, 2003
        ----- Original Message -----
        From: "monepeterson" <mone@...>
        To: <APBR_analysis@yahoogroups.com>
        Sent: Thursday, November 27, 2003 3:38 PM
        Subject: [APBR_analysis] Maximum standard deviations (math help needed)
        >
        >First of all, please excuse the inarticulate nature with which I'm
        >about to explain my problem:
        >
        >I'm tracking team performances from year to year in the NBA and NHL.
        >
        >The methods I'm using are simple but have some problems. The first is
        >winning percentage minus the mean (.500) divided by the standard
        >deviation for winning percentages that year. The second is point (or
        >goal) differential divided by the standard deviation for point
        >differential that year. Rob Neyer and Eddie Epstein did this for their
        >"Baseball Dynasties" book, although they made the mistake of doing
        >separate deviations for runs scored and runs allowed and then adding
        >them together.
        >

        Point differential is directly related to winning percentages, a la the
        Pythagorean method. Since the Pyth is scaled to the overall points scored
        environment, you'd probably be better off using that instead of pts diff.
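
A minimal sketch of the Pythagorean estimate mentioned above, for readers who want to try it. The function name, the sample numbers, and the exponent of 14 are assumptions for illustration; the thread does not specify an exponent, and values in the mid-teens are often quoted for the NBA.

```python
def pythagorean_win_pct(points_for: float, points_against: float,
                        exponent: float = 14.0) -> float:
    """Expected winning percentage from points scored and points allowed.

    The exponent is an assumed value, not one given in the thread.
    """
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


# Hypothetical per-game averages, not real team data.
expected = pythagorean_win_pct(105.0, 100.0)
even = pythagorean_win_pct(100.0, 100.0)  # equal scoring gives exactly .500
print(f"expected win% = {expected:.3f}, even team = {even:.3f}")
```

Because the estimate depends only on the ratio of points scored to points allowed, it scales automatically with the league's scoring level, which raw point differential does not.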

        >I know that the maximum standard deviations for any team in any given
        >year is square root (n-1) where n = the number of teams in the league
        >that year, and I've factored that in.
        >

        I'm not sure what you mean here. Standard deviations are a measure of
        variation, which is a property of groups (like a league), not individual
        teams. You can have a maximum possible SD of winning percentages for a
        league (~ .500 I think for a 29 team league), but there is no maximum
        possible SD for a team.

        >But there's another factor, and I don't know how to resolve it.
        >There's also a maximum SD possible for a team depending on how high or
        >low the league standard deviation is. Um, right?
        >

        I think I see what you mean now. Are you using "standard deviation" to mean
        the result to your equation (win% - .500)/SD ? If so, the maximum possible
        result in an 82-game schedule is a little more than 5. (That number comes
        from a 29 team league in which one team has 82 wins, 15 teams have 40 wins,
        and 13 teams have 39 wins. That is, a league with a low SD and one team with
        an extreme win%.)
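
The arithmetic behind that example can be checked with a short script. This is only a sketch: the variable names are made up here, and whether the SD should divide by n (population) or n - 1 (sample) is an assumption, so both are shown.

```python
import math
import statistics

# Wins in the hypothetical 29-team, 82-game league described above.
wins = [82] + [40] * 15 + [39] * 13
games = 82

win_pct = [w / games for w in wins]
mean_pct = statistics.fmean(win_pct)
sd_pop = statistics.pstdev(win_pct)    # divides by n
sd_sample = statistics.stdev(win_pct)  # divides by n - 1

# Standardized score of the 82-win team under each SD convention.
z_pop = (max(win_pct) - mean_pct) / sd_pop
z_sample = (max(win_pct) - mean_pct) / sd_sample

print(f"mean win% = {mean_pct:.3f}")
print(f"z (population SD) = {z_pop:.2f}, z (sample SD) = {z_sample:.2f}")
print(f"sqrt(n - 1) = {math.sqrt(len(wins) - 1):.2f}")
```

The 82-win team comes out at roughly 5.28 with the population SD and roughly 5.19 with the sample SD, both just under sqrt(28), which matches "a little more than 5".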

        >For winning percentage, I think it's one divided by the league SD. Is
        >that right?
        >
        >But for linear numbers like point and goal differential, I have no
        >clue where to start. In 1976, the standard deviation for point
        >differential in the league was historically low, making the Golden
        >State Warriors of that year look like one of the best teams of all
        >time. While the Warriors were really good, they have no control over
        >how even the rest of the league is, so I'm trying to account for that.
        >Is there a way to account for that? Preferably in the form of a
        >formula I can toss into Excel?
        >
        >Hope the question is clear.
        >
        >Moné
        >

        Perhaps you can tell us why you need to know the maximum SD?


        ed
      • Michael Tamada
        Message 3 of 5 , Dec 3, 2003
          -----Original Message-----
          From: igor eduardo küpfer [mailto:igorkupfer@...]
          Sent: Friday, November 28, 2003 9:13 PM


          ----- Original Message -----
          From: "monepeterson" <mone@...>
          To: <APBR_analysis@yahoogroups.com>
          Sent: Thursday, November 27, 2003 3:38 PM

          >>I know that the maximum standard deviations for any team in any given
          >>year is square root (n-1) where n = the number of teams in the league
          >>that year, and I've factored that in.
          >>
          >
          >I'm not sure what you mean here. Standard deviations are a measure of
          >variation, which is a property of groups (like a league), not individual
          >teams. You can have a maximum possible SD of winning percentages for a
          >league (~ .500 I think for a 29 team league), but there is no maximum
          >possible SD for a team.

          Ed's answer is right on.

          >>But there's another factor, and I don't know how to resolve it.
          >>There's also a maximum SD possible for a team depending on how high or
          >>low the league standard deviation is. Um, right?
          >>
          >
          >I think I see what you mean now. Are you using "standard deviation" to mean
          >the result to your equation (win% - .500)/SD ? If so, the maximum possible

          I think again Ed is probably correctly interpreting what Moné is trying to
          do. The (win%-.500)/SD formula is known as a "standardized score" or "z-score".
          It provides a way to statistically mix apples and oranges by putting them all
          on a single, standardized scale, namely the number of standard deviations above
          or below the mean. It would indeed be a pretty good way of looking at teams'
          performances over the years -- simply looking at simple win percentages would
          be pretty good as well, but some people might be concerned that a .600 record
          may be more meaningful in a competitive, low standard deviation league (the
          NBA in 1977 e.g., where no team won more than 53 games nor won fewer than 22)
          than in a league with several teams around 60 wins.
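
As a rough illustration of the standardized-score idea, here is a sketch that puts winning percentage and point differential on the same scale. The six-team league is made up, the helper name `z_scores` is hypothetical, and the population SD (dividing by n) is an assumed convention.

```python
import statistics


def z_scores(values):
    """Return each value's distance from the league mean, in league SDs."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # assumed convention: divide by n
    return [(v - mean) / sd for v in values]


# Hypothetical six-team league, not real data.
win_pct = [0.700, 0.600, 0.550, 0.450, 0.400, 0.300]
point_diff = [6.5, 3.0, 1.0, -1.5, -3.0, -6.0]

z_win = z_scores(win_pct)
z_diff = z_scores(point_diff)

for zw, zd in zip(z_win, z_diff):
    print(f"win% z = {zw:+.2f}   point diff z = {zd:+.2f}")
```

The helper uses the observed league mean rather than a fixed .500, so the same function works for point or goal differential (mean 0) and for leagues where ties pull the mean winning percentage away from .500.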

          >result in an 82-game schedule is a little more than 5. (That number comes
          >from a 29 team league in which one team has 82 wins, 15 teams have 40 wins,
          >and 13 teams have 39 wins. That is, a league with a low SD and one team with
          >an extreme win%.)

          Again a very good answer, that's the way to get the maximum number of standard
          deviations above the mean. BTW, the Nobel prize-winning economist Paul
          Samuelson once published an article with a title something like "How Many
          Standard Deviations Above the Mean Can You Be?" which worked through that
          and other similar calculations. The rumor was that Samuelson was inspired
          to write the article because he was fond of dismissing inferior thinkers by
          proclaiming that they had IQs which were "a million standard deviations
          below the mean". Eventually he wondered if, even with x billion human
          beings, it was mathematically possible to be one million standard deviations
          below the mean.
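
The bound behind that calculation is usually called Samuelson's inequality. A sketch of the statement, assuming the population SD (dividing by n) and using symbols that are not in the thread:

```latex
% n observations x_1, ..., x_n with mean \bar{x} and population SD \sigma
\sigma^2 = \frac{1}{n}\sum_{j=1}^{n} (x_j - \bar{x})^2,
\qquad
\bar{x} - \sigma\sqrt{n-1} \;\le\; x_i \;\le\; \bar{x} + \sigma\sqrt{n-1}.

% Equality holds when the other n - 1 observations are all equal.
% With the sample SD s (dividing by n - 1) the bound becomes
|x_i - \bar{x}| \;\le\; s\,\frac{n-1}{\sqrt{n}}.

% Being 10^6 SDs from the mean therefore requires
\sqrt{n-1} \ge 10^{6} \;\Longleftrightarrow\; n \ge 10^{12} + 1.
```

For a 29-team league the ceiling is sqrt(28), about 5.29, and a population in the billions falls well short of the 10^12 needed for a million standard deviations.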


          --MKT


          P.S. On a different note, I got back from vacation to find two delightful
          deliveries at my door: JohnH's _Basketball Prospectus_ and DeanO's
          _Basketball on Paper_. Due to returning late Monday night, I absolutely
          did not and still do not have time to really look at them, but DeanO's
          chapter on Cummings, Kemp, and Sikma was one I couldn't resist reading
          (I think it's fair to say the chapter was very much inspired by our
          APBR discussions of these players, especially Cummings). I also wanted
          to see how he used the defensive scoresheets from his WNBA Defensive
          Scoresheet project; there's been some interesting work by Michael
          Humphreys which he's partially revealed at baseballprimer.com on
          defense by baseball players and it could be an interesting contest to
          see which sport is able to make more progress in shedding light into
          one of their biggest black holes: evaluating defense by individual
          players. Baseball of course does have zone ratings and other related
          detailed observational measures ... but DeanO has defensive
          scoresheets, at least for one WNBA season anyway.
        • Mike G
          Message 4 of 5 , Dec 3, 2003
            --- In APBR_analysis@yahoogroups.com, "Michael Tamada" <tamada@o...>
            wrote:
            > ...DeanO's
            > chapter on Cummings, Kemp, and Sikma was one I couldn't resist reading
            > (I think it's fair to say the chapter was very much inspired by our
            > APBR discussions of these players, especially Cummings)...

            Well, I hope he dedicated the chapter to moi !

            Sikma hasn't been hammered (as a subject) like the other 2. Maybe
            he's due.

            Without the book, I can only guess DeanO doesn't like Kemp's
            turnovers nor Cummings' so-so shooting %. Hey, Kemp got plenty of
            MVP votes in spite of his middling minutes.

            My own efforts have been to give credit where it's due, and
            popularity be damned. It's more gratifying to recognize and
            acknowledge, than to join a chorus of yeas or nays.

            It's worth noting that in a couple of years, DeanO may have come to
            recognize that backward-looking analysis is actually fun and
            interesting to many people. It doesn't predict anything useful,
            perhaps. Maybe it helps sell books.
          • Dean Oliver
            Message 5 of 5 , Dec 3, 2003
              --- In APBR_analysis@yahoogroups.com, "Mike G" <msg_53@h...> wrote:
              > --- In APBR_analysis@yahoogroups.com, "Michael Tamada" <tamada@o...>
              > wrote:
              > > ...DeanO's
              > > chapter on Cummings, Kemp, and Sikma was one I couldn't resist reading
              > > (I think it's fair to say the chapter was very much inspired by our
              > > APBR discussions of these players, especially Cummings)...
              >
              > Well, I hope he dedicated the chapter to moi !

              You are certainly thanked for putting out lists that people hammer on.

              >
              > Sikma hasn't been hammered (as a subject) like the other 2. Maybe
              > he's due.
              >
              > Without the book, I can only guess DeanO doesn't like Kemp's
              > turnovers nor Cummings' so-so shooting %. Hey, Kemp got plenty of
              > MVP votes in spite of his middling minutes.

              You should probably read it. It's a lot easier to see where my take
              comes from with the full context. At least add it to your Christmas
              list...

              >
              > My own efforts have been to give credit where it's due, and
              > popularity be damned. It's more gratifying to recognize and
              > acknowledge, than to join a chorus of yeas or nays.
              >
              > It's worth noting that in a couple of years, DeanO may have come to
              > recognize that backward-looking analysis is actually fun and
              > interesting to many people. It doesn't predict anything useful,
              > perhaps. Maybe it helps sell books.

              In the book, you'll see why I looked at these players. Specifically,
              I looked at different classes of players to understand how large a
              contribution they could make to teams. I _know_ that Kalb's book on
              the 50 Greatest is going to sell a helluva lot more than mine because
              his focus is on who is better than who (pub debate material), whereas
              mine is how to build a better team (front office debate material).
              When I look back, I do so to understand how best to go forward, not
              just to discuss who's better. In doing that, identifying the types of
              players that add wins and losses in various amounts is quite useful.
              But rank 'em? I provide wins and losses (which I think is better than
              Bill James' win shares), you can rank how you like.

              DeanO