
## Maximum standard deviations (math help needed)

Message 1 of 5 , Nov 27, 2003
First of all, please excuse the inarticulate nature with which I'm
about to explain my problem:

I'm tracking team performances from year to year in the NBA and NHL.

The methods I'm using are simple but have some problems. The first is
winning percentage minus the mean (.500) divided by the standard
deviation for winning percentages that year. The second is point (or
goal) differential divided by the standard deviation for point
differential that year. Rob Neyer and Eddie Epstein did this for their
"Baseball Dynasties" book, although they made the mistake of doing
separate deviations for runs scored and runs allowed and then adding
them together.
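
A minimal sketch of the two measures described above, assuming the population standard deviation is the one intended. The function name `z_scores` and the four-team league are hypothetical, made up only to show the arithmetic.

```python
from statistics import pstdev


def z_scores(values, mean=None):
    """Return (value - mean) / SD for each value, using the population SD."""
    if mean is None:
        mean = sum(values) / len(values)
    sd = pstdev(values, mu=mean)
    return [(v - mean) / sd for v in values]


# Hypothetical four-team league, not real data.
win_pct = [0.700, 0.550, 0.450, 0.300]
point_diff = [6.0, 1.5, -2.0, -5.5]

# Winning percentage: the league mean is .500 by construction.
win_z = z_scores(win_pct, mean=0.500)

# Point differential: one SD of the differential itself, rather than
# separate SDs for points scored and points allowed added together.
diff_z = z_scores(point_diff)
```

In a spreadsheet the same calculation would be the team's value minus the league average, divided by the league's population standard deviation for that season.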

I know that the maximum standard deviations for any team in any given
year is square root (n-1) where n = the number of teams in the league
that year, and I've factored that in.

But there's another factor, and I don't know how to resolve it.
There's also a maximum SD possible for a team depending on how high or
low the league standard deviation is. Um, right?

For winning percentage, I think it's one divided by the league SD. Is
that right?

But for linear numbers like point and goal differential, I have no
clue where to start. In 1976, the standard deviation for point
differential in the league was historically low, making the Golden
State Warriors of that year look like one of the best teams of all
time. While the Warriors were really good, they have no control over
how even the rest of the league is, so I'm trying to account for that.
Is there a way to account for that? Preferably in the form of a
formula I can toss into Excel?

Hope the question is clear.

Moné
Message 2 of 5 , Nov 28, 2003
----- Original Message -----
From: "monepeterson" <mone@...>
To: <APBR_analysis@yahoogroups.com>
Sent: Thursday, November 27, 2003 3:38 PM
Subject: [APBR_analysis] Maximum standard deviations (math help needed)
>
>First of all, please excuse the inarticulate nature with which I'm
>about to explain my problem:
>
>I'm tracking team performances from year to year in the NBA and NHL.
>
>The methods I'm using are simple but have some problems. The first is
>winning percentage minus the mean (.500) divided by the standard
>deviation for winning percentages that year. The second is point (or
>goal) differential divided by the standard deviation for point
>differential that year. Rob Neyer and Eddie Epstein did this for their
>"Baseball Dynasties" book, although they made the mistake of doing
>separate deviations for runs scored and runs allowed and then adding
>them together.
>

Point differential is directly related to winning percentages, a la the
Pythagorean method. Since the Pyth is scaled to the overall points scored
environment, you'd probably be better off using that instead of pts diff.
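
A rough sketch of why the Pythagorean method adjusts for the scoring environment. The exponent of 14 is an assumption (values in that neighborhood are often quoted for basketball, and 2 is the classic baseball value), and the scoring averages are invented for illustration.

```python
def pythagorean_win_pct(points_for, points_against, exponent=14.0):
    """Expected winning percentage from points scored and allowed.

    The default exponent is an assumed value, not a figure from this thread.
    """
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


# The same +5 point differential in two hypothetical scoring environments.
low_scoring = pythagorean_win_pct(95.0, 90.0)
high_scoring = pythagorean_win_pct(115.0, 110.0)

# The +5 margin is worth more when fewer points are scored overall.
print(round(low_scoring, 3), round(high_scoring, 3))
```

Because the formula works on the ratio of points scored to points allowed, a fixed differential counts for less in a high-scoring league, which raw point differential ignores.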

>I know that the maximum standard deviations for any team in any given
>year is square root (n-1) where n = the number of teams in the league
>that year, and I've factored that in.
>

I'm not sure what you mean here. Standard deviations are a measure of
variation, which is a property of groups (like a league), not individual
teams. You can have a maximum possible SD of winning percentages for a
league (~ .500 I think for a 29 team league), but there is no maximum
possible SD for a team.

>But there's another factor, and I don't know how to resolve it.
>There's also a maximum SD possible for a team depending on how high or
>low the league standard deviation is. Um, right?
>

I think I see what you mean now. Are you using "standard deviation" to mean
the result of your equation (win% - .500)/SD? If so, the maximum possible
result in an 82-game schedule is a little more than 5. (That number comes
from a 29 team league in which one team has 82 wins, 15 teams have 40 wins,
and 13 teams have 39 wins. That is, a league with a low SD and one team with
an extreme win%.)
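
The league described above can be checked directly. This is a sketch, not part of the original reply: it works in wins rather than winning percentage (the ratio is the same either way) and compares the result with the sqrt(n-1) ceiling mentioned in the question, which applies when the population SD is used. With the sample SD the ceiling is (n-1)/sqrt(n).

```python
from math import sqrt
from statistics import pstdev, stdev

# 29 teams, 82 games each: one 82-0 team, 15 teams at 40 wins, 13 at 39.
wins = [82] + [40] * 15 + [39] * 13
n = len(wins)

# Sanity check: every game produces exactly one win.
assert sum(wins) == n * 82 // 2

mean_wins = sum(wins) / n  # 41.0

z_pop = (82 - mean_wins) / pstdev(wins)    # about 5.28 (population SD)
z_sample = (82 - mean_wins) / stdev(wins)  # about 5.19 (sample SD)

bound_pop = sqrt(n - 1)           # about 5.29
bound_sample = (n - 1) / sqrt(n)  # about 5.20

print(round(z_pop, 2), round(bound_pop, 2))
print(round(z_sample, 2), round(bound_sample, 2))
```

The schedule keeps the league just short of the theoretical ceiling, because the other 28 teams cannot all finish with identical records.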

>For winning percentage, I think it's one divided by the league SD. Is
>that right?
>
>But for linear numbers like point and goal differential, I have no
>clue where to start. In 1976, the standard deviation for point
>differential in the league was historically low, making the Golden
>State Warriors of that year look like one of the best teams of all
>time. While the Warriors were really good, they have no control over
>how even the rest of the league is, so I'm trying to account for that.
>Is there a way to account for that? Preferably in the form of a
>formula I can toss into Excel?
>
>Hope the question is clear.
>
>Moné
>

Perhaps you can tell us why you need to know the maximum SD?

ed
Message 3 of 5 , Dec 3, 2003
-----Original Message-----
From: igor eduardo küpfer [mailto:igorkupfer@...]
Sent: Friday, November 28, 2003 9:13 PM

----- Original Message -----
From: "monepeterson" <mone@...>
To: <APBR_analysis@yahoogroups.com>
Sent: Thursday, November 27, 2003 3:38 PM

>>I know that the maximum standard deviations for any team in any given
>>year is square root (n-1) where n = the number of teams in the league
>>that year, and I've factored that in.
>>
>
>I'm not sure what you mean here. Standard deviations are a measure of
>variation, which is a property of groups (like a league), not individual
>teams. You can have a maximum possible SD of winning percentages for a
>league (~ .500 I think for a 29 team league), but there is no maximum
>possible SD for a team.

Ed's answer is right on.

>>But there's another factor, and I don't know how to resolve it.
>>There's also a maximum SD possible for a team depending on how high or
>>low the league standard deviation is. Um, right?
>>
>
>I think I see what you mean now. Are you using "standard deviation" to mean
>the result of your equation (win% - .500)/SD? If so, the maximum possible

I think again Ed is probably correctly interpreting what Moné is trying to
do. The (win%-.500)/SD formula is known as a "standardized score" or "z-score".
It provides a way to statistically mix apples and oranges by putting them all
on a single, standardized scale, namely the number of standard deviations above
or below the mean. It would indeed be a pretty good way of looking at teams'
performances over the years. Looking at raw win percentages would be pretty
good as well, but some people might be concerned that a .600 record may be
more meaningful in a competitive, low standard deviation league (the NBA in
1977, e.g., where no team won more than 53 games or fewer than 22) than in a
league with several teams around 60 wins.

>result in an 82-game schedule is a little more than 5. (That number comes
>from a 29 team league in which one team has 82 wins, 15 teams have 40 wins,
>and 13 teams have 39 wins. That is, a league with a low SD and one team with
>an extreme win%.)

Again a very good answer; that's the way to get the maximum number of standard
deviations above the mean. BTW, the Nobel prize-winning economist Paul
Samuelson once published an article with a title something like "How Many
Standard Deviations Above the Mean Can You Be?" which worked through that
and other similar calculations. The rumor was that Samuelson was inspired
to write the article because he was fond of dismissing inferior thinkers by
proclaiming that they had IQs which were "a million standard deviations
below the mean". Eventually he wondered if, even with x billion human
beings, it was mathematically possible to be one million standard deviations
below the mean.
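
For what it's worth, the question can be settled with the same sqrt(n-1) ceiling from earlier in the thread. A quick sketch, assuming the population SD and a world population of roughly 6.3 billion (a round 2003-era figure used only for illustration):

```python
from math import sqrt


def max_sds_from_mean(n):
    """Largest possible distance from the mean, in population SDs, among n values."""
    return sqrt(n - 1)


def min_population_for(z):
    """Smallest n for which sqrt(n - 1) >= z, for a whole number z."""
    return z * z + 1


world_population = 6_300_000_000  # assumed round figure, not from the thread

limit = max_sds_from_mean(world_population)  # roughly 79,000 SDs
needed = min_population_for(1_000_000)       # 10**12 + 1 people

print(round(limit), needed)
```

So nobody can be more than about 79,000 standard deviations below the mean with that many people; a million would take a population of over a trillion.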

--MKT

P.S. On a different note, I got back from vacation to find two delightful
deliveries at my door: JohnH's _Basketball Prospectus_ and DeanO's
_Basketball on Paper_. Due to returning late Monday night, I absolutely
did not and still do not have time to really look at them, but DeanO's
chapter on Cummings, Kemp, and Sikma was one I couldn't resist reading
(I think it's fair to say the chapter was very much inspired by our
APBR discussions of these players, especially Cummings). I also wanted
to see how he used the defensive scoresheets from his WNBA Defensive
Scoresheet project; there's been some interesting work by Michael
Humphreys which he's partially revealed at baseballprimer.com on
defense by baseball players and it could be an interesting contest to
see which sport is able to make more progress in shedding light into
one of their biggest black holes: evaluating defense by individual
players. Baseball of course does have zone ratings and other related
detailed observational measures ... but DeanO has defensive
scoresheets, at least for one WNBA season anyway.
Message 4 of 5 , Dec 3, 2003
--- In APBR_analysis@yahoogroups.com, "Michael Tamada" <tamada@o...> wrote:
> ...DeanO's
> chapter on Cummings, Kemp, and Sikma was one I couldn't resist reading
> (I think it's fair to say the chapter was very much inspired by our
> APBR discussions of these players, especially Cummings)...

Well, I hope he dedicated the chapter to moi !

Sikma hasn't been hammered (as a subject) like the other 2. Maybe
he's due.

Without the book, I can only guess DeanO doesn't like Kemp's
turnovers nor Cummings' so-so shooting %. Hey, Kemp got plenty of
MVP votes in spite of his middling minutes.

My own efforts have been to give credit where it's due, and
popularity be damned. It's more gratifying to recognize and
acknowledge, than to join a chorus of yeas or nays.

It's worth noting that in a couple of years, DeanO may have come to
recognize that backward-looking analysis is actually fun and
interesting to many people. It doesn't predict anything useful,
perhaps. Maybe it helps sell books.
Message 5 of 5 , Dec 3, 2003
--- In APBR_analysis@yahoogroups.com, "Mike G" <msg_53@h...> wrote:
> --- In APBR_analysis@yahoogroups.com, "Michael Tamada" <tamada@o...>
> wrote:
> > ...DeanO's
> > chapter on Cummings, Kemp, and Sikma was one I couldn't resist reading
> > (I think it's fair to say the chapter was very much inspired by our
> > APBR discussions of these players, especially Cummings)...
>
> Well, I hope he dedicated the chapter to moi !

You are certainly thanked for putting out lists that people hammer on.

>
> Sikma hasn't been hammered (as a subject) like the other 2. Maybe
> he's due.
>
> Without the book, I can only guess DeanO doesn't like Kemp's
> turnovers nor Cummings' so-so shooting %. Hey, Kemp got plenty of
> MVP votes in spite of his middling minutes.

You should probably read it. It's a lot easier to see where my take
comes from with the full context. At least add it to your Christmas
list...

>
> My own efforts have been to give credit where it's due, and
> popularity be damned. It's more gratifying to recognize and
> acknowledge, than to join a chorus of yeas or nays.
>
> It's worth noting that in a couple of years, DeanO may have come to
> recognize that backward-looking analysis is actually fun and
> interesting to many people. It doesn't predict anything useful,
> perhaps. Maybe it helps sell books.

In the book, you'll see why I looked at these players. Specifically,
I looked at different classes of players to understand how large a
contribution they could make to teams. I _know_ that Kalb's book on
the 50 Greatest is going to sell a helluva lot more than mine because
his focus is on who is better than who (pub debate material), whereas
mine is how to build a better team (front office debate material).
When I look back, I do so to understand how best to go forward, not
just to discuss who's better. In doing that, identifying the types of
players that add wins and losses in various amounts is quite useful.
But rank 'em? I provide wins and losses (which I think is better than
Bill James' win shares); you can rank them how you like.

DeanO