## 2777 RE: [APBR_analysis] Maximum standard deviations (math help needed)

• Dec 3, 2003
-----Original Message-----
From: igor eduardo küpfer [mailto:igorkupfer@...]
Sent: Friday, November 28, 2003 9:13 PM

----- Original Message -----
From: "monepeterson" <mone@...>
To: <APBR_analysis@yahoogroups.com>
Sent: Thursday, November 27, 2003 3:38 PM

>>I know that the maximum standard deviations for any team in any given
>>year is square root (n-1) where n = the number of teams in the league
>>that year, and I've factored that in.
>>
>
>I'm not sure what you mean here. Standard deviations are a measure of
>variation, which is a property of groups (like a league), not individual
>teams. You can have a maximum possible SD of winning percentages for a
>league (~ .500 I think for a 29 team league), but there is no maximum
>possible SD for a team.

>>But there's another factor, and I don't know how to resolve it.
>>There's also a maximum SD possible for a team depending on how high or
>>low the league standard deviation is. Um, right?
>>
>
>I think I see what you mean now. Are you using "standard deviation" to mean
>the result of your equation (win% - .500)/SD ? If so, the maximum possible

I think Ed is again probably interpreting correctly what Monet is trying to
do. The (win%-.500)/SD formula is known as a "standardized score" or "z-score".
It provides a way to statistically mix apples and oranges by putting them all
on a single, standardized scale, namely the number of standard deviations above
or below the mean. It would indeed be a pretty good way of looking at teams'
performances over the years. Plain win percentages would be pretty good as
well, but some people might be concerned that a .600 record may be more
meaningful in a competitive, low standard deviation league (the NBA in 1977,
e.g., where no team won more than 53 games or fewer than 22) than in a league
with several teams around 60 wins.
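
To make the z-score idea concrete, here is a minimal Python sketch. The two
four-team leagues are made-up numbers for illustration only, not real NBA
seasons, and `z_scores` is a hypothetical helper name. It uses the league mean
in place of .500 (the two agree when every team plays the same number of
games) and the population SD, on the assumption that the league is the whole
population of interest.

```python
from statistics import mean, pstdev


def z_scores(win_pcts):
    """Return (win% - league mean) / league SD for every team."""
    mu = mean(win_pcts)      # .500 when all teams play the same schedule length
    sd = pstdev(win_pcts)    # population SD (divides by n), an assumption here
    return [(p - mu) / sd for p in win_pcts]


# Hypothetical leagues, invented for illustration only.
tight_league = [0.600, 0.520, 0.480, 0.400]    # low spread
spread_league = [0.750, 0.600, 0.400, 0.250]   # high spread

z_tight = z_scores(tight_league)[0]    # the .600 team in the tight league
z_spread = z_scores(spread_league)[1]  # the .600 team in the spread-out league

print(round(z_tight, 2), round(z_spread, 2))
```

The same .600 record comes out at roughly 1.39 standard deviations above the
mean in the tight league and roughly 0.53 in the spread-out one, which is the
concern described above.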

>result in an 82-game schedule is a little more than 5. (That number comes
>from a 29 team league in which one team has 82 wins, 15 teams have 40 wins,
>and 13 teams have 39 wins. That is, a league with a low SD and one team with
>an extreme win%.)

Again a very good answer; that's the way to get the maximum number of standard
deviations above the mean. BTW, the Nobel prize-winning economist Paul
Samuelson once published an article with a title something like "How Many
Standard Deviations Above the Mean Can You Be?" which worked through that
and other similar calculations. The rumor was that Samuelson was inspired
to write the article because he was fond of dismissing inferior thinkers by
proclaiming that they had IQs which were "a million standard deviations
below the mean". Eventually he wondered if, even with x billion human
beings, it was mathematically possible to be one million standard deviations
below the mean.
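
The quoted 29-team example can be checked directly. The sketch below assumes
Python's standard `statistics` module and computes the top team's z-score both
ways, since the thread does not say whether the SD divides by n or by n - 1.
It then compares the result with the sqrt(n-1) ceiling mentioned at the start
of the thread, which holds for the population SD and is commonly known as
Samuelson's inequality. `teams_needed` is a hypothetical helper name that
simply inverts that ceiling.

```python
from math import sqrt
from statistics import mean, pstdev, stdev

# The league from the quoted message: one 82-win team,
# 15 teams with 40 wins, and 13 teams with 39 wins.
wins = [82] + [40] * 15 + [39] * 13
n = len(wins)

# Sanity check: every game has exactly one winner, so total wins = n * 82 / 2.
assert sum(wins) == n * 82 // 2

top_margin = max(wins) - mean(wins)
z_population = top_margin / pstdev(wins)  # SD dividing by n
z_sample = top_margin / stdev(wins)       # SD dividing by n - 1

# Ceiling on any single z-score when the population SD is used.
bound = sqrt(n - 1)


def teams_needed(k):
    """Smallest group size in which one member can be k population SDs from the mean."""
    return k * k + 1


print(round(z_population, 3), round(z_sample, 3), round(bound, 3))
print(teams_needed(10**6))
```

Both versions come out a little above 5 (about 5.28 and 5.19), just under the
sqrt(28) ceiling of about 5.29; the example cannot quite reach the ceiling
because the other 28 teams cannot all have identical whole-number win totals.
By the same bound, being one million standard deviations from the mean would
take a group of more than a trillion, far more than any number of billions of
human beings.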

--MKT

P.S. On a different note, I got back from vacation to find two delightful
deliveries at my door: JohnH's _Basketball Prospectus_ and DeanO's
_Basketball on Paper_. Because I returned late Monday night, I did not and
still do not have time to really look at them, but DeanO's chapter on
Cummings, Kemp, and Sikma was one I couldn't resist reading (I think it's fair
to say the chapter was very much inspired by our APBR discussions of these
players, especially Cummings). I also wanted to see how he used the defensive
scoresheets from his WNBA Defensive Scoresheet project. Michael Humphreys has
done some interesting work on defense by baseball players, which he's
partially revealed at baseballprimer.com, and it could be an interesting
contest to see which sport is able to make more progress in shedding light on
one of its biggest black holes: evaluating defense by individual players.
Baseball of course does have zone ratings and other related detailed
observational measures ... but DeanO has defensive scoresheets, at least for
one WNBA season anyway.