Feb 27, 2004

Thanks Dan, very interesting, and confirms a lot of my own suspicions
about WINVAL.

On the 3-point attempts being less costly, I think one thing you have
to consider is free throws. In other words (as I casually make an
assumption which may not be true), if you're not treating a two-point
attempt where the shooter is fouled as a two-point attempt, then it's
bound to lower the expected value of a two-point shot. Since hardly
anybody gets fouled on a 3, it makes the 3 look better by comparison.
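
To put rough numbers on that, here is a small Python sketch comparing points per two-point try when fouled misses are left out of the attempt count versus when they are counted along with their free throws. Every figure in it is made up for illustration (none comes from Dan's data), and and-ones are ignored to keep it simple.

```python
def points_per_two_point_try(made, missed, fouled_misses=0, ft_pct=0.0):
    """Average points per two-point try.

    fouled_misses: missed shots that drew a shooting foul (two free throws each).
    Leave it at 0 to mimic box-score attempts, which exclude those tries.
    """
    tries = made + missed + fouled_misses
    points = 2 * made + fouled_misses * 2 * ft_pct
    return points / tries


# Hypothetical season line: 450 makes, 550 unfouled misses,
# 100 fouled misses, 75% free throw shooting.
box_score_value = points_per_two_point_try(450, 550)
with_fouls_value = points_per_two_point_try(450, 550, fouled_misses=100, ft_pct=0.75)

print(f"fouled tries excluded: {box_score_value:.3f} points per try")
print(f"fouled tries included: {with_fouls_value:.3f} points per try")
```

With these made-up inputs the two-point value rises from 0.900 to about 0.955 once the fouled tries are counted, which is the direction of the effect I am guessing at.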

--- In APBR_analysis@yahoogroups.com, "dan_t_rosenbaum"
<rosenbaum@u...> wrote:
> There are several issues to consider about these adjusted plus/minus
> statistics, but one I would hope not to spend forever on is whether
> or not I am exactly replicating what Winston and Sagarin are doing.
> I am sure that I am not, but other than the fact that they have more
> years of data at their disposal (which in this case is an important
> difference), I highly doubt what they are doing is using the data in
> a significantly more efficient manner than I am. What they are
> doing is surely different (and the results are probably quite a bit
> different due to the noisiness of this methodology), but the general
> theme of my work has to be very close to theirs.

>
> So let me talk about a few issues related to these adjusted
> plus/minus statistics.
>
> 1. The single most important feature of these results is how noisy
> the estimates are. Relative to my Tendex-like index, the standard
> errors for these adjusted plus/minus statistics are 3.5 to 5.5 times
> larger. What that means is that the precision in these adjusted
> plus/minus statistics in a whole season is about equivalent to what
> you get with a Tendex-like index in three to seven games. That is
> why DeanO says in his book that they don't pass the laugh test;
> these estimates are really, really noisy.
>
> I should add, however, that another season or two worth of data will
> help more than adding an equivalent number of games for a Tendex-like
> index, because a new season will bring lots of new player
> combinations, which will help break up the very strong
> multicollinearity that sharply reduces the variation that can
> identify the value of these hundreds of players. There are issues
> in making use of more than one year of this type of data, but I
> suspect it will help things a lot. But at the end of the day, I
> suspect it will still be a lot more noisy than something like a
> Tendex index.
>
> 2. So is the conclusion that something like WINVAL is completely
> useless? No. The great advantage of this approach is that IMO it
> is the least biased (in the strict statistical sense) methodology of
> gauging player value of any methodology that I have seen proposed.
> Unlike other methods that we know leave out important features of
> player value, such as defense and a lot of non-assist passing, this
> method in theory captures much closer to everything that is relevant.
> (There are still things it misses, but IMO those things are second
> order relative to the things other methods miss.)
>
> That said, being the least biased is only part of the equation. The
> other part is how precisely can we estimate player value with this
> methodology? What is the variance of the estimates? Well, the
> upshot is that this methodology is tremendously noisy relative to
> other methods, which makes it very hard to use.
>
> For example, using 2002-03 data there are only about 50 players for
> whom, using this method, we can say with the usual level of certainty
> used in statistics (a 5% rejection region) that they are
> significantly better than the average replacement player (players
> who played less than 513 minutes). On the other hand, using my
> Tendex-like index we can say that about nearly 200 players.
>
> 3. So how can these adjusted plus/minus statistics be used, given
> how noisy they are? With more data they might become precise enough
> to be used on their own, but I think the best way to use them is how
> I used them in the regressions at the very end of my link.

>
> http://www.uncg.edu/bae/people/rosenbaum/NBA/wv1.lst~
>
> Here I regressed these adjusted plus/minus statistics onto per 40
> minute statistics. The coefficients on these per 40 minute
> statistics are estimates of the weights that we should use in
> indexes that weight various statistics. Now I am not arguing that
> the linear weights should be pulled straight from these regressions,
> but I think we can learn some things from these regressions.
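
For anyone who wants to see the mechanics, here is a minimal Python sketch of that kind of regression: ordinary least squares of an adjusted plus/minus rating on per 40 minute statistics. The data are synthetic, and the stat columns, weights, and noise level are assumptions made up for illustration. This is not Dan's specification or his data.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def ols(rows, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic players: (points, rebounds, steals) per 40 minutes. All hypothetical.
random.seed(0)
true_weights = [-8.0, 0.5, 0.2, 2.0]  # intercept, points, rebounds, steals
players = [
    (random.uniform(5, 25), random.uniform(2, 12), random.uniform(0.3, 2.5))
    for _ in range(200)
]
# "Adjusted plus/minus" = weighted stats plus noise standing in for estimation error.
ratings = [
    true_weights[0]
    + sum(w * s for w, s in zip(true_weights[1:], stats))
    + random.gauss(0, 3.0)
    for stats in players
]

for name, coef in zip(["intercept", "points", "rebounds", "steals"], ols(players, ratings)):
    print(f"{name:>9}: {coef:6.2f}")
```

The fitted coefficients play the role of the linear weights discussed above. With noisy ratings the recovered weights are noisy too, so they are better read as rough guides than as final values.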

>
> a. The first lesson seems to be that specifications with attempts
> rather than misses seem to fit better. The second specification
> which uses two point field goal attempts, three point attempts, and
> free throw attempts has a higher R-squared than the first
> specification, even though it uses one less explanatory variable.
> (Perhaps David Berri was right about that.)
>
> b. It appears that rebounds have much less value than usually
> assumed (especially relative to what Berri assumed) and that steals
> and blocks are much more valuable. Perhaps these defensive
> statistics are highly correlated with other unmeasured defensive
> qualities.
>
> Also, even after accounting for points scored, it appears that three
> point misses are far less costly than two point misses. Perhaps
> having three point shooters on the floor spreads the floor,
> resulting in fewer turnovers and higher field goal percentages for
> other players, i.e. things that are not picked up in the three point
> players' own statistics.

>
> Note that I tested whether the cost of a two pointer was the same as
> that of a three pointer and the p-value for the test was 0.0024,
> suggesting that three point attempts for whatever reason are less
> costly than two point attempts, even after accounting for the points
> scored on those attempts.
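
A test like this boils down to asking whether the difference between two regression coefficients is large relative to its standard error. The Python sketch below is one common way to do that, not necessarily the test Dan ran. It takes two coefficient estimates with their variances and covariance and uses a normal approximation for the p-value. The inputs shown are hypothetical, not values from the regressions above.

```python
import math


def equal_coef_test(b1, b2, var1, var2, cov12):
    """z statistic and two-sided normal-approximation p-value for H0: b1 == b2."""
    se_diff = math.sqrt(var1 + var2 - 2.0 * cov12)  # std. error of (b1 - b2)
    z = (b1 - b2) / se_diff
    p = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical estimates: cost per two point attempt and per three point attempt.
z_stat, p_value = equal_coef_test(b1=-0.9, b2=-0.5, var1=0.01, var2=0.01, cov12=0.0)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
```

A regression package would normally report the matching t or F test; with a few hundred players the normal approximation used here should be close to it.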

>
> Well, that is probably enough for now, since I suspect other things
> will come up later. I hope this was interesting.
>
> Best wishes,
> Dan