Re: [APBR_analysis] Efficiency vs productivity
- Interesting stuff, although I don't completely understand the methodology.
On Fri, 29 Mar 2002, Dean Oliver wrote:
> One of the things I've done fairly recently is to look at how efficient
> players are as a function of how frequently they try to score. Since I do
> have estimates of how efficient players are that I trust, I have always
> felt that some of these efficient guys can increase their scoring rate,
> while others can't. I've taken a stab at looking at this by looking at
> boxscores and the "possession rate" of players. Basically, my hypothesis
> was that players are less efficient -- have lower offensive ratings -- as
> they have higher possession rates (possessions/minute). I've looked for
A good hypothesis, but only if qualified in certain ways, such as
"ceteris paribus": holding all other things equal, if we ask one player
to do more things per minute, his efficiency is likely to decrease.
Although the graphs have efficiency on the horizontal axis, which seems
hard to justify...we can try to say that high efficiency nights put a cap
on a player's possessions per minute, but in reality there is no such cap.
We could shove the ball in his hands a few more times. That would
probably lead to more missed shots or turnovers, and lower efficiency, but
that simply shows that efficiency is not really the independent variable
here; it seems to me it should be the dependent variable.
I don't know if the hypothesis works as well if we look at all of a
player's games. There could be some causality that goes in the reverse
direction and leads to a reverse relationship: on nights when a player
is doing well (and I don't think we have to invoke potentially mythical
hot streaks to justify the existence of such games, it could simply be a
mismatch that night), he is probably more likely to get fed the ball by
his teammates and will have more possessions. So on those hot nights
compared to regular nights, the relationship would be high effic --> high
possessions, instead of high possessions --> low effic.
> this a few ways. First, I looked to see whether there were "critical
> points" of possession rate, where players are statistically significantly
> better at lower possession rates than they are at higher possession
> rates. Second, I looked at moving averages (of every 10th percentile) to
> see whether there were general trends. Finally, I summarized the results
This is the part I'm having trouble following. From the graphs, it looks
like you're doing calculations, or at least plots, at offensive ratings
values of 120, 115, 110, 105, etc. Are those the percentiles that you
speak of? What are you calculating a moving average of -- a moving
average of possessions per minute? And calculated over what set of
observations -- all of the preceding ones?
Are the critical points part of the graphs you drew up, or is that a
> in a plot of possession rate vs. observed offensive rating. Those seem to
The plots and their labels are not quite clear. Are you looking at all
the games in which a player achieved, say efficiency of 120 or better, and
calculating the possessions per minute of those games, compared to the
games of <= 119 efficiency, and if the difference is significant plotting
the max possessions per minute among those games? I would expect an
occasional non-downward-sloping line segment, just from random variations,
even though the overall hypothesis of a downward slope could be (and
evidently is) correct.
Also, the interpretation above doesn't explain why most of the graphs dip
down into 0 possessions per minute, with efficiencies that are positive
(and large) rather than undefined. I would think that 0 possessions per
minute would require extrapolation, since it would almost never be
observed in any actual game, except for the occasional "trillion" game
where a player plays 1 minute and has a box score line that goes 1 0-0 0-0
0-0 0 0 0 etc.
So I'm not understanding what's behind these graphs.
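For concreteness, here is a minimal sketch of the reading I guessed at earlier: split a player's games at an offensive-rating threshold, compare possessions per minute in the two groups, and report the maximum possession rate among the high-efficiency games. This is only my guess at the procedure, not necessarily what was actually done. The function name, the use of Welch's t statistic as the significance check, and the toy numbers are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def compare_at_threshold(games, threshold):
    """Split games at an offensive-rating threshold and compare possession rates.

    games: list of (offensive_rating, possessions_per_minute) pairs, one per game.
    Returns None when either group has fewer than two games.
    """
    high = [poss for rating, poss in games if rating >= threshold]
    low = [poss for rating, poss in games if rating < threshold]
    if len(high) < 2 or len(low) < 2:
        return None
    # Welch's t statistic: difference in group means over its standard error.
    se = sqrt(stdev(high) ** 2 / len(high) + stdev(low) ** 2 / len(low))
    welch_t = (mean(high) - mean(low)) / se if se > 0 else None
    return {
        "mean_high": mean(high),
        "mean_low": mean(low),
        "welch_t": welch_t,
        "max_high": max(high),  # the value I suspect is being plotted
    }


# Made-up games for illustration only, not real box scores.
toy_games = [(125, 0.30), (122, 0.34), (118, 0.42),
             (110, 0.45), (104, 0.50), (98, 0.55)]
result = compare_at_threshold(toy_games, 120)
```

A negative `welch_t` here would mean the high-efficiency games had fewer possessions per minute on average, which is the direction the hypothesis predicts.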
The notion of looking at maxima brings up a possible statistical technique
to use: econometricians often estimate what is called a frontier
regression line, one which basically looks at the outer hull of a set of
data points -- possibly required to be convex, and usually with a
couple of random error terms assumed to exist, so that a player can
occasionally perform at better than 100% of his long run potential, due to
the occasional random good night. I.e. it's not literally the outer hull,
just outside "most" of the points.
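As a rough illustration of the "outer hull" idea only, the sketch below computes the literal upper convex hull of per-game points. It is not a frontier regression: it has no error terms, so a single lucky night sits right on the boundary. Putting possession rate on the horizontal axis and offensive rating on the vertical axis is my own choice here, and the function names and toy points are hypothetical.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b; positive means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def upper_hull(points):
    """Upper convex hull of (possessions_per_minute, offensive_rating) points.

    Returns the hull vertices ordered from lowest to highest possession rate.
    """
    # Keep only the best rating observed at each possession rate.
    best = {}
    for x, y in points:
        best[x] = max(best.get(x, y), y)
    hull = []
    for p in sorted(best.items()):
        # Drop the last vertex while it lies on or below the line to p.
        while len(hull) >= 2 and cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


# Made-up per-game points for illustration only.
toy_points = [(0.2, 125), (0.3, 118), (0.3, 105),
              (0.4, 119), (0.5, 100), (0.6, 104)]
frontier = upper_hull(toy_points)
```

A real frontier estimate would sit somewhat inside this hull, since it treats the very best nights as partly random.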
> be doing fairly well in matching up with observed season possession rates
> and observed offensive ratings. So I'm hoping to predict players who can
Aggregate statistics such as season stats don't always show the same
relationship as the underlying individual game stats. E.g. John Stockton
and Rashard Lewis probably had different, and lower, graphs during their
rookie seasons than they did later in their careers (Stockton shot a
career low 47%, Lewis scored fewer than 16 points per 48 minutes). Once a
player's career reaches maturity, then one could probably compare
different seasons -- although one could imagine changes in the quality of
the player's team and role could affect what his season stats were like.
> increase their productivity without significant loss of efficiency.
> A few plots are attached that show the relationships I'm seeing (based on
> last 2 years of data). Some players haven't shown any ability to maintain
> high possession rates. Some are about the same efficiency no matter what
> rate. Some show steep drop-offs, others more gradual. The plots just take
> a little bit of looking at. I've selected the players somewhat at
> random. Rashard Lewis, Ben Wallace, Jerry Stackhouse, Allen Iverson, Aaron
> McKie, Jason Terry, Antawn Jamison, Reggie Miller, Tim Duncan, David
> Robinson, Vince Carter, Kobe, Shaq, Derek Fisher, Rick Fox, Robert Horry,
> Michael Finley, Steve Nash, Dirk Nowitzki. That's enough for now.
I would've liked to see a few more non-superstar non-scorer types in
the mix...e.g. Ben Wallace, whose graph wasn't in the set of attachments
that I received. Also some high-efficiency types such as Todd
MacCulloch -- how fast would his efficiency have plunged if he were given more
possessions? And for comparison, some players who are just plain bad.
Putting several of the plotted lines in a single graph would make it
easier to compare players.
It was surprising that Kobe's max possessions didn't max out at
particularly high numbers per minute. I would've thought that Shaq's been
absent enough for there to be a good number of Shaq-less and presumably
high possession nights for Kobe. Or ... was Kobe enjoying high efficiency
on those Shaq-less nights, which would cause them to not show up on the
plots (no statistically significant drop off in possessions per minute)?
If so, those games might not show up on the graphs but Kobe's high
possessions and high efficiency on those nights would be important
> I'm not sure how to summarize all of this in quick numbers, but I'm pretty
> pleased with where it's going and what it's saying.
It's comforting that the graphs show such strong consistent downward
slopes but I'd want to know more about the methodology before drawing