
## The Problem with Possessions-Based Linear Weights

Message 1 of 42 , Mar 25, 2004
Please pardon me for this argument that I am going to make. I have
not been at this as long as most of you have, so I apologize if I
step on any toes. This is a pretty long post, and hopefully folks
will find it of some value.

The linear weights approaches generally come in three forms. The
first, such as the NBA's efficiency statistic, simply counts up the
good things and subtracts the bad things. Since +1 and -1 are not
the only possible linear weights one could choose to use, there are
not many that defend such a weighting scheme.
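
A minimal sketch of this first kind of rating, assuming the commonly published form of the NBA's efficiency statistic (points, rebounds, assists, steals, and blocks each count +1; missed field goals, missed free throws, and turnovers each count -1). The function name and the sample stat line are made up for illustration.

```python
def nba_efficiency(pts, reb, ast, stl, blk, fga, fgm, fta, ftm, tov):
    """Add up the good things and subtract the bad things, all weighted 1.

    Assumes the commonly published form of the efficiency statistic;
    it is not taken from the post itself.
    """
    good = pts + reb + ast + stl + blk
    bad = (fga - fgm) + (fta - ftm) + tov  # missed FG, missed FT, turnovers
    return good - bad


# Hypothetical single-game stat line, for illustration only.
rating = nba_efficiency(pts=20, reb=10, ast=5, stl=2, blk=1,
                        fga=15, fgm=8, fta=6, ftm=4, tov=3)
print(rating)
```

Every term carries a weight of exactly +1 or -1, which is the choice the estimation approaches further down do not take for granted.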

The second approach is what I will call the possessions-based
approach. The essence of this approach is to count every
contribution to either points scored or a failed possession and to
count it only once. This is certainly the approach used to
construct John Hollinger's PER and it lies behind the construction
of Dean Oliver's offensive and defensive ratings. Also, a large
fraction of the arguments on this board are about the proper way to
do this possessions-based accounting.

So what is wrong with this approach? The problem is that there are
numerous contributions to successful or failed possessions for which
there are no statistics - a good pick, an ineffective blockout, a
good entry pass that leads to a score but not an assist, the
presence of a shot blocker that keeps his opponents from driving to
the hoop. One could easily argue that the unmeasured contributions
to successful or failed possessions are more than the measured
contributions, e.g. points, assists, steals, etc.

This is one place where basketball statistics diverge from baseball
statistics. In baseball, the unmeasured contributions to a run
scored are a much smaller proportion of the total contributions.
Things like good baserunning, the effect of a base-stealing threat
on the pitches a batter sees, the effect a strong-armed right
fielder has on a batter in a sacrifice fly situation are difficult
to measure, at best. But these things are far less important in
measuring run production than the unmeasured contributions in
basketball are in measuring point production.

This possessions-based point production approach is one of the
concepts we have borrowed from baseball and Bill James. And IMO it
does not fit as well in basketball. So many things are
unmeasured in basketball that when coming up with a linear weight to
put in front of blocks, we don't want to only account for the missed
shot it produces (and its appropriate probability of becoming a
defensive rebound). We also want that weight to reflect the lower
field goal percentage for those guys who arch the ball higher when
coming in the lane or who don't come in the lane for fear that their
shot will be blocked. On the other hand, we want it to reflect that
guys who block shots may be more susceptible to pump fakes.

Offensive rebounds without question help a team, but that is not the
only consideration when trying to figure out what linear weight to
put in front of offensive rebounds. What about if guys who get lots
of offensive rebounds tend to be non-factors on offense who clog the
lane and make it harder for their teammates to score? If that is
the case, it is possible that the proper linear weight for an
offensive rebound could even be negative.

Similar arguments could be made for negative linear weights on
steals or positive linear weights on free throw attempts or three
point attempts or personal fouls. The point here is that the proper
linear weights boil down to being an empirical question and not a
matter of logic. The question is how can we estimate the proper
linear weights?

(And this ignores the whole issue of the proper functional form, but
we will leave that for another day.)

This gets us to the third approach, using some type of estimation
approach to estimate the proper linear weights. This is the
approach that David Berri used to estimate linear weights. He ran
regressions using team-level data and then applied the weights
estimated from that data to individual players. This is another
technique borrowed from baseball, and IMO another case where an
approach that works well in baseball does not work very well in
basketball.

At the team-level, the benefit of an assisted basket is fully
subsumed in a better field goal percentage. The issue of how much
of a field goal to attribute to the assist just doesn't come up.
The benefits of having players who use a lot of possessions fairly
efficiently allowing their teammates to stick to more high
percentage field goal attempts also does not come up. Ironically,
much of what counts as team play is ignored using this team-level
approach to estimating linear weights. (And this ignores other
complications that this approach entails.) All told, these problems
are pretty severe and I would not be surprised if the results of
this approach were worse than those for even something like the
NBA's efficiency statistic.

So this gets me to the approach that I outlined in my WINVAL stuff.

http://www.uncg.edu/bae/people/rosenbaum/NBA/winval1.htm

The approach basically estimates plus/minus ratings adjusted for
home court advantage and then other players sharing the floor with
the given player. Then I regress these adjusted plus/minus ratings
on various statistics (adjusted for pace).
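
A minimal sketch of that second step, assuming ordinary least squares and using made-up data. The adjusted plus/minus ratings, the three statistics, and the "true" weights below are all synthetic and only show the mechanics; the actual WINVAL estimation may differ (for example in how observations are weighted).

```python
import numpy as np

rng = np.random.default_rng(0)

n_players = 50
stat_names = ["PTSP40", "ASP40", "STP40"]  # subset of stats, for brevity

# Synthetic pace-adjusted per-40-minute statistics, one row per player.
X = rng.uniform(0.0, 20.0, size=(n_players, len(stat_names)))

# Made-up weights used only to generate a synthetic dependent variable.
true_intercept = -10.0
true_weights = np.array([0.9, 0.8, 2.5])
adjusted_plus_minus = true_intercept + X @ true_weights

# Regress the adjusted plus/minus ratings on the statistics (with intercept).
design = np.column_stack([np.ones(n_players), X])
coef, *_ = np.linalg.lstsq(design, adjusted_plus_minus, rcond=None)
intercept_hat, weights_hat = coef[0], coef[1:]

for name, w in zip(stat_names, weights_hat):
    print(f"{name}: {w:.3f}")
```

Because the synthetic ratings contain no noise, the fit recovers the made-up weights; with real data the estimates come with standard errors like those reported in the table.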

Here is what I get. (The estimated linear weights are in the
PARAMETER ESTIMATE column.)

PTSP40 = points scored per 40 minutes
FG2AP40 = two point field goals attempted per 40 minutes
TAP40 = three pointers attempted per 40 minutes
FTAP40 = free throws attempted per 40 minutes
ASP40 = assists per 40 minutes
ORP40 = offensive rebounds per 40 minutes
DRP40 = defensive rebounds per 40 minutes
TOP40 = turnovers per 40 minutes
STP40 = steals per 40 minutes
BKP40 = blocks per 40 minutes
PFP40 = personal fouls per 40 minutes

Root MSE    183.20337     R-square   0.4407
Dep Mean      4.88730     Adj R-sq   0.4201
C.V.       3748.56273

Parameter Estimates

                  Parameter      Standard     T for H0:
Variable   DF      Estimate         Error   Parameter=0   Prob > |T|

INTERCEP    1    -11.792794    2.20556250        -5.347       0.0001
PTSP40      1      0.935041    0.25573422         3.656       0.0003
FG2AP40     1     -0.726939    0.25623989        -2.837       0.0049
TAP40       1     -0.128389    0.33902488        -0.379       0.7052
FTAP40      1      0.104796    0.32231296         0.325       0.7453
ASP40       1      0.795411    0.19939588         3.989       0.0001
ORP40       1      0.458335    0.43008193         1.066       0.2874
DRP40       1      0.570479    0.22121421         2.579       0.0104
TOP40       1     -0.717767    0.56808664        -1.263       0.2074
STP40       1      2.482385    0.57789004         4.296       0.0001
BKP40       1      2.021543    0.41889403         4.826       0.0001
PFP40       1     -0.004383    0.32491006        -0.013       0.9892
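
For readers less used to regression output: the "T for H0" column is just the parameter estimate divided by its standard error. A short check, using the numbers copied from the table (the dictionary layout is only for illustration):

```python
# (estimate, standard error, reported t) copied from the table above.
TABLE = {
    "INTERCEP": (-11.792794, 2.20556250, -5.347),
    "PTSP40": (0.935041, 0.25573422, 3.656),
    "FG2AP40": (-0.726939, 0.25623989, -2.837),
    "TAP40": (-0.128389, 0.33902488, -0.379),
    "FTAP40": (0.104796, 0.32231296, 0.325),
    "ASP40": (0.795411, 0.19939588, 3.989),
    "ORP40": (0.458335, 0.43008193, 1.066),
    "DRP40": (0.570479, 0.22121421, 2.579),
    "TOP40": (-0.717767, 0.56808664, -1.263),
    "STP40": (2.482385, 0.57789004, 4.296),
    "BKP40": (2.021543, 0.41889403, 4.826),
    "PFP40": (-0.004383, 0.32491006, -0.013),
}

# t statistic for H0: parameter = 0 is estimate / standard error.
t_stats = {name: est / se for name, (est, se, _) in TABLE.items()}

for name, t in t_stats.items():
    print(f"{name:9s} t = {t:7.3f}  (reported {TABLE[name][2]:7.3f})")
```

A weight with a small t statistic, such as the ones on TAP40, FTAP40, or PFP40, is estimated too imprecisely to say much about its sign.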

As you can tell, these estimates differ in a lot of ways from the
linear weights commonly used. Steals and blocks are much more
heavily weighted. Rebounds are weighted less. Free throw attempts
seem to have the wrong sign, and personal fouls almost do.
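
As a rough illustration of how these weights would be used, here is a sketch that applies the estimates from the table to a per-40-minute stat line. The function name and the sample player are hypothetical; only the coefficients come from the regression output.

```python
# Parameter estimates copied from the regression output above.
INTERCEPT = -11.792794
WEIGHTS = {
    "PTSP40": 0.935041,
    "FG2AP40": -0.726939,
    "TAP40": -0.128389,
    "FTAP40": 0.104796,
    "ASP40": 0.795411,
    "ORP40": 0.458335,
    "DRP40": 0.570479,
    "TOP40": -0.717767,
    "STP40": 2.482385,
    "BKP40": 2.021543,
    "PFP40": -0.004383,
}


def predicted_rating(stats):
    """Intercept plus the weighted sum of per-40-minute statistics.

    Statistics missing from `stats` are treated as zero.
    """
    return INTERCEPT + sum(w * stats.get(name, 0.0)
                           for name, w in WEIGHTS.items())


# Hypothetical pace-adjusted stat line, for illustration only.
player = {"PTSP40": 18.0, "FG2AP40": 12.0, "TAP40": 3.0, "FTAP40": 5.0,
          "ASP40": 4.0, "ORP40": 2.0, "DRP40": 6.0, "TOP40": 2.5,
          "STP40": 1.5, "BKP40": 1.0, "PFP40": 3.0}
print(round(predicted_rating(player), 2))
```

Adding one steal per 40 minutes to any stat line raises the predicted rating by about 2.48, while adding one personal foul barely moves it.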

But again the approach here is different. The question I am asking
here is not whether a guy who is at the line and misses should have
that miss count against him. (In this regression it does not.) The
question is whether, holding the other statistics constant, teams
tend to outscore their opponents when players who get to the foul
line and miss are on the floor. That is a very different question
than the one the possessions-based approach asks.

Now this approach is not the be all and end all, and I think there
are hybrids of the two approaches that may be better. For example,
the possessions-based statistics may be better to put in this
regression than what I currently have in there. And there are
things that focusing on little chunks of time misses, such as fouls
generated that help in later chunks of the game.

But the bigger point is this. Given the large fraction of
unmeasured contributions in basketball, IMO linear weights are
really an empirical question and logic can only get us so far.

And with that, the prosecution rests. :)
Message 42 of 42 , Mar 30, 2004
I think wimpds has a point here.

If a particular player with limited offensive skills tends to play
at times when his teammates shoot more poorly (because they are in
essence playing 4 on 5) than their WINVAL ratings suggest they
should, that player would end up with more offensive rebounding
opportunities than a typical player.

But let's suppose a guy is on the floor for 80 possessions in 40
minutes and he drops his team's shooting percentage down 5
percentage points (a pretty big drop). I would assume that would
result in about three to five extra offensive rebounding
opportunities for this player and if he collects 10 percent of those
extra rebounds, this amounts to 0.3 to 0.5 extra offensive rebounds
per 40 minutes. Not trivial, but also not a huge difference.
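
The arithmetic above can be sketched as follows, under the simplifying assumption (not stated in the post) that every possession ends in exactly one field goal attempt. The function name is made up for illustration.

```python
def extra_offensive_rebounds(possessions, fg_pct_drop, rebound_share):
    """Return (extra missed shots, extra offensive rebounds).

    Simplifying assumption: each possession ends in one field goal
    attempt, so a drop in shooting percentage translates directly into
    extra misses.
    """
    extra_misses = possessions * fg_pct_drop
    return extra_misses, extra_misses * rebound_share


# 80 possessions, 5 percentage point drop, player collects 10 percent.
extra_misses, extra_rebounds = extra_offensive_rebounds(80, 0.05, 0.10)
print(extra_misses, extra_rebounds)
```

Under that assumption the inputs give about four extra misses and about 0.4 extra offensive rebounds per 40 minutes, which sits inside the 0.3 to 0.5 range quoted above.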

--- In APBR_analysis@yahoogroups.com, "wimpds" <wimpds@y...> wrote:
> But if teammates are more likely to miss shots when they play with you
> than when they play with other players, that will depress your WinVal
> rating, right? If this attribute is correlated with individual
> offensive rebounding, won't it have a negative effect on the value of
> the coefficient?
>
> On the defensive side, we only have a few variables to contribute
> significantly to your WinVal rating. [I'd be interested in seeing
> regressions run separately on the offensive and defensive WinVal
> ratings.] Defensive Rebounds coefficient might be picking up
> something more than simply rebounding ability, it could be getting at
> defense somewhat more generally.
>
>
>
> --- In APBR_analysis@yahoogroups.com, "Michael Tamada" <tamada@o...>
> wrote:
> > It's a good thought, although the regression should already
> > account for this by already measuring the impact of
> > opponents missing shots and one's team missing shots.
> >
> >
> > --MKT
> >
> >
> > -----Original Message-----
> > From: wimpds [mailto:wimpds@y...]
> > Sent: Monday, March 29, 2004 3:13 PM
> > To: APBR_analysis@yahoogroups.com
> > Subject: [APBR_analysis] Re: The Problem with Possessions-Based
> > Linear Weights
> >
> >
> > Another thought on this subject (though I may not be thinking through
> > the regression carefully enough) is that offensive rebounds are
> > probably correlated with having teammates missing shots while
> > defensive rebounds are correlated with the opponent missing shots.
> > This effect would seem to be even stronger if your teammates are more
> > likely to miss because you're not attracting any defensive attention.
> >
> >
> >
> >
> >
> > --- In APBR_analysis@yahoogroups.com, "carlos12155"
> > <carlosmanuel@b...> wrote:
> > > --- In APBR_analysis@yahoogroups.com, "Mike G" <msg_53@h...> wrote:
> > > > --- In APBR_analysis@yahoogroups.com, "Kevin Pelton"
> > > > <kpelton08@h...> wrote:
> > > > >
> > > > > Rodman specialized in rebounding -- not offensive rebounding. He
> > > > > was great at both ends. I don't know if I can think of someone who
> > > > > specializes specifically in offensive rebounding.
> > > >
> > > > The original poster meant, I'm sure, that on Offense, Rodman did
> > > > little more than rebound missed shots, at least by the time in his
> > > > career when he was with the Bulls.
> > > >
> > > > Obviously, he went to the other end of the court on defense, and did
> > > > other things there.
> > > >
> > >
> > > That was exactly my intention. It seemed to me that if offensive
> > > rebounds are at some level a negative, Rodman's performance would be
> > > the perfect example; after all that was the only thing he did on
> > > offense.