The Problem with Possessions-Based Linear Weights

  • dan_t_rosenbaum
    Message 1 of 42 , Mar 25, 2004
      Please pardon me for this argument that I am going to make. I have
      not been at this as long as most of you have, so I apologize if I
      step on any toes. This is a pretty long post, and hopefully folks
      will find it of some value.

      The linear weights approaches generally come in three forms. The
      first, such as the NBA's efficiency statistic, simply counts up the
      good things and subtracts the bad things. Since +1 and -1 are not
      the only possible linear weights one could choose, not many people
      defend such a weighting scheme.
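To make the first approach concrete, here is a minimal sketch of a +1/-1 rating in the style of the NBA's efficiency statistic. The formula is the commonly published one as I understand it (points, rebounds, assists, steals and blocks count +1 each; missed shots, missed free throws and turnovers count -1 each). The function name and the sample stat line are made up for illustration.

```python
def nba_efficiency(pts, reb, ast, stl, blk, fga, fgm, fta, ftm, tov):
    """Add up the good things at +1 each and subtract the bad things at -1 each."""
    missed_fg = fga - fgm   # missed field goals
    missed_ft = fta - ftm   # missed free throws
    return pts + reb + ast + stl + blk - missed_fg - missed_ft - tov


# Hypothetical single-game stat line (not a real player).
line = dict(pts=20, reb=8, ast=5, stl=1, blk=1,
            fga=16, fgm=8, fta=5, ftm=4, tov=3)

print(nba_efficiency(**line))  # 35 good things minus 12 bad things -> 23
```

Every event carries the same weight here, which is exactly the choice that is hard to defend.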

      The second approach is what I will call the possessions-based
      approach. The essence of this approach is to count every
      contribution to either points scored or a failed possession and to
      count it only once. This is certainly the approach used to
      construct John Hollinger's PER, and it lies behind the construction
      of Dean Oliver's offensive and defensive ratings. Also, a large
      fraction of the arguments on this board are about the proper way to
      do this possessions-based accounting.

      So what is wrong with this approach? The problem is that there are
      numerous contributions to successful or failed possessions for which
      there are no statistics - a good pick, an ineffective blockout, a
      good entry pass that leads to a score but not an assist, the
      presence of a shot blocker that keeps his opponents from driving to
      the hoop. One could easily argue that the unmeasured contributions
      to successful or failed possessions are more than the measured
      contributions, e.g. points, assists, steals, etc.

      This is one place where basketball statistics diverge from baseball
      statistics. In baseball, the unmeasured contributions to a run
      scored are a much smaller proportion of the total contributions.
      Things like good baserunning, the effect of a base-stealing threat
      on the pitches a batter sees, the effect a strong-armed right
      fielder has on a batter in a sacrifice fly situation are difficult
      to measure, at best. But these things are far less important in
      measuring run production than the unmeasured contributions in
      basketball are in measuring point production.

      This possessions-based point production approach is one of the
      concepts we have borrowed from baseball and Bill James. And IMO
      it does not fit as well in basketball. So many things are
      unmeasured in basketball that when coming up with a linear weight to
      put in front of blocks, we don't want to only account for the missed
      shot it produces (and its appropriate probability of becoming a
      defensive rebound). We also want that weight to reflect the lower
      field goal percentage for those guys who arch the ball higher when
      coming in the lane or who don't come in the lane for fear that their
      shot will be blocked. On the other hand, we want it to reflect that
      guys who block shots may be more susceptible to pump fakes.

      Offensive rebounds without question help a team, but that is not the
      only consideration when trying to figure out what linear weight to
      put in front of offensive rebounds. What if guys who get lots
      of offensive rebounds tend to be non-factors on offense who clog the
      lane and make it harder for their teammates to score? If that is
      the case, it is possible that the proper linear weight for an
      offensive rebound could even be negative.
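Here is a rough simulation of that argument. Every number in it is invented: each offensive rebound is assumed to be worth +0.5 points directly, but an unmeasured "lane clogging" trait both raises a player's offensive rebounds and hurts his team. Regressing the team result on offensive rebounds alone then picks up both effects.

```python
import random


def ols_slope(x, y):
    """Slope from a one-variable least-squares fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    return cov_xy / var_x


rng = random.Random(0)   # fixed seed so the run is repeatable

TRUE_ORB_VALUE = 0.5     # assumed direct benefit of one offensive rebound
CLOG_COST = -1.5         # assumed cost of the unmeasured lane-clogging trait

orb, plus_minus = [], []
for _ in range(5000):    # 5000 synthetic players
    clog = rng.gauss(0, 1)                     # never shows up in a box score
    o = 2.0 + 0.8 * clog + rng.gauss(0, 0.5)   # cloggers grab more rebounds
    pm = TRUE_ORB_VALUE * o + CLOG_COST * clog + rng.gauss(0, 1)
    orb.append(o)
    plus_minus.append(pm)

estimated_weight = ols_slope(orb, plus_minus)
print(f"direct value per rebound: {TRUE_ORB_VALUE:+.2f}")
print(f"estimated linear weight:  {estimated_weight:+.2f}")
```

Under these made-up assumptions the estimated weight comes out negative even though each rebound helps by itself. Nothing here says real players behave this way; it only shows that the sign is an empirical matter.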

      Similar arguments could be made for negative linear weights on
      steals or positive linear weights on free throw attempts or three
      point attempts or personal fouls. The point here is that the proper
      linear weights boil down to being an empirical question and not a
      matter of logic. The question is how can we estimate the proper
      linear weights?

      (And this ignores the whole issue of the proper functional form, but
      we will leave that for another day.)

      This gets us to the third approach, which is to estimate the
      proper linear weights statistically. This is the approach that
      David Berri used. He ran
      regressions using team-level data and then applied the weights
      estimated from that data to individual players. This is another
      technique borrowed from baseball, and IMO another case where an
      approach that works well in baseball does not work very well in
      basketball.

      At the team-level, the benefit of an assisted basket is fully
      subsumed in a better field goal percentage. The issue of how much
      of a field goal to attribute to the assist just doesn't come up.
      The benefit of having players who use a lot of possessions fairly
      efficiently, which lets their teammates stick to higher
      percentage field goal attempts, also does not come up. Ironically,
      much of what counts as team play is ignored using this team-level
      approach to estimating linear weights. (And this ignores other
      complications that this approach entails.) All told, these problems
      are pretty severe and I would not be surprised if the results of
      this approach were worse than those for even something like the
      NBA's efficiency statistic.

      So this gets me to the approach that I outlined in my WINVAL stuff.


      The approach basically estimates plus/minus ratings adjusted for
      home court advantage and for the other players sharing the floor
      with the given player. Then I regress these adjusted plus/minus
      ratings
      on various statistics (adjusted for pace).
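A minimal sketch of the second step, assuming NumPy is available. The adjusted plus/minus ratings here are simulated rather than estimated, only three statistics are used, and the "assumed_weights" are invented so that the fit has something to recover. It is not the actual WINVAL data or code, and it uses plain unweighted least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n_players = 2000   # synthetic sample, far larger than a real league

# Synthetic pace-adjusted per-40-minute stats: points, assists, steals.
stats = rng.normal(loc=[15.0, 3.0, 1.2], scale=[5.0, 1.5, 0.5],
                   size=(n_players, 3))

# Invented "true" weights used only to generate the fake ratings.
assumed_weights = np.array([0.3, 0.8, 2.5])
adjusted_pm = -8.0 + stats @ assumed_weights + rng.normal(0.0, 3.0, n_players)

# Regress the adjusted plus/minus ratings on the stats (with an intercept).
X = np.column_stack([np.ones(n_players), stats])
coef, *_ = np.linalg.lstsq(X, adjusted_pm, rcond=None)

for name, value in zip(["intercept", "points", "assists", "steals"], coef):
    print(f"{name:>9}: {value:+.3f}")
```

The fitted coefficients play the role of the linear weights. With real data the first step (adjusting plus/minus for home court and for who else is on the floor) does most of the heavy lifting and is not shown here.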

      Here is what I get. (The estimated linear weights are in the
      Parameter Estimate column.)

      PTSP40 = points scored per 40 minutes
      FG2AP40 = two point field goals attempted per 40 minutes
      TAP40 = three pointers attempted per 40 minutes
      FTAP40 = free throws attempted per 40 minutes
      ASP40 = assists per 40 minutes
      ORP40 = offensive rebounds per 40 minutes
      DRP40 = defensive rebounds per 40 minutes
      TOP40 = turnovers per 40 minutes
      STP40 = steals per 40 minutes
      BKP40 = blocks per 40 minutes
      PFP40 = personal fouls per 40 minutes

      Root MSE   183.20337    R-square   0.4407
      Dep Mean     4.88730    Adj R-sq   0.4201
      C.V.      3748.56273

      Parameter Estimates

                     Parameter    Standard    T for H0:
      Variable  DF    Estimate       Error  Parameter=0  Prob > |T|

      INTERCEP   1  -11.792794  2.20556250       -5.347      0.0001
      PTSP40     1    0.935041  0.25573422        3.656      0.0003
      FG2AP40    1   -0.726939  0.25623989       -2.837      0.0049
      TAP40      1   -0.128389  0.33902488       -0.379      0.7052
      FTAP40     1    0.104796  0.32231296        0.325      0.7453
      ASP40      1    0.795411  0.19939588        3.989      0.0001
      ORP40      1    0.458335  0.43008193        1.066      0.2874
      DRP40      1    0.570479  0.22121421        2.579      0.0104
      TOP40      1   -0.717767  0.56808664       -1.263      0.2074
      STP40      1    2.482385  0.57789004        4.296      0.0001
      BKP40      1    2.021543  0.41889403        4.826      0.0001
      PFP40      1   -0.004383  0.32491006       -0.013      0.9892
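To show how these weights would be used, here is a small sketch that applies the parameter estimates above to one stat line. The coefficients are copied from the table; the player's per-40-minute numbers are hypothetical, and I am assuming they are already pace-adjusted the same way as the regression inputs.

```python
# Parameter estimates copied from the regression output above.
WEIGHTS = {
    "INTERCEP": -11.792794,
    "PTSP40": 0.935041,
    "FG2AP40": -0.726939,
    "TAP40": -0.128389,
    "FTAP40": 0.104796,
    "ASP40": 0.795411,
    "ORP40": 0.458335,
    "DRP40": 0.570479,
    "TOP40": -0.717767,
    "STP40": 2.482385,
    "BKP40": 2.021543,
    "PFP40": -0.004383,
}

# Hypothetical per-40-minute stat line, invented for illustration.
player = {
    "PTSP40": 20.0, "FG2AP40": 12.0, "TAP40": 4.0, "FTAP40": 5.0,
    "ASP40": 5.0, "ORP40": 2.0, "DRP40": 6.0, "TOP40": 3.0,
    "STP40": 1.5, "BKP40": 1.0, "PFP40": 3.0,
}


def predicted_rating(stats, weights=WEIGHTS):
    """Intercept plus the weighted sum of the per-40-minute statistics."""
    return weights["INTERCEP"] + sum(
        weights[name] * value for name, value in stats.items()
    )


rating = predicted_rating(player)
print(f"predicted adjusted plus/minus: {rating:+.2f}")  # about +10.09
```

The result is in whatever units the adjusted plus/minus ratings use. Note that 1.5 steals contribute more to this total (about 3.7) than 6 defensive rebounds do (about 3.4).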

      As you can tell, these estimates differ in a lot of ways from the
      linear weights commonly used. Steals and blocks are much more
      heavily weighted. Rebounds are weighted less. Free throw attempts
      seem to have the wrong sign, and personal fouls almost do.

      But again the approach here is different. The question I am asking
      here is not whether a missed free throw should count against the
      guy who misses it. (In this regression it does not.) The question
      is whether players who get to the foul line and miss tend, holding
      the other statistics constant, to be on the floor when their team
      outscores its opponents. That is a very different question than
      the one the possessions-based approach asks.

      Now this approach is not the be all and end all, and I think there
      are hybrids of the two approaches that may be better. For example,
      the possessions-based statistics may be better to put in this
      regression than what I currently have in there. And there are
      things that focusing on little chunks of time misses, such as fouls
      generated that help in later chunks of the game.

      But the bigger point is this. Given the large fraction of
      unmeasured contributions in basketball, IMO linear weights are
      really an empirical question, and logic can only get us so far.

      And with that, the prosecution rests. :)
    • dan_t_rosenbaum
      Message 42 of 42 , Mar 30, 2004
        I think wimpds has a point here.

        If a particular player with limited offensive skills tends to play
        at times when his teammates shoot more poorly (because they are in
        essence playing 4 on 5) than their WINVAL ratings suggest they
        should, that player would end up with more offensive rebounding
        opportunities than a typical player.

        But let's suppose a guy is on the floor for 80 possessions in 40
        minutes and he drops his team's shooting percentage by 5
        percentage points (a pretty big drop). I would assume that would
        result in about three to five extra offensive rebounding
        opportunities for this player and if he collects 10 percent of those
        extra rebounds, this amounts to 0.3 to 0.5 extra offensive rebounds
        per 40 minutes. Not trivial, but also not a huge difference.
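The arithmetic behind that estimate can be sketched as follows. The 80 possessions, the 5 percentage point drop and the 10 percent rebounding share come from the paragraph above; the one-shot-per-possession figure is my simplifying assumption.

```python
POSSESSIONS = 80            # possessions on the floor per 40 minutes (from the post)
FG_PCT_DROP = 0.05          # shooting percentage falls 5 percentage points (from the post)
PLAYER_ORB_SHARE = 0.10     # share of the extra misses he rebounds (from the post)
SHOTS_PER_POSSESSION = 1.0  # assumption: roughly one field goal attempt per possession

# Extra missed shots created by the lower shooting percentage.
extra_misses = POSSESSIONS * SHOTS_PER_POSSESSION * FG_PCT_DROP

# Extra offensive rebounds the player collects from those misses.
extra_orb = extra_misses * PLAYER_ORB_SHARE

print(f"extra misses per 40 minutes: {extra_misses:.1f}")
print(f"extra offensive rebounds per 40 minutes: {extra_orb:.1f}")

# The post's range of three to five extra opportunities.
for misses in (3, 5):
    print(f"{misses} extra misses -> "
          f"{misses * PLAYER_ORB_SHARE:.1f} extra offensive rebounds")
```

The midpoint works out to 4 extra misses and 0.4 extra offensive rebounds per 40 minutes, in line with the 0.3 to 0.5 range above.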

        --- In APBR_analysis@yahoogroups.com, "wimpds" <wimpds@y...> wrote:
        > But if teammates are more likely to miss shots when they play with
        > you than when they play with other players, that will depress your
        > rating, right? If this attribute is correlated with individual
        > offensive rebounding, won't it have a negative effect on the value
        > of the coefficient?
        > On the defensive side, we only have a few variables that contribute
        > significantly to your WinVal rating. [I'd be interested in seeing
        > regressions run separately on the offensive and defensive WinVal
        > ratings.] The Defensive Rebounds coefficient might be picking up
        > something more than simply rebounding ability; it could be getting
        > at defense somewhat more generally.
        > --- In APBR_analysis@yahoogroups.com, "Michael Tamada"
        > wrote:
        > > It's a good thought, although the regression should already
        > > account for this by already measuring the impact of
        > > opponents missing shots and one's team missing shots.
        > >
        > >
        > > --MKT
        > >
        > >
        > > -----Original Message-----
        > > From: wimpds [mailto:wimpds@y...]
        > > Sent: Monday, March 29, 2004 3:13 PM
        > > To: APBR_analysis@yahoogroups.com
        > > Subject: [APBR_analysis] Re: The Problem with Possessions-Based
        > > Linear Weights
        > >
        > >
        > > Another thought on this subject (though I may not be thinking
        > > through the regression carefully enough) is that offensive
        > > rebounds are probably correlated with having teammates missing
        > > shots while defensive rebounds are correlated with the opponent
        > > missing shots. This effect would seem to be even stronger if your
        > > teammates are likely to miss because you're not attracting any
        > > defensive attention.
        > >
        > >
        > >
        > >
        > >
        > > --- In APBR_analysis@yahoogroups.com, "carlos12155"
        > > <carlosmanuel@b...> wrote:
        > > > --- In APBR_analysis@yahoogroups.com, "Mike G" <msg_53@h...>
        > > > > --- In APBR_analysis@yahoogroups.com, "Kevin Pelton"
        > > > > <kpelton08@h...> wrote:
        > > > > >
        > > > > > Rodman specialized in rebounding -- not offensive
        > > > > > rebounding. He was great at both ends. I don't know if I
        > > > > > can think of someone who specializes specifically in
        > > > > > offensive rebounding.
        > > > >
        > > > > The original poster meant, I'm sure, that on Offense, Rodman
        > > > > did little more than rebound missed shots, at least by the
        > > > > time in his career when he was with the Bulls.
        > > > >
        > > > > Obviously, he went to the other end of the court on defense,
        > > > > and did other things there.
        > > > >
        > > >
        > > > That was exactly my intention. It seemed to me that if
        > > > offensive rebounds are at some level a negative, Rodman's
        > > > performance would be the perfect example; after all that was
        > > > the only thing he did on offense.