229 Re: Similarity Scores
- Sep 10, 2001
--- In APBR_analysis@y..., "Michael K. Tamada" <tamada@o...> wrote:
> Just some thoughts off the top of my head: I agree that this is something
> that ought to be done for basketball. I'm a little leery of the work
> that's been done in baseball, as Bill James' formula looks like a
> hand-cooked one. I'm sure its results are reasonable, but I'd like to see
> an approach that's more systematic.
I've asked Bill about the approach he took and why he took it. I
don't expect to hear back anytime soon.
> There are several statistical techniques which I think are directly useful
> here. Cluster analysis (the computer looks at the data and divides them
> into groups); discriminant analysis and logistical regression (determine
> the factors which predict which group a player will be in, e.g. Hall of
> Fame vs non Hall of Fame); and probably the most useful of all:
> Mahalanobis distance, or some variation thereof.
> However, Mahalanobis distance does NOT take into account the second
> problem, that certain variables might deserve more weight than others.
> Maybe we WOULD want to "double count" PTs scored, given that it's
> probably the single most important statistic for a player, or at least one
> of the most important ones for distinguishing all-stars and hall of famers
> from journeymen.
First -- I don't know of Mahalanobis distance stuff. Sounds like
multivariate regression, though. You may have to do this analysis or
point me at software that does it. What is it going to spit out?
Weights on the different stats?
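On the question of what Mahalanobis distance spits out: it is one number per pair of players, not a set of weights. The differences between two stat lines are rescaled by the inverse covariance matrix of the stats across the league, so stats that vary a lot or move together count for less. Below is a minimal sketch in plain Python. The function names and the five-player TOY_LEAGUE are made up for illustration and are not real data.

```python
from math import sqrt

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length stat lines."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    cov = [[0.0] * k for _ in range(k)]
    for r in rows:
        for i in range(k):
            for j in range(k):
                cov[i][j] += (r[i] - means[i]) * (r[j] - means[j])
    return [[c / (n - 1) for c in row] for row in cov]

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    m = [list(a[i]) + [b[i]] for i in range(k)]  # augmented matrix
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(k):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][k] / m[i][i] for i in range(k)]

def mahalanobis(x, y, cov):
    """sqrt(d^T cov^-1 d), where d is the difference of the two stat lines."""
    d = [xi - yi for xi, yi in zip(x, y)]
    z = solve(cov, d)  # z = cov^-1 d, without forming the inverse
    return sqrt(sum(di * zi for di, zi in zip(d, z)))

# HYPOTHETICAL (PTS, AST) per-game lines, only to exercise the functions.
TOY_LEAGUE = [(10.0, 2.0), (20.0, 4.0), (15.0, 6.0), (25.0, 3.0), (5.0, 5.0)]
cov = covariance_matrix(TOY_LEAGUE)
print(mahalanobis(TOY_LEAGUE[0], TOY_LEAGUE[1], cov))
```

Doubling or halving the importance of a stat, as suggested in the quoted text, would be a separate step layered on top of this; plain Mahalanobis distance has no such knob.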
> Anyway, that's how I would approach the problem. For a first pass,
> statistics per game, rather than per 48 minutes, per year, or per career
> (career stats would be useless for comparing, say, Steve Francis to Isiah
> Thomas, because Francis' career stats are still so low).
We could compare Thomas' first 2 years to Francis', a very useful
> Mahalanobis distance at first, throwing in all variables (FTA, FTM, AND
> FT%) and see if the results looked reasonable. If not, then at least I'd
> have some coefficients to start with, and could start doubling or halving
> some.
I agree, I think (not knowing exactly what the coefficients mean).
Getting a common starting point is the most important thing I want to
get out of this discussion. Similarity scores are ultimately
somewhat subjective, but if we can all start with the same set of
numbers, at least we have a foundation.
> Also, after the initial analysis, I'd want to put in some sort of
> correction for era or game pace. Bob Cousy's 43% career FG% (or whatever
> it was, I'm saying this off the top of my head) reminds me more of Isiah
> Thomas's 46% than it does Allen Iverson's 43%. Despite the superficial
> similarity of Cousy's and Iverson's FG%. (Again I'm not vouching for
> those specific numbers, just saying that I'd rather see the numbers in
> context, i.e. corrected for era and/or game pace.)
I think we want to keep the era correction separate. I can't find
where he said it, but I know James wanted to keep it separate.
My one attempt at basketball similarity scores is buried somewhere.
I looked at player-season comparisons back in '98. The motivation
was identifying who was similar to Kobe Bryant, since there was so
much controversy at the time about how good he was going to be.
I remember finding a lot more self-similarity across players than
cross-similarity (Bryant's 2nd year resembled his first more than
it resembled a lot of other players' seasons, for example). The
player who seemed most similar to Bryant at the time was Allen
Iverson, but my #'s were weird. I'd say now that Bryant and Iverson
aren't as similar as Bryant and Jordan or Bryant and VCarter, which I
think points to factors I didn't consider -- height and position. (It
may also point to false impressions. Jordan's early career numbers
are MUCH better than Bryant's, even if you account for the 10% drop
in pace and the 5-8% drop in offensive efficiency.)
For kicks, here is Iverson 99 and Bryant 2001:
         GS   MIN   FG    FGA   FG%    fg3m  fg3a  fg3%
Iverson  1.0  41.5   9.1  22.0  0.412  1.2   4.1   0.291
Bryant   1.0  40.9  10.3  22.2  0.464  0.9   2.9   0.305

         FT   FTA   FT%    OR   DR   TR   AST  PF   DQ
Iverson  7.4  9.9   0.751  1.4  3.5  4.9  4.6  2.0  0.0
Bryant   7.0  8.2   0.853  1.5  4.3  5.9  5.0  3.3  0.0

         STL   TO    BLK   PTS
Iverson  2.29  3.48  0.15  26.8
Bryant   1.68  3.24  0.63  28.5
A priori, I'd like to call these two seasons in the 700-750 range on
similarity scores. Easily identifiable similarities, but significant
and obvious differences.
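To make the 1000-point scale concrete, here is a rough sketch in the style of the baseball method: start at 1000 and subtract points for each difference between the two stat lines. The stat lines are copied from the table in this post, but the PENALTY values and the choice of stats are pure guesses for illustration, not anyone's published formula, so the number it prints should not be read as a real similarity score.

```python
# Per-game lines copied from the table in this post.
IVERSON_99 = {"MIN": 41.5, "FG%": 0.412, "TR": 4.9, "AST": 4.6,
              "STL": 2.29, "TO": 3.48, "BLK": 0.15, "PTS": 26.8}
BRYANT_01 = {"MIN": 40.9, "FG%": 0.464, "TR": 5.9, "AST": 5.0,
             "STL": 1.68, "TO": 3.24, "BLK": 0.63, "PTS": 28.5}

# HYPOTHETICAL penalties: points subtracted per one unit of difference.
PENALTY = {"PTS": 10.0, "TR": 10.0, "AST": 10.0, "STL": 20.0,
           "BLK": 20.0, "TO": 10.0, "MIN": 2.0, "FG%": 1000.0}

def similarity(a, b, penalty=PENALTY, start=1000.0):
    """Start from `start` and subtract a penalty for every stat difference."""
    score = start
    for stat, per_unit in penalty.items():
        score -= per_unit * abs(a[stat] - b[stat])
    return max(score, 0.0)  # floor at zero for very different players

print(similarity(IVERSON_99, BRYANT_01))
```

Getting a pair like this one to land in a target range such as 700-750 comes down to tuning the penalties, which is exactly the hand-cooked part.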
For my #'s, I have (in per game stats) for Iverson and Bryant, resp:
                                            Defensive Stops Def.   Net
         ScPoss  Poss  Fl%    Ortg   PtsProd  /Min   /Poss  Rtg.   Win%
Iverson  12.8    25.0  0.511  106.6  26.7     0.182  0.484   97.3  0.820
Bryant   12.8    24.4  0.524  110.9  27.0     0.191  0.494  103.0  0.773
At about the same age, Bryant's offensive skills are more efficient
than Iverson's, which we would probably all agree on. Defensively,
it's hard to say.
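For anyone reading the offensive columns cold, a small sketch of how they appear to fit together, assuming Fl% is scoring possessions divided by possessions and Ortg is points produced per 100 possessions. Those definitions are my reading of the column names; the inputs are the rounded per-game figures above, so the results only match the table to within rounding.

```python
def floor_pct(sc_poss, poss):
    """Share of a player's possessions on which he scores (assumed definition)."""
    return sc_poss / poss

def off_rating(pts_prod, poss):
    """Points produced per 100 possessions (assumed definition)."""
    return 100.0 * pts_prod / poss

# (ScPoss, Poss, PtsProd, listed Fl%, listed Ortg) from the table in this post.
ROWS = {
    "Iverson": (12.8, 25.0, 26.7, 0.511, 106.6),
    "Bryant": (12.8, 24.4, 27.0, 0.524, 110.9),
}

for name, (sc, poss, pts, listed_fl, listed_ortg) in ROWS.items():
    print(name,
          round(floor_pct(sc, poss), 3), "vs listed", listed_fl,
          round(off_rating(pts, poss), 1), "vs listed", listed_ortg)
```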
Journal of Basketball Studies