- Sep 10, 2001
The concept of similarity scores is one that Bill James viewed as one
of his most important. His introduction to the method in the '86
The most important new method to be introduced this year is that of
similarity scores. Similarity scores are a way of objectively fixing
the "degree of resemblance" between two players or between two teams.
Among all the methods that I have developed over the years, this
method is the most flexible, the most adaptable, the most useful in
many different contexts....
The similarity scores begin with the assumption that players who are
identical in all respects considered will have a similarity score of
1000. For each difference between the two, there is a "penalty", or
reduction from the 1000. Similarity scores are designed so that:
<500 -- players who would not usually be perceived as being similar
~600 -- Slight similarities, but major differences
~700 -- Important, easily identifiable similarities but also
significant and obvious differences
~800 -- very prominent, obvious similarities, but easily identifiable
differences
>850 -- substantially similar
>900 -- very similar
>950 -- rare, indicating that true similarities are emphasized by
random chance.
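To make the "start at 1000 and subtract a penalty for each difference" idea concrete, here is a minimal Python sketch for basketball career totals. The stat names and the penalty divisors are made up for illustration (they are not James's numbers and not a worked-out basketball method); the only thing taken from the description above is the structure: identical players score 1000, and every difference lowers the score.

```python
# Hypothetical penalty divisors: one point is deducted for every `divisor`
# units of difference in a career total. These values are illustrative
# assumptions only, not taken from James or any published method.
PENALTY_DIVISORS = {
    "games": 20,
    "points": 300,
    "rebounds": 150,
    "assists": 100,
}


def similarity_score(a, b, divisors=PENALTY_DIVISORS, start=1000):
    """Start at `start` and subtract a penalty for each statistical difference."""
    score = start
    for stat, divisor in divisors.items():
        score -= abs(a[stat] - b[stat]) / divisor
    return score


# Two made-up career lines, not real players.
player_a = {"games": 1000, "points": 18000, "rebounds": 6000, "assists": 3000}
player_b = {"games": 900, "points": 15000, "rebounds": 6300, "assists": 2500}

print(similarity_score(player_a, player_a))  # identical players keep the full 1000
print(similarity_score(player_a, player_b))  # penalties of 5 + 10 + 2 + 5 are deducted
```

The divisors are where all the judgment lives: they decide how much a gap in one stat matters relative to a gap in another, and they would have to be tuned so that the scores land on the scale listed above.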
1. When discussing whether or not a player should be elected to the
Hall of Fame, one of the key questions to focus on -- probably the
most important question -- is who are the most similar other players
and are they in the Hall?
2. How to measure consistency from season to season
3. How do we measure the accuracy of career projection methods?
4. How to make career projections by comparing players of similar
age to others
5. Salary negotiations
6. (A baseball specific thing, involving park factors)
7. Setting control groups for studies
8. Constructing theoretical models of players/teams and identifying
real players/teams similar to the model.
We started discussing this over in APBR, but I think the details of
making this work can get technical, so I brought it here. I do think
this is a major missing factor in basketball and it frustrates me
that, as easy as it seems to do, I haven't been able to develop
something like this.
Baseballreference.com has a list of players and who they are similar
to -- something that Robert and I have talked about doing for
basketball eventually. For instance, here is Roberto Alomar
Within that page is the list of players similar to him overall, 3 of
whom are in the Hall. When comparing players at age 32 to Alomar,
the list shows 8 HOFers, the other two being Pete Rose and Ryne
Sandberg, suggesting that Alomar is on track to be in the Hall (even
just as a batter, since these scores don't account for defense).
describes how the scores are calculated for baseball career #s. I'd
think that we could come up with a similar method for basketball.
The '86 book has the method for comparing seasons.
One of the problems I had was with redundancy of stats. FG% is
reflected in FG and FGA, for example. James didn't worry about it
too much, but I do in basketball.
To be clear, this is not a rating tool. It doesn't tell you who is
better or worse; it tells you who is similar. In the old argument over
Shawn Kemp, perhaps we find that the most similar players to him are
all outside the HOF -- then that suggests he isn't that great. Or maybe
his best season compares with those put up by Wilt, K. Malone, etc.,
suggesting great seasons.
Finally, Greg Thomas took a stab at a method for player-careers back
in the spring Cage Chronicles:
He did something like what MikeT was suggesting, scaling points, etc. by
some average, then subtracting differences. Kinda interesting and
not a bad attempt, but some things I'd change/review:
1. Different scale than specified by James, I think. The scores are
particularly high (Wilt Chamberlain and Arvydas Sabonis have a
similarity score of 952!!)
2. Uses only points, assists, and rebounds.
3. I think he looks at per minute #'s, not career totals.
4. He standardized by era.
It's a good first attempt, but I think there is room for improvement.
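For anyone who wants to experiment, here is a rough Python sketch of the general "scale by an average, then subtract differences" approach using per-minute numbers standardized by era. This is my own illustration of the idea, not Greg Thomas's actual formula; the function names, the `weight` of 100 and all the numbers are assumptions.

```python
def era_scaled_rates(totals, minutes, era_avg_per_min):
    """Per-minute rates divided by the era's average per-minute rate."""
    return {
        stat: (totals[stat] / minutes) / era_avg_per_min[stat]
        for stat in era_avg_per_min
    }


def scaled_similarity(rates_a, rates_b, weight=100, start=1000):
    """Subtract `weight` points per 1.0 of difference in each scaled rate."""
    penalty = sum(abs(rates_a[stat] - rates_b[stat]) for stat in rates_a)
    return start - weight * penalty


# Made-up era averages (per minute) and made-up career totals.
era_1 = {"points": 0.4, "rebounds": 0.2, "assists": 0.1}
era_2 = {"points": 0.5, "rebounds": 0.25, "assists": 0.1}

rates_a = era_scaled_rates(
    {"points": 500, "rebounds": 200, "assists": 100}, 1000, era_1
)
rates_b = era_scaled_rates(
    {"points": 1000, "rebounds": 500, "assists": 300}, 2000, era_2
)

print(rates_a)  # each value is "times the era average"
print(scaled_similarity(rates_a, rates_b))
```

Note that with only three stats and a small `weight`, almost any two players would come out above 900, so the weight (or the number of categories) has to be chosen with the target scale in mind. That may or may not be what is behind the high scores in item 1.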
Has anyone done anything like this?
(Another difficulty I've had is in making my Access db do these
Journal of Basketball Studies