
Re: Similarity Scores: First Cut (long)

  • Dean Oliver
    Message 1 of 2, Sep 13, 2001
      There were slight errors in my calcs on the previous post. I'm
      working on slight tweaks to the method (weights are too heavy on
      blocks and steals, I think, and maybe adding in points). I'll fix
      everything when modifications are done.


      --- In APBR_analysis@y..., "Dean Oliver" <deano@t...> wrote:
      >
      > I looked a little more at James' rules and figured that I wouldn't
      > take a more rigorous cut and would just adapt what he had. I decided
      > to start with season comparisons, since that is the easiest (and I
      > have a number of player seasons back to '89 in my db).
      >
      > Specifically, I noted that James compared player-seasons by first
      > starting with games. For every 5 games difference in player season,
      > James took off 1 point. Well, a basketball season is roughly half the
      > length of a baseball season, so I said 2.5 games difference in
      > basketball player season.
      >
      > From there, many of James' numbers appear to be game-average
      > related. For instance, there are about 3.5 at-bats per game, and he
      > seemed to multiply the factor of 5 times 3.5 because, for at-bats,
      > every difference of 20 took off 1 point. He did this for runs, hits,
      > etc. It appears that he skewed things a little toward the high side,
      > though. Like he rounded up to 20, he would round up a lot. For
      > correlated numbers (like doubles and triples are correlated to hits),
      > he rounded up much more. An average baseball player probably has
      > 20-25 doubles, but James subtracted 1 point for every difference of
      > 1.5 doubles (1.5/5*160 ~ 50 > 20). Then he puts his own subjective
      > weights in there, too. Differences in strikeouts aren't as important
      > to him as one might expect.
      >
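The rescaling arithmetic in the paragraphs above can be checked with a short Python sketch. The helper name `implied_season_total` is hypothetical, and the 160-game season length is taken from the parenthetical in the post; this is an illustration of the stated numbers, not James' procedure.

```python
GAMES_PER_POINT = 5   # baseball: 1 point per 5 games of difference (from the post)
SEASON_GAMES = 160    # rough baseball season length used in the post's parenthetical


def implied_season_total(step, games_per_point=GAMES_PER_POINT,
                         season_games=SEASON_GAMES):
    """Season total a per-point step would match if the step were simply
    games_per_point times the per-game average of that stat."""
    return step / games_per_point * season_games


# Basketball season is roughly half as long, so the games step is halved.
basketball_games_step = GAMES_PER_POINT * 0.5

# 5 games times about 3.5 at-bats per game; James rounds this up to 20.
raw_at_bat_step = GAMES_PER_POINT * 3.5

# A step of 1.5 doubles corresponds to roughly 48 doubles a season,
# well above the 20-25 a typical player hits.
doubles_total = implied_season_total(1.5)
```

Read this way, the larger the implied season total is relative to the real one, the less a difference in that stat costs.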
      > So here are the first set of assignments I made:
      >
      > Pos ???
      > G 2.5
      > Min 75
      > fgm 12
      > fga 27
      > fg% 0.001
      > fg3m 3
      > fg3a 9
      > fg3% 0.0035
      > ftm 7.5
      > fta 10
      > ft% 0.002
      > oreb 7.5
      > dreb 19
      > ast 7.5
      > stl 1
      > tov 6
      > blk 1
      > pf 10
      >
      > (I make no adjustment for position at this point, in part because
      > it's less well-defined for basketball, in part because some defensive
      > stats are listed, which was not the case for baseball.)
      >
      > For every 2.5 game difference in a player's season, I subtract 1
      > point off of 1000. For every 75 minutes, subtract 1 point. And so
      > on.
      >
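The subtraction rule just described can be sketched in Python as follows. The step values are copied from the table in the post; the dictionary keys, the function name `similarity_score`, and the choice to keep fractional points rather than round them are assumptions, since the post does not say how partial differences are handled. Position is left out, as in the post.

```python
# Size of the difference that costs one point, per stat (from the post's table).
POINT_PER = {
    "g": 2.5, "min": 75, "fgm": 12, "fga": 27, "fg_pct": 0.001,
    "fg3m": 3, "fg3a": 9, "fg3_pct": 0.0035, "ftm": 7.5, "fta": 10,
    "ft_pct": 0.002, "oreb": 7.5, "dreb": 19, "ast": 7.5, "stl": 1,
    "tov": 6, "blk": 1, "pf": 10,
}


def similarity_score(a, b, point_per=POINT_PER, base=1000.0):
    """Start at `base` and subtract |difference| / step for every stat.

    `a` and `b` are dicts of season totals (and shooting percentages as
    fractions) keyed like POINT_PER. Fractional points are kept (assumption).
    """
    penalty = sum(abs(a[stat] - b[stat]) / step
                  for stat, step in point_per.items())
    return base - penalty


# Two made-up seasons that differ only by 5 games and 150 minutes:
season_a = dict.fromkeys(POINT_PER, 0)
season_b = dict(season_a, g=5, min=150)
score = similarity_score(season_a, season_b)   # 1000 - 5/2.5 - 150/75
```

Because each stat contributes independently, changing one step (for example, loosening `stl` or `blk`) only changes that stat's share of the penalty.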
      > The hard ones to define were the values for percentages. I basically
      > estimated standard deviations for all my players and saw that the SD
      > for fg3% was about 3.5x that of the SD for fg% and multiplied the
      > fg% value by that factor.
      >
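One way to read the percentage step described above, as a hedged Python sketch: scale a base step by the ratio of standard deviations. The function name `pct_step_from_sd` and the sample values are made up for illustration, and the use of the sample standard deviation is an assumption; the post only reports the resulting ratio of about 3.5.

```python
import statistics


def pct_step_from_sd(base_step, base_values, other_values):
    """Scale `base_step` by SD(other_values) / SD(base_values)."""
    ratio = statistics.stdev(other_values) / statistics.stdev(base_values)
    return base_step * ratio


# With the ratio reported in the post, the fg3% step follows directly:
fg3_step = 0.001 * 3.5

# Made-up percentages whose spread differs by a factor of two:
demo_step = pct_step_from_sd(0.001, [0.40, 0.45, 0.50], [0.20, 0.30, 0.40])
```

A stat that varies more across players gets a larger step, so the same raw gap in a noisier percentage costs fewer points.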
      > Frankly, it seems like a decent start. I first worked with Kobe
      > Bryant's 2001 season. (You'll notice that comparisons to the 1999
      > shortened season never show up.) Here is the list of most similar
      > seasons:
      >
      > Player Team Season Score
      > Hill,Grant det 2000 871
      > richmond,mitch gol 1991 865
      > Bryant,Kobe lal 2000 865
      > wilkins,dominiq lac 1994 849
      > robinson,glenn mil 2001 842
      > richmond,mitch gol 1990 832
      > richmond,mitch sac 1994 831
      > tripucka,kelly cha 1989 830
      > mullin,chris gol 1990 828
      > barkley,charles pho 1996 822
      >
      > A lot of decent players here, but no perfect matches (no Jordan or
      > Vince Carter, either). You'll recall the scale:
      >
      > <500 -- players who would not usually be perceived as being
      > essentially similar.
      > ~600 -- Slight similarities, but major differences
      > ~700 -- Important, easily identifiable similarities but also
      > significant and obvious differences
      > ~800 -- very prominent, obvious similarities, but easily
      > identifiable distinctions.
      > >850 -- substantially similar
      > >900 -- very similar
      > >950 -- rare, indicating that true similarities are emphasized by
      > random chance.
      >
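For reading the tables that follow, the scale above can be turned into a small lookup. This is a sketch only: the strict cutoffs (>850, >900, >950, <500) come from the scale, while assigning in-between scores to the nearest of the "~600/~700/~800" anchors is an assumption, and the name `describe` is hypothetical.

```python
# Descriptions for the approximate ("~") anchor points on the scale.
ANCHORS = {
    600: "slight similarities, but major differences",
    700: "important, easily identifiable similarities but also "
         "significant and obvious differences",
    800: "very prominent, obvious similarities, but easily "
         "identifiable distinctions",
}


def describe(score):
    """Map a similarity score to the wording of the scale."""
    if score > 950:
        return "rare"
    if score > 900:
        return "very similar"
    if score > 850:
        return "substantially similar"
    if score < 500:
        return "not usually perceived as essentially similar"
    # Assumption: scores from 500 to 850 take the nearest "~" anchor.
    nearest = min(ANCHORS, key=lambda anchor: abs(anchor - score))
    return ANCHORS[nearest]
```

Under these cutoffs, a score of 871 reads as "substantially similar" and a score of 915 as "very similar".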
      > Now here is Shaq 2001 comparison:
      >
      > Player Team Season Score
      > o'neal,shaquill orl 1995 915
      > O'Neal,Shaquill lal 2000 914
      > o'neal,shaquill orl 1994 855
      > webber,chris gsw 1994 774
      > mutombo,dikembe den 1994 762
      > o'neal,shaquill lal 1998 752
      > duncan,tim san 1998 738
      > robinson,david san 1990 725
      > o'neal,shaquill orl 1993 715
      > Mourning,Alonzo mia 2000 715
      >
      >
      > He's similar to himself a lot of times. But really, no one is that
      > similar to him. Webber, Mutombo, Duncan, Robinson, and Mourning have
      > some similarity. I kinda like that.
      >
      > Jamal Mashburn's 2001 season (he is a good, but not great, player):
      >
      > Player Team Season Score
      > payton,gary sea 2001 899
      > jackson,jimmy njn 1997 879
      > hardaway,tim mia 1998 868
      > drexler,clyde hou 1998 863
      > marbury,stephon min 1997 862
      > finley,michael dal 1998 860
      > johnson,larry cha 1995 860
      > finley,michael dal 2001 858
      > hardaway,tim gsw 1993 856
      > anderson,kenny njn 1995 856
      >
      > Here is Eric Williams' 2001 season:
      >
      > Player Team Season Score
      > Fisher,Derek lal 2000 907
      > cummings,vontee gsw 2001 889
      > rivers,doc san 1996 884
      > horry,robert lal 2001 877
      > johnson,anthony sac 1998 875
      > roth,scott min 1990 869
      > rivers,doc san 1995 866
      > johnson,dermarr atl 2001 864
      > hunter,lindsey det 1996 854
      > mashburn,jamal mia 1997 853
      >
      > Williams is kind of an ordinary player. I think I'd want higher
      > scores to reflect general similarity with an ordinary guy. I think,
      > in general, my scores are a little too low. (Not even mentioning
      > that it is a very eclectic group that is similar to him here.)
      >
      > I need to do a little bit of side-by-side seasonal comparison of #'s
      > (something I didn't do up here for you) to see where my subjective
      > comparisons would change the scores. But I am generally pretty happy
      > with this first cut. There will be some tweaking, I'm sure, to make
      > me a lot happier -- and, given that my travel plans have been trashed
      > this week, I might be able to work on the 2nd cut.
      >
      > Some other comparisons for fun.
      >
      > Jordan '91:
      >
      > Player Team Season SimScore
      > jordan,michael chi 1992 891
      > jordan,michael chi 1990 844
      > chambers,tom pho 1990 842
      > jordan,michael chi 1989 831
      > jordan,michael chi 1993 820
      > mullin,chris gol 1991 805
      > jordan,michael chi 1996 799
      > Malone,Karl uta 2000 792
      > mullin,chris gsw 1992 787
      > richmond,mitch gol 1990 783
      >
      > Olajuwon '94:
      >
      > Player Team Season SimScore
      > ewing,patrick nyk 1993 823
      > ewing,patrick nyk 1994 819
      > ewing,patrick nyk 1995 807
      > ewing,patrick nyk 1990 804
      > ewing,patrick nyk 1992 795
      > olajuwon,hakeem hou 1996 787
      > kemp,shawn sea 1997 781
      > malone,karl uta 1995 780
      > baker,vin mil 1997 779
      > mourning,alonzo cha 1995 779
      >
      > Kemp '96:
      >
      > Player Team Season SimScore
      > kemp,shawn sea 1997 837
      > kemp,shawn sea 1995 819
      > kemp,shawn sea 1994 798
      > malone,karl uta 1990 779
      > thorpe,otis hou 1991 773
      > mourning,alonzo cha 1995 763
      > olajuwon,hakeem hou 1994 761
      > ewing,patrick nyk 1990 747
      > malone,karl uta 1989 745
      > mourning,alonzo mia 1996 742
      >
      > Notice no similarity to Kemp post-'97. Hmmm.
      >
      > Reggie Miller '96:
      >
      > Player Team Season SimScore
      > hawkins,hersey phi 1993 901
      > miller,reggie ind 1993 897
      > hawkins,hersey phi 1992 895
      > richmond,mitch sac 1998 894
      > porter,terry por 1992 889
      > elliott,sean san 1996 882
      > miller,reggie ind 1995 878
      > Allen,Ray mil 2000 875
      > miller,reggie ind 1998 875
      > hawkins,hersey sea 1996 868
      >
      >
      > Last, but not least, Iverson 2001:
      >
      > Player Team Season SimScore
      > Iverson,Allen phi 2000 865
      > sprewell,latrel gsw 1996 825
      > mashburn,jamal dal 1995 813
      > Stackhouse,Jerr det 2000 798
      > marbury,stephon njn 2001 795
      > stoudamire,damo por 1998 795
      > sprewell,latrel gsw 1995 794
      > sprewell,latrel gsw 1994 793
      > Marbury,Stephon njn 2000 790
      > dumars,joe det 1995 783
      >
      > Weird mix, but no other MVP-like seasons....
      >
      >
      > Dean Oliver
      > Journal of Basketball Studies