Re: [APBR_analysis] Similarity Scores

  • Michael K. Tamada
    Message 1 of 16 , Sep 10, 2001
      Just some thoughts off the top of my head: I agree that this is something
      that ought to be done for basketball. I'm a little leery of the work
      that's been done in baseball, as Bill James' formula looks like a
      hand-cooked one. I'm sure its results are reasonable, but I'd like to see
      an approach that's more systematic.

      There are several statistical techniques which I think are directly useful
      here. Cluster analysis (the computer looks at the data and divides them
      into groups); discriminant analysis and logistic regression (determine
      the factors which predict which group a player will be in, e.g. Hall of
      Fame vs non Hall of Fame); and probably the most useful of all:
      Mahalanobis distance, or some variation thereof.

      Without going deeply into the nitty gritty, here's an intuitive
      description of it: it's easy enough to understand Euclidean distance,
      sqrt( X^2 + Y^2 + Z^2 + ...) where X, Y, Z, etc. are the difference
      between, say, Magic Johnson and Larry Bird in whatever variables we choose
      to look at.
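
      A minimal sketch of that Euclidean distance in Python. The stat names
      and per-game numbers are invented placeholders, not real lines for any
      player.

```python
import math

def euclidean_distance(player_a, player_b):
    """Square root of the summed squared differences over the chosen stats."""
    return math.sqrt(sum((player_a[k] - player_b[k]) ** 2 for k in player_a))

# Placeholder per-game lines, invented for illustration only.
a = {"pts": 20.0, "reb": 7.0, "ast": 10.0}
b = {"pts": 23.0, "reb": 11.0, "ast": 10.0}
print(euclidean_distance(a, b))  # 5.0  (sqrt(3^2 + 4^2 + 0^2))
```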

      But there are problems with Euclidean distance, specifically one that
      Dean Oliver alludes to: some variables are redundant or
      partially redundant with each other,
      e.g. FG Made and Points Scored, or even Off Rebds and Def Rebds. Another
      problem is that not all variables are equally important: some probably
      should be given greater weight than others (or maybe not; even something
      like technical fouls might be of interest; maybe the most similar player
      to Rasheed Wallace would turn out to be Charles Barkley thanks to their
      techs).

      Mahalanobis distance corrects for the first problem by measuring the
      extent to which the variables are correlated with each other, and reducing
      their weight in the distance measure. I.e. once you've got FT Made and FG
      Made and 3PT FG made in the measure, you don't want to be adding PTs
      scored as yet another variable with full weight.
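
      To make the redundancy correction concrete, here is a rough sketch of
      Mahalanobis distance in plain Python. The five (FG made, points) rows
      are made-up sample data, and the helper names are mine. Two differences
      of equal Euclidean length come out very differently depending on
      whether they run along or against the FG-points correlation.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length stat rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(k)] for i in range(k)]

def solve(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    k = len(vector)
    aug = [list(matrix[i]) + [vector[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular covariance: drop an exactly redundant stat")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (aug[i][k] - sum(aug[i][c] * x[c] for c in range(i + 1, k))) / aug[i][i]
    return x

def mahalanobis_distance(a, b, rows):
    """sqrt(d' S^-1 d), where d = a - b and S is the covariance of `rows`."""
    diff = [p - q for p, q in zip(a, b)]
    weighted = solve(covariance_matrix(rows), diff)
    return math.sqrt(max(0.0, sum(d * w for d, w in zip(diff, weighted))))

# Made-up (FG made, points) pairs in which points track FG made closely.
rows = [(4, 11), (6, 15), (8, 21), (10, 25), (12, 31)]

along = mahalanobis_distance((8, 20), (10, 25), rows)    # more FG and more points
against = mahalanobis_distance((8, 25), (10, 20), rows)  # more FG but fewer points
print(along, against)  # about 0.63 versus about 18.27
```

      Both pairs are sqrt(29) apart in Euclidean terms; the measure treats
      the "more FG, more points" gap as nearly one difference rather than two.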

      However, Mahalanobis distance does NOT take into account the second
      problem, that certain variables might deserve more weight than others.
      Maybe we WOULD want to "double count" PTs scored, given that it's
      probably the single most important statistic for a player, or at least one
      of the most important ones for distinguishing all-stars and hall of famers
      from journeymen.

      Anyway, that's how I would approach the problem. For a first pass, I'd do
      statistics per game, rather than per 48 minutes, per year, or per career
      (career stats would be useless for comparing, say, Steve Francis to Isiah
      Thomas, because Francis' career stats are still so low). I'd do straight
      Mahalanobis distance at first, throwing in all variables (FTA, FTM, AND
      FT%) and see if the results looked reasonable. If not, then at least I'd
      have some coefficients to start with, and could start doubling or halving
      some.
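
      The doubling-or-halving step could look something like the sketch
      below. To keep it short it uses a simplified stand-in (differences
      scaled by each stat's spread, then weighted) rather than full
      Mahalanobis distance; the scales, weights, and stat lines are all
      hypothetical.

```python
import math

def weighted_distance(a, b, scales, weights):
    """Euclidean distance on scaled differences, with one weight per stat."""
    total = 0.0
    for stat, weight in weights.items():
        z = (a[stat] - b[stat]) / scales[stat]  # difference in units of spread
        total += weight * z * z
    return math.sqrt(total)

# Hypothetical per-game lines and spreads, for illustration only.
a = {"pts": 20.0, "ast": 5.0}
b = {"pts": 26.0, "ast": 7.0}
scales = {"pts": 6.0, "ast": 2.0}

equal = weighted_distance(a, b, scales, {"pts": 1.0, "ast": 1.0})
double_pts = weighted_distance(a, b, scales, {"pts": 2.0, "ast": 1.0})
print(equal, double_pts)  # sqrt(2) and sqrt(3)
```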

      Also, after the initial analysis, I'd want to put in some sort of
      correction for era or game pace. Bob Cousy's 43% career FG% (or whatever
      it was, I'm saying this off the top of my head) reminds me more of Isiah
      Thomas's 46% than it does Allen Iverson's 43%, despite the superficial
      similarity of Cousy's and Iverson's FG%. (Again I'm not vouching for
      those specific numbers, just saying that I'd rather see the numbers in
      context, i.e. corrected for era and/or game pace.)
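
      One simple form an era correction could take is to express a rate stat
      relative to the league average of its season. A small sketch; the
      league averages below are placeholders chosen for illustration, not
      looked-up values.

```python
def era_relative(player_value, league_average):
    """Express a rate stat as a ratio to the league average of its era."""
    return player_value / league_average

# Two hypothetical players with the same raw FG% in different eras.
# The league averages are placeholders, not real figures.
early = era_relative(0.43, 0.40)  # above his league's average
late = era_relative(0.43, 0.45)   # below his league's average
print(round(early, 3), round(late, 3))  # 1.075 0.956
```

      The same raw 43% reads as above average in one context and below
      average in the other, which is the point of seeing numbers in context.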

      For Hall of Fame purposes, I think discriminant analysis or logistic or
      probit regressions are better than merely measuring distance. I did this
      once for NBA all-stars one season; the predictions were not 100% accurate
      but you could at least separate the players into three groups: clear
      all-stars, clear non-stars, and the "on the bubble" players.
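
      A sketch of how a fitted logistic model could sort players into those
      three groups. The coefficient, intercept, and cutoffs here are
      hypothetical stand-ins; in practice they would come from fitting the
      regression to real all-star selections.

```python
import math

def all_star_probability(stats, coefficients, intercept):
    """Logistic model: probability from a linear score of per-game stats."""
    score = intercept + sum(coefficients[k] * stats[k] for k in coefficients)
    return 1.0 / (1.0 + math.exp(-score))

def bucket(probability, low=0.25, high=0.75):
    """Split predictions into three groups using two (hypothetical) cutoffs."""
    if probability >= high:
        return "clear all-star"
    if probability <= low:
        return "clear non-star"
    return "on the bubble"

# Hypothetical one-variable model; real coefficients would be estimated.
coefficients, intercept = {"pts": 0.3}, -5.4
for pts in (28.0, 18.0, 8.0):
    p = all_star_probability({"pts": pts}, coefficients, intercept)
    print(pts, round(p, 2), bucket(p))
```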


      --MKT



      On Tue, 11 Sep 2001, Dean Oliver wrote:

      >
      > The concept of similarity scores is one that Bill James viewed as one
      > of his most important. His introduction to the method in the '86
      > Abstract:
      >
      > The most important new method to be introduced this year is that of
      > similarity scores. Similarity scores are a way of objectively fixing
      > the "degree of resemblance" between two players or between two teams.
      > Among all the methods that I have developed over the years, this
      > method is the most flexible, the most adaptable, the most useful in
      > many different contexts....
      >
      > The similarity scores begin with the assumption that players who are
      > identical in all respects considered will have a similarity score of
      > 1000. For each difference between the two, there is a "penalty", or
      > reduction from the 1000. Similarity scores are designed so that:
      >
      > <500 -- players who would not usually be perceived as being
      > essentially similar.
      >
      > ~600 -- Slight similarities, but major differences
      >
      > ~700 -- Important, easily identifiable similarities but also
      > significant and obvious differences
      >
      > ~800 -- very prominent, obvious similarities, but easily identifiable
      > distinctions.
      >
      > >850 -- substantially similar
      >
      > >900 -- very similar
      >
      > >950 -- rare, indicating that true similarities are emphasized by
      > random chance.
      >
      > Uses:
      >
      > 1. When discussing whether or not a player should be elected to the
      > Hall of Fame, one of the key questions to focus on -- probably the
      > most important question -- is who are the most similar other players
      > and are they in the Hall?
      >
      > 2. How to measure consistency from season to season
      >
      > 3. How do we measure the accuracy of career projection methods?
      >
      > 4. How to make career projections by comparing players of similar
      > age to others
      >
      > 5. Salary negotiations
      >
      > 6. (A baseball specific thing, involving park factors)
      >
      > 7. Setting control groups for studies
      >
      > 8. Constructing theoretical models of players/teams and identifying
      > real players/teams similar to the model.
      >
      > ------------------------------------------
      >
      > We started discussing this over in APBR, but I think the details of
      > making this work can get technical, so I brought it here. I do think
      > this is a major missing factor in basketball and it frustrates me
      > that, as easy as it seems to do, I haven't been able to develop
      > something like this.
      >
      > Baseballreference.com has a list of players and who they are similar
      > to -- something that Robert and I have talked about doing for
      > basketball eventually. For instance, here is Roberto Alomar
      >
      > http://www.baseballreference.com/a/alomaro01.shtml
      >
      > Within that page are the list of players similar to him overall, 3 of
      > which are in the Hall. When comparing players at age 32 to Alomar,
      > the list shows 8 HOFers, the other two being Pete Rose and Ryne
      > Sandberg, suggesting that Alomar is on track to be in the Hall (even
      > as a batter, since these scores don't account for defense).
      >
      > This page
      >
      > http://www.baseballreference.com/about/similarity.shtml
      >
      > describes how the scores are calculated for baseball career #s. I'd
      > think that we could come up with a similar method for basketball.
      > The 86 book has the method for comparing seasons.
      >
      > One of the problems I had was with redundancy of stats. FG% is
      > reflected in FG and FGA, for example. James didn't worry about it
      > too much, but I do in basketball.
      >
      > To be clear, this is not a rating tool. It doesn't tell you who is
      > better or worse; it tells you who is similar. In the old argument of
      > Shawn Kemp, perhaps we find that the most similar players to him are
      > all out of the HOF -- then that suggests he isn't that great. Maybe
      > his best season compares with those put up by Wilt, KMalone, etc.,
      > suggesting great seasons.
      >
      > Finally, Greg Thomas took a stab at a method for player-careers back
      > in the spring Cage Chronicles:
      >
      > http://members.aol.com/bradleyrd/feb2001.html
      >
      > He did something like MikeT was suggesting, scaling points, etc. by
      > some average, then subtracting differences. Kinda interesting and
      > not a bad attempt, but some things I'd change/review:
      >
      > 1. Different scale than specified by James, I think. The scores are
      > particularly high (Wilt Chamberlain and Arvydas Sabonis have
      > similarity score of 952!!)
      >
      > 2. Uses only points, assists, and rebounds.
      >
      > 3. I think he looks at per minute #'s, not career totals.
      >
      > 4. He standardized by era.
      >
      > It's a good first attempt, but I think there is room for improvement.
      >
      > Has anyone done anything like this?
      >
      > (Another difficulty I've had is in making my Access db do these
      > calculations easily.)
      >
      >
      > Dean Oliver
      > Journal of Basketball Studies
      >
      >
      >
      > To unsubscribe from this group, send an email to:
      > APBR_analysis-unsubscribe@yahoogroups.com
      >
      >
      >
      > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
      >
      >
    • harlanzo@yahoo.com
      Message 2 of 16 , Sep 10, 2001
        To me, the problem that most jumps out in trying to create similarity
        scores in basketball is that the sum total of a basketball player's
        contributions is not necessarily reflected in his stats. In
        baseball each hit can be neatly quantified (i.e. a double is worth .5,
        a single .25, etc.). That being said, a study might still be
        interesting. I would suggest that, because of how the game has
        changed, any model should really be limited to a specific era (e.g.
        since the 90-91 season). The problem with cross-era comparisons is
        evident if you take an example of, say, Rolando Blackman and Allan
        Houston. They both seem like good shooter types who are very good at
        what they do but lacking in some other areas. Indeed, both averaged
        18-20 ppg and both had similar assist and rebound numbers
        per game. However, because (I think) the style of play in the two eras
        was different, Houston has a slightly lower shooting pct and many more
        3s (before 1989, Blackman never hit more than 6 threes). It is
        conceivable that if Houston played in the 80s or Blackman in the late
        90s, their numbers in these categories would be similar. While the
        points, assists, and rebounds numbers might seem similar superficially,
        the road they took to achieve these stats is very different,
        and I would think that any model that called them truly similar
        (without era adjustment) is not particularly accurate.

        So two important issues jump out at me. The first is deciding which
        areas are pertinent to weight when deciding whether players are
        similar. The second is the cross-era comparison, which is a
        whole other thorny issue. I will think about them, but no answer
        jumps out at me right this second.



      • Dean Oliver
        Message 3 of 16 , Sep 10, 2001
          --- In APBR_analysis@y..., "Michael K. Tamada" <tamada@o...> wrote:
          > Just some thoughts off the top of my head: I agree that this is something
          > that ought to be done for basketball. I'm a little leery of the work
          > that's been done in baseball, as Bill James' formula looks like a
          > hand-cooked one. I'm sure its results are reasonable, but I'd like to see
          > an approach that's more systematic.

          I've asked Bill about the approach he took and why he took it. I
          don't expect to hear back anytime soon.

          > There are several statistical techniques which I think are directly useful
          > here. Cluster analysis (the computer looks at the data and divides them
          > into groups); discriminant analysis and logistic regression (determine
          > the factors which predict which group a player will be in, e.g. Hall of
          > Fame vs non Hall of Fame); and probably the most useful of all:
          > Mahalanobis distance, or some variation thereof.

          > However, Mahalanobis distance does NOT take into account the second
          > problem, that certain variables might deserve more weight than others.
          > Maybe we WOULD want to "double count" PTs scored, given that it's
          > probably the single most important statistic for a player, or at least one
          > of the most important ones for distinguishing all-stars and hall of famers
          > from journeymen.

          First -- I don't know the Mahalanobis distance stuff. Sounds like
          multivariate regression, though. You may have to do this analysis or
          point me at software that does it. What is it going to spit out?
          Weights on the different stats?

          > Anyway, that's how I would approach the problem. For a first pass, I'd do
          > statistics per game, rather than per 48 minutes, per year, or per career
          > (career stats would be useless for comparing, say, Steve Francis to Isiah
          > Thomas, because Francis' career stats are still so low).

          We could compare Thomas' first 2 years to Francis', a very useful
          comparison.

          > I'd do straight
          > Mahalanobis distance at first, throwing in all variables (FTA, FTM, AND
          > FT%) and see if the results looked reasonable. If not, then at least I'd
          > have some coefficients to start with, and could start doubling or halving
          > some.

          I agree, I think (not knowing exactly what the coefficients mean).
          Getting a common starting point is the most important thing I want to
          get out of this discussion. Similarity scores are ultimately
          somewhat subjective, but if we can all start with the same set of
          numbers, at least we have a foundation.

          > Also, after the initial analysis, I'd want to put in some sort of
          > correction for era or game pace. Bob Cousy's 43% career FG% (or whatever
          > it was, I'm saying this off the top of my head) reminds me more of Isiah
          > Thomas's 46% than it does Allen Iverson's 43%. Despite the superficial
          > similarity of Cousy's and Iverson's FG%. (Again I'm not vouching for
          > those specific numbers, just saying that I'd rather see the numbers in
          > context, i.e. corrected for era and/or game pace.)

          I think we want to keep the era correction separate. I can't find
          where he said it, but I know James wanted to keep it separate.

          My one attempt at basketball similarity scores is buried somewhere.
          I looked at player-season comparisons back in '98. The motivation
          was identifying who was similar to Kobe Bryant, since there was so
          much controversy at the time about how good he was going to be.
          I remember finding a lot more self-similarity across players than
          cross-similarity (Bryant's 2nd year resembled his first more than
          it resembled a lot of other players' seasons, for example). The
          player who seemed most similar to Bryant at the time was Allen
          Iverson, but my #'s were weird. I'd say now that Bryant and Iverson
          aren't as similar as Bryant and Jordan or Bryant and VCarter, which I
          think points to factors I didn't consider -- height and position. (It
          may also point to false impressions. Jordan's early career numbers
          are MUCH better than Bryant's, even if you account for the 10% drop
          in pace and the 5-8% drop in offensive efficiency.)

          For kicks, here is Iverson 99 and Bryant 2001:

                    GS   MIN   FG    FGA   FG%    fg3m  fg3a  fg3%
          Iverson   1.0  41.5   9.1  22.0  0.412  1.2   4.1   0.291
          Bryant    1.0  40.9  10.3  22.2  0.464  0.9   2.9   0.305

                    FT   FTA  FT%    OR   DR   TR   AST  PF   DQ
          Iverson   7.4  9.9  0.751  1.4  3.5  4.9  4.6  2.0  0.0
          Bryant    7.0  8.2  0.853  1.5  4.3  5.9  5.0  3.3  0.0

                    STL   TO    BLK   PTS
          Iverson   2.29  3.48  0.15  26.8
          Bryant    1.68  3.24  0.63  28.5

          A priori, I'd like to call these two seasons in the 700-750 range on
          similarity scores. Easily identifiable similarities, but significant
          and obvious differences.
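
          For reference, the start-at-1000-and-subtract-penalties idea from
          the James description can be sketched as below, using a few of the
          per-game numbers above. The penalty rates are arbitrary
          placeholders, so the resulting score means nothing until the rates
          are calibrated against judgments like the 700-750 guess.

```python
def similarity_score(a, b, points_per_unit):
    """Start at 1000 and subtract a penalty for each stat difference."""
    score = 1000.0
    for stat, rate in points_per_unit.items():
        score -= abs(a[stat] - b[stat]) * rate
    return max(score, 0.0)

# Per-game figures taken from the stat lines above.
iverson_99 = {"pts": 26.8, "tr": 4.9, "ast": 4.6, "fg_pct": 0.412}
bryant_01 = {"pts": 28.5, "tr": 5.9, "ast": 5.0, "fg_pct": 0.464}

# Hypothetical penalty rates (points deducted per unit of difference).
rates = {"pts": 10.0, "tr": 10.0, "ast": 10.0, "fg_pct": 1000.0}

print(similarity_score(iverson_99, bryant_01, rates))
print(similarity_score(iverson_99, iverson_99, rates))  # 1000.0
```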

          For my #'s, I have (in per game stats) for Iverson and Bryant, resp:

                                                         Def.       Def.
                    ScPoss  Poss  Fl%    Ortg   PtsProd  Stops/Min  Stops/Poss  Def.Rtg.  Net Win%
          Iverson   12.8    25.0  0.511  106.6  26.7     0.182      0.484        97.3     0.820
          Bryant    12.8    24.4  0.524  110.9  27.0     0.191      0.494       103.0     0.773

          At about the same age, Bryant's offensive skills are more efficient
          than Iverson's, which we would probably all agree on. Defensively,
          it's hard to say.
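
          A quick check of how the offensive columns seem to relate,
          assuming Fl% is scoring possessions divided by possessions and
          Ortg is points produced per 100 possessions. That reading of the
          column headers is an assumption, and the inputs are the rounded
          per-game figures above.

```python
def floor_pct(scoring_possessions, possessions):
    """Share of possessions that end in a score (assumed meaning of Fl%)."""
    return scoring_possessions / possessions

def offensive_rating(points_produced, possessions):
    """Points produced per 100 possessions (assumed meaning of Ortg)."""
    return 100.0 * points_produced / possessions

# (ScPoss, Poss, PtsProd) per game, copied from the table above.
lines = {"Iverson": (12.8, 25.0, 26.7), "Bryant": (12.8, 24.4, 27.0)}
for name, (sc_poss, poss, pts_prod) in lines.items():
    print(name,
          round(floor_pct(sc_poss, poss), 3),
          round(offensive_rating(pts_prod, poss), 1))
```

          Recomputed from rounded inputs, the values land within a few
          tenths of the listed Fl% and Ortg, which is consistent with that
          reading but does not prove it.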

          Dean Oliver
          Journal of Basketball Studies
        • Dean Oliver
          Message 4 of 16 , Sep 10, 2001
            --- In APBR_analysis@y..., harlanzo@y... wrote:
            > This being the case a study might still be
            > interesting. I would suggest that because of how the game has
            > changed that any model should really be limited to a specific era (ie
            > since 90-91 season).

            I think we should start here, but not limit it this way.

            Your example of Houston and Blackman is interesting. I think players
            like Blackman did evolve into players like Houston, but their styles
            were/are different. There weren't many Houston-types in the '80's.
            We want to show that. In some cases, we may want to hide that, but
            we don't want to hide it all the time. It points out the problem we
            always have -- that players from the '60's are more similar to
            each other than they are to today's players. As much as we might
            like to compare Bob Pettit to Karl Malone, I'm sure plenty of players
            from the '50's-'60's are more similar to Pettit than Malone is.

            I would call the similarity (on a per game basis) between Houston and
            Blackman about an 800, just gut feel. Career-wise, Houston has a
            ways to go to get to Blackman's level. Even at age 30, it appears
            that Blackman had a bit better numbers.

            Here is the list of comparisons done for the newsletter with the
            original scores assigned and some of my subjective scores:

            Orig  MyEst  Players
            990   850    Isiah Thomas & Tim Hardaway
            986   800    Julius Erving & Elgin Baylor
            985          Mark Aguirre & Alex English
            984   850    Patrick Ewing & Alonzo Mourning
            976          Kareem Abdul-Jabbar & Bob Pettit
            976   800    David Robinson & Tim Duncan
            969          Willis Reed & Walt Bellamy
            968   850    Reggie Miller & Allan Houston
            968          Kevin Johnson & Stephon Marbury
            967          Oscar Robertson & Sam Cassell
            967   800    Bill Russell & Wes Unseld
            964   750    Karl Malone & Kareem Abdul-Jabbar
            964          Kareem Abdul-Jabbar & Bob Lanier
            963          David Robinson & Hakeem Olajuwon
            961          Isiah Thomas & Kevin Johnson
            959          Jo Jo White & Hal Greer
            958          Jerry West & Pete Maravich
            955          Walt Frazier & Penny Hardaway
            952   600    Wilt Chamberlain & Arvydas Sabonis
            951          Dominique Wilkins & John Drew
            949          Vince Carter & Kobe Bryant
            949   800    Isiah Thomas & Stephon Marbury
            943          Larry Bird & Chris Webber
            943          Kobe Bryant & Allen Iverson
            942          Rick Barry & John Havlicek
            938          Karl Malone & David Robinson
            938          Bill Laimbeer & Dikembe Mutombo
            935          Jerry West & Paul Westphal
            930          Larry Bird & Billy Cunningham
            929          Kareem Abdul-Jabbar & Charles Barkley
            929          Charles Barkley & Kareem Abdul-Jabbar
            927          Walt Frazier & Gary Payton
            927          Karl Malone & Bob Pettit
            924          Vince Carter & Allen Iverson
            922          Grant Hill & Elgin Baylor
            919          Bill Russell & Bill Walton
            918          Shaquille O'Neal & David Robinson
            917          Wilt Chamberlain & Kareem Abdul-Jabbar
            914          George Gervin & David Thompson
            903          Wilt Chamberlain & David Robinson
            902          Larry Bird & Elgin Baylor
            897   750    Jason Kidd & Magic Johnson
            888          Shaquille O'Neal & Hakeem Olajuwon
            887          Michael Jordan & Allen Iverson
            885          Charles Barkley & Karl Malone
            884          Michael Jordan & Vince Carter
            882          John Stockton & Larry Brown
            875          Jerry West & Oscar Robertson
            858   850    Shaquille O'Neal & Wilt Chamberlain
            852   800    Oscar Robertson & Magic Johnson
            848   750    Michael Jordan & Kobe Bryant
            830   750    Michael Jordan & Julius Erving
            263          Shaquille O'Neal & John Stockton
          • Michael K. Tamada
            Message 5 of 16 , Sep 12, 2001
              On Tue, 11 Sep 2001, Dean Oliver wrote:

              > --- In APBR_analysis@y..., "Michael K. Tamada" <tamada@o...> wrote:

              [...]

              > First -- I don't know of Mahalanobis distance stuff. Sounds like
              > multivariate regression, though. You may have to do this analysis or
              > point me at software that does it. What is it going to spit out?
              > Weights on the different stats?

              The weights are not scalar, but are instead implicitly contained in a
              matrix. I usually use a stat package called SPSS but I just checked and
              surprisingly, although Mahalanobis distance is calculated and used in a
              number of statistics that it calculates, it doesn't have a command for
              simply computing good ol' Mahalanobis distance.

              However, the formula for Mahalanobis distance is pretty simple. Let x and
              y be vectors of the variables that we're measuring, for two different
              players. E.g. "x" might be Magic's pts per game, assists per game, FG%,
              asst/TO ratio, min/game, etc. etc etc. "y" would be the same stats, but
              for Larry Bird.

              Let S stand for the covariance matrix of all players' stats. (For an
              example of how to calculate the elements of the covariance matrix, see
              http://www.itl.nist.gov/div898/handbook/pmc/section5/pmc541.htm )


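              Then the Mahalanobis distance is simply sqrt( (x-y)' S^-1 (x-y) ) in
              matrix notation. (The "x-y" are vectors, (x-y)' is the transpose of
              x-y, and S^-1 is the inverse of S. When S is the identity matrix this
              reduces to the Euclidean distance.)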
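
For anyone without a stat package handy, here is a minimal Python sketch of the calculation described above. It evaluates the quadratic form (x-y)' S^-1 (x-y) and takes the square root, so that with an identity S it agrees with the Euclidean formula. The function names and the five-player "league" are made up for illustration (the stat lines are not real player data), and it uses only the standard library, solving S z = (x-y) rather than inverting S explicitly.

```python
import math

def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length stat rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
             for j in range(k)] for i in range(k)]

def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    m = [list(a[i]) + [b[i]] for i in range(k)]  # augmented matrix [a | b]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            f = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * k
    for i in reversed(range(k)):  # back substitution
        z[i] = (m[i][k] - sum(m[i][j] * z[j] for j in range(i + 1, k))) / m[i][i]
    return z

def mahalanobis(x, y, s):
    """sqrt( (x-y)' S^-1 (x-y) ) for stat vectors x, y and covariance matrix s."""
    d = [xi - yi for xi, yi in zip(x, y)]
    z = solve(s, d)  # z = S^-1 (x-y)
    return math.sqrt(sum(di * zi for di, zi in zip(d, z)))

# Hypothetical per-game lines (pts, reb, ast); illustrative only.
league = [(20.0, 7.0, 10.0), (24.0, 10.0, 6.0), (15.0, 3.0, 9.0),
          (28.0, 6.0, 5.0), (18.0, 12.0, 2.0)]
S = covariance_matrix(league)
print(mahalanobis(league[0], league[1], S))
```

Because S carries the covariances, two stats that move together (say FG made and points) do not get counted twice, which is the redundancy problem raised earlier in the thread.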

              Here's a web-page with some other distance metrics:

              http://www.mathworks.com/access/helpdesk/help/toolbox/stats/pdist.shtml


              However all of these view all the variables as being essentially equally
              important. Hence the possible need for weighting. Or the use of some
              outside rating system or external criteria (e.g. Hall of Fame vs non Hall
              of Fame status, and we could use discriminant analysis or logistic
              regression to calculate the coefficients for predicting HoF status).


              [...career comparisons]

              >We could compare Thomas' first 2 years to Francis', a very useful
              >comparison.

              Yes, good point. Although if we're doing career totals, we'd presumably
              still want a correction for 82-game seasons vs 72-game seasons.
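
One simple way to make that schedule-length correction is to pro-rate each season total to a common number of games before summing. The sketch below assumes straight linear scaling to 82 games; that is only one possible correction, and the function name and default are hypothetical.

```python
def prorate_total(total, schedule_games, target_games=82):
    """Scale a season total from a schedule of `schedule_games` to `target_games`."""
    if schedule_games <= 0:
        raise ValueError("schedule_games must be positive")
    return total * target_games / schedule_games

# A 1440-point season in a 72-game schedule, expressed on an 82-game basis.
print(prorate_total(1440, 72))
```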

              > I think we want to keep the era correction separate. I can't find
              > where he said it, but I know James wanted to keep it separate.

              Yes, probably best done, as you suggest elsewhere, by having two sets of
              similarity stats: "absolute" and "relative" (or "corrected", or
              "standardized" or whatever we want to call them).


              --MKT
            • Mike Goodman
              Message 6 of 16 , Sep 12, 2001
                --- In APBR_analysis@y..., "Dean Oliver" <deano@t...> wrote:
                > We started discussing this over in APBR, but I think the details of
                > making this work can get technical, so I brought it here.

                Excellent move, Dean
                >
                > One of the problems I had was with redundancy of stats. FG% is
                > reflected in FG and FGA, for example. James didn't worry about it
                > too much, but I do in basketball.
                >
                This is one reason I have concentrated on combining all scoring-
                related data into one "scoring ability" number. It seems quite clear
                to me that "points is points", and likewise, attempts are attempts
                (or possessions used up). Hence "scoring efficiency", although I
                believe that term is also used in another way, and to me it implies
                that turnovers incurred while attempting to score, offensive fouls,
                and the "ability to get a shot off" are included; so I might
                prefer to call Pts/(Attempts*2) something like "scoring percentage".
                I also feel comfortable with using a player's ScoPct/.527
                (historical standard ScoPct) as a number to factor into a player's
                points-per-minute rate. I justify this by noting that a high-
                scoring, low-percent scorer on a weak team would just have to shoot
                less (and take higher-percentage shots) on a better team.
                Conversely, a low-scoring, high-percentage shooter on a good team
                would almost certainly be asked to take more shots on a weaker team.
                Generally, his percentage would go down, but possibly his "scoring
                ability" number would be fairly constant as he moves from team to
                team.
                Ty Corbin had such a career spell, as he went from a go-to guy on
                the woeful Wolves, to a contributor on the contending Jazz; his
                minutes and ppg went rollercoastering, but his measurable 'scoring
                ability' was pretty constant.
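
A rough Python sketch of the two quantities described above. The post does not spell out exactly how "attempts" are counted or how ScoPct/.527 is factored into the per-minute rate, so this assumes the caller supplies the attempts figure and that the factor is applied by simple multiplication; the function names and sample numbers are hypothetical.

```python
HISTORICAL_SCO_PCT = 0.527  # the "historical standard ScoPct" quoted in the post

def scoring_pct(points, attempts):
    """Pts / (Attempts * 2), as defined in the post."""
    return points / (attempts * 2)

def scoring_ability(points, attempts, minutes):
    """Points per minute scaled by ScoPct relative to the historical standard.

    Multiplying the two is an assumption about how the factor is applied.
    """
    return (points / minutes) * (scoring_pct(points, attempts) / HISTORICAL_SCO_PCT)

# Hypothetical players: a high-volume, low-percentage scorer and a
# low-volume, high-percentage scorer.
volume_scorer = scoring_ability(points=600, attempts=600, minutes=1000)
efficient_scorer = scoring_ability(points=600, attempts=500, minutes=1200)
print(volume_scorer, efficient_scorer)
```

With these made-up inputs the two players come out at essentially the same number, which is the kind of team-to-team stability the paragraph above is arguing for.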

                >....In the old argument of
                > Shawn Kemp, perhaps we find that the most similar players to him are
                > all out of the HOF -- then that suggests he isn't that great. Maybe
                > his best season compares with those put up by Wilt, KMalone, etc.,
                > suggesting great seasons.
                >
                In one member's standardized numbers, Kemp's career 'abilities' are:
                21 pts, 12 reb, 2 ast, 2 blk. This compares to Artis Gilmore, Moses
                Malone. But many fewer minutes for Kemp, and lesser totals.
              • Mike Goodman
                Message 7 of 16 , Sep 12, 2001
                  --- In APBR_analysis@y..., "Michael K. Tamada" <tamada@o...> wrote:
                  > ....discriminant analysis and logistical regression .... Euclidean distance,
                  > sqrt( X^2 + Y^2 + Z^2 + ...) where X, Y, Z, etc. are the difference
                  > between, say, Magic Johnson and Larry Bird ....., Mahalanobis
                  > distance does NOT take into account the second
                  > problem, that certain variables might deserve more weight than others.
                  >
                  A friend of mine says "Anyone who drives faster than me is a fukkin
                  maniac, and anyone who drives slower is a goddamn asshole".
                  Similarly, I say, anyone who uses less math than me is some kind of
                  moron, and whoever uses more must be some kind of geek.

                  > Also, after the initial analysis, I'd want to put in some sort of
                  > correction for era or game pace. Bob Cousy's 43% career FG% (or whatever
                  > it was, I'm saying this off the top of my head) reminds me more of Isiah
                  > Thomas's 46% than it does Allen Iverson's 43%. Despite the superficial
                  > similarity of Cousy's and Iverson's FG%. (Again I'm not vouching for
                  > those specific numbers, just saying that I'd rather see the numbers in
                  > context, i.e. corrected for era and/or game pace.)

                  Cousy never once managed to make 40% of his FG during a season; his
                  career scoring pct. was .440. (Iverson's is .500; Isiah's was .508).
                  >
                  > For Hall of Fame purposes, I think discriminant analysis or logistic or
                  > probit regressions are better than merely measuring distance. I did this
                  > once for NBA all-stars one season, the predictions were not 100% accurate
                  > but you could at least separate the players into three groups: clear
                  > all-stars, clear non-stars, and the "on the bubble" players.
                  >
                  >
                  > --MKT
                  >
                  >
                  Last season, the West selected my top 11 Western players to the
                  allstar team, but skipped #12 Nowitzki in favor of teammate Michael
                  Finley (#30 or thereabouts).
                  Meanwhile the East seemed to pick at random, ignoring most forwards
                  as they had ignored all point guards the year before.
                • Mike Goodman
                  Message 8 of 16 , Sep 14, 2001
                    --- In APBR_analysis@y..., "Michael K. Tamada" <tamada@o...> wrote:
                    >.... Euclidean distance,
                    > sqrt( X^2 + Y^2 + Z^2 + ...) where X, Y, Z, etc. are the difference
                    > between, say, Magic Johnson and Larry Bird in whatever variables we choose
                    > to look at.
                    >
                    > But there are problems with Euclidean distance, specifically one that
                    > Dean Oliver alludes to: some variables are redundant or
                    > partially redundant with each other,
                    > e.g. FG Made and Points Scored, or even Off Rebds and Def Rebds. Another
                    > problem is that not all variables are equally important: some probably
                    > should be given greater weight than others ...

                    I tried my hand at a variation of the Euclidean distance, since I can
                    understand the formula (and pronounce it, too).
                    I took 5 stats: scoring, rebounding, assists, steals, blocks. I used
                    my normalized (standardized) versions. Because points are much more
                    abundant than, say, steals, I reduced this difference by taking the
                    square root of each stat. I compared the top 31 players on my
                    infamous "alltime" list to the other 514 in the list. (I actually
                    ran out of columns in Excel, for the first time.)
                    The formula is drudgery to type, but it starts like this:
                    E = (sqrt(a1)-sqrt(b1))^2 + (sqrt(a2)-sqrt(b2))^2 +... and so on, up
                    to a5 and b5, for players a and b, and variables 1-5.
                    I did not take the square root of the whole thing, since everything
                    was already square-rooted once.
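
The formula as stated above can be sketched in a few lines of Python. This is only an illustration of the stated formula (square root of each stat, squared differences summed, no outer square root); the function names are made up, and the four stat lines are the rounded figures from the table in this post, so results need not match the E column exactly.

```python
import math

def sqrt_distance(a, b):
    """E = sum over stats of (sqrt(a_i) - sqrt(b_i))^2, with no final square root."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of stats")
    return sum((math.sqrt(x) - math.sqrt(y)) ** 2 for x, y in zip(a, b))

def closest_match(name, profiles):
    """Name of the other profile with the smallest distance to `name`."""
    target = profiles[name]
    others = (n for n in profiles if n != name)
    return min(others, key=lambda n: sqrt_distance(target, profiles[n]))

# (Sco, Reb, Ast, Stl, Blk), rounded figures copied from this post.
profiles = {
    "Hakeem Olajuwon": (23.7, 11.7, 2.6, 1.8, 3.2),
    "David Robinson":  (26.1, 11.8, 2.8, 1.5, 3.3),
    "Patrick Ewing":   (23.5, 11.1, 2.0, 1.0, 2.6),
    "Alonzo Mourning": (24.5, 10.9, 1.6, 0.7, 3.2),
}
print(closest_match("Hakeem Olajuwon", profiles))
```
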
                    Not surprisingly, the best players only correspond to other great
                    players, but some players have much more unique statistical profiles.
                    In order of "greatest distance from the next-closest profile", we
                    have:
                                         Sco   Reb   Ast    Stl   Blk       E
                    Michael Jordan      33.5   6.5   5.1    2.3    .9
                    Jerry West          25.1   4.2   6.0   (2.7    .9)   .945 (estimated)
                    No real surprise that Jordan is the "most unique" statistically.
                    Others scored more than West, but didn't have quality numbers beyond
                    that.
                    (Iverson is next, then Karl Malone(!), Kobe, Gervin, Erving, Bird,
                    Wilkins, Dantley, Barry)

                    Bill Russell        11.8  14.6   3.8   (1.5   4.0)
                    Bill Walton         15.9  12.8   4.0    1.0   2.7    .743
                    Really not very similar, but as close as anyone comes to Russell's
                    combination of skills.
                    (Thurmond is close 2nd, then Sam Lacey, Elmore Smith, Mutombo)

                    Magic Johnson       20.6   7.5  10.4    1.9    .4
                    Oscar Robertson     22.4   5.3   8.0   (1.5    .3)   .644
                    Magic was "the next Oscar", and then some.
                    (Grant Hill, Payton, Penny, Strickland, Isiah, Drexler, KJ, Frazier)

                    John Stockton       17.1   3.3  11.9    2.4    .2
                    Isiah Thomas        18.0   3.7   8.8    2.0    .3    .543
                    Stockton is just a giant in the assists category.
                    (Tim Hardaway, KJ, Strickland, Cousy, Kenny Anderson, Brandon)

                    Jerry West          25.1   4.2   6.0   (2.7    .9)
                    Allen Iverson       25.1   3.9   5.5    2.1    .2    .517
                    Now we have some real across-the-board similarity.
                    (Barry, Penny, Kobe, Drexler, Maravich, Oscar, Westphal)

                    Oscar Robertson     22.4   5.3   8.0   (1.5    .3)
                    Penny Hardaway      20.2   5.1   6.2    1.9    .6    .486
                    (KJ, Payton, Frazier, Cassell, Tim Hardaway, Price, Brandon, Magic)

                    Moses Malone        21.6  13.2   1.3     .9   1.4
                    Shawn Kemp          20.9  11.8   2.2    1.4   1.6    .470
                    (Parish, Gilmore, Reed, McDyess, Ewing, Hayes, Haywood, McAdoo)

                    Shaquille O'Neal    29.7  12.7   2.8     .7   2.6
                    Tim Duncan          25.1  12.0   3.0     .8   2.3    .466
                    (Kareem, Robinson, Mikan, Pettit, Ewing, Mourning, Wilt, Hakeem)

                    Artis Gilmore       20.3  11.9   2.3     .6   2.3
                    Patrick Ewing       23.5  11.1   2.0    1.0   2.6    .446
                    (Hayes, Parish, Derrick Coleman, Sabonis, McDyess, Kemp, Gallatin)

                    The remainder of the top 31 (and their closest match)

                    Kareem AbdulJab.    25.9  10.6   3.4    1.0   2.7
                    Tim Duncan          25.1  12.0   3.0     .8   2.3    .288
                    (Robinson, Pettit, Mikan, Ewing, Neil Johnston, Shaq, Hakeem)

                    Wilt Chamberlain    23.5  14.7   3.5   (1.5   3.0)
                    George Mikan        24.8  13.1   2.9   (1.3   2.0)   .432
                    (Hakeem, Robinson, Duncan, Pettit, Kareem, Ewing)

                    Karl Malone         28.1  11.2   3.4    1.4    .8
                    Charles Barkley     24.2  12.4   3.8    1.6    .8    .444
                    (Pettit, Johnston, Mikan, Baylor, Jeff Ruland, Bird, Duncan, McAdoo)

                    Hakeem Olajuwon     23.7  11.7   2.6    1.8   3.2
                    David Robinson      26.1  11.8   2.8    1.5   3.3    .275

                    Julius Erving       23.0   7.8   4.0    1.9   1.7
                    Elgin Baylor        22.5   9.6   3.9   (1.6   1.5)   .347
                    (Webber, Marques Johnson, Shareef, Johnston, Lanier, Ed Macauley,
                    Schayes, Garnett, Bird, Drexler)

                    Patrick Ewing       23.5  11.1   2.0    1.0   2.6
                    Alonzo Mourning     24.5  10.9   1.6     .7   3.2    .332

                    Bob Pettit          24.2  11.7   2.8   (1.3   1.8)
                    George Mikan        24.8  13.1   2.9   (1.3   2.0)   .231

                    Elgin Baylor        22.5   9.6   3.9   (1.6   1.5)
                    Chris Webber        21.1  10.1   4.2    1.5   1.8    .215
                    (Lanier, Erving, Schayes, Johnston, Shareef, Garnett, Pettit, McAdoo)

                    Scottie Pippen      18.4   7.5   5.4    2.1    .9
                    Clyde Drexler       20.6   6.7   5.5    2.1    .7    .306
                    (Alvan Adams, Connie Hawkins, Toni Kukoc, Billy C., Grant Hill,
                    Antoine Walker, Marques Johnson, Penny, Cliff Hagan)

                    Clyde-Scottie likewise

                    Robert Parish       18.1  11.4   1.5     .9   1.8
                    Elvin Hayes         17.8  10.9   1.7    1.0   2.6    .161
                    (Gallatin, McDyess, Seikaly, Reed, Larry Foust, Dan Roundfield,
                    Sampson, Haywood, Brian Grant)

                    Bob Lanier          21.4  10.5   3.3    1.2   1.7
                    Dolph Schayes       20.0  10.1   3.1   (1.4   1.6)   .194

                    (Elvin Hayes-Robert Parish match)

                    Rick Barry          21.9   5.5   4.5    2.1    .5
                    Kobe Bryant         23.0   5.2   4.2    1.4    .8    .345
                    (Chris Mullin, Drexler, Hagan, Moncrief, Penny, Ray Allen)

                    Kevin McHale        22.1   8.6   1.8     .4   2.0
                    Rik Smits           19.9   8.3   1.8     .6   1.6    .306
                    (Lovellette, Darryl Dawkins, Haywood, McAdoo, Yardley, McDyess)

                    (George Mikan-Bob Pettit)

                    Dan Issel           21.1   8.5   2.2    1.1    .6
                    Terry Cummings      19.1   9.3   2.2    1.3    .7    .280
                    (Chambers, Ceballos, Calvin Natt, Shareef, Yardley, Glenn Robinson)

                    Clearly, as one goes down the list into more "ordinary" players,
                    there is a proliferation of close profiles.


                    Mike Goodman

                  • harlanzo@yahoo.com
                    Message 9 of 16 , Sep 15, 2001
                      It occurred to me that when comparing players through their
                      statistics, we might weight the comparisons so that some
                      statistics are more important based on position. For example, when
                      comparing point guards, the assist category might be more important
                      for weighing similarity than the rebound category. Conversely, do we
                      really care whether two centers have similar assist numbers if their
                      points, rebounds, and FG% are similar? I think this sounds somewhat
                      right with some notable exceptions. The counter argument of course
                      is that centers who pass well (a la Walton) or shoot 3s well
                      (Laimbeer and Sikma) are unique and the similarity scores will help
                      identify players with similar rare skill sets. (To digress, I wonder
                      if Jason Kidd and some of the Darrell Walker early 90s seasons are
                      comparable). I am beginning to babble but I think that the question
                      I am asking is whether positional demands should change how we weight
                      statistical categories when we try to apply similarity scores?
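
To make the question concrete, here is a small Python sketch of a position-weighted comparison. It uses a weighted mean absolute difference purely as an example; the weights, stat names and player lines are all hypothetical and are not proposed values.

```python
STATS = ("pts", "reb", "ast")

# Hypothetical weights by position, for illustration only.
POSITION_WEIGHTS = {
    "PG": {"pts": 1.0, "reb": 0.5, "ast": 2.0},
    "C":  {"pts": 1.0, "reb": 2.0, "ast": 0.5},
}

def weighted_difference(a, b, weights):
    """Weighted mean absolute difference between two stat lines."""
    total_weight = sum(weights[s] for s in STATS)
    return sum(weights[s] * abs(a[s] - b[s]) for s in STATS) / total_weight

# Two made-up players who differ only in assists.
player_a = {"pts": 18.0, "reb": 8.0, "ast": 6.0}
player_b = {"pts": 18.0, "reb": 8.0, "ast": 2.0}

as_guards = weighted_difference(player_a, player_b, POSITION_WEIGHTS["PG"])
as_centers = weighted_difference(player_a, player_b, POSITION_WEIGHTS["C"])
print(as_guards, as_centers)
```

The same four-assist gap counts for much more when the pair is judged as point guards than as centers, which is exactly the trade-off being asked about: a passing center like Walton would look less distinctive under center weights.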


                      --- In APBR_analysis@y..., "Mike Goodman" <msg_53@h...> wrote:
                      > [...]
                    • deano@tsoft.com
                      Message 10 of 16 , Sep 16, 2001
                        --- In APBR_analysis@y..., harlanzo@y... wrote:
                        > It occurred to me that when comparing players through their
                        > statistics should we be weighting the comparisons so that some
                        > statistics are more important based on positions?

                        Yes and No. What we're trying to come up with here is a general set
                        of rules that can be applied by default (as a basis for studies,
                        which can then be modified). James always said that the method's blessing
                        and curse was its flexibility. We SHOULD modify it for specific
                        comparisons -- perhaps among point guards. There will always be a
                        lot of different versions around, but we want one set for general
                        comparisons, in part because, using your example, we can't
                        necessarily identify who point guards are.

                        I also thought of a reason not to use Euclidean distance -- it
                        weights big differences too much. At least that is the subjective
                        opinion a lot of times. It's the old argument between standard
                        deviation and mean absolute difference -- the first weights big
                        differences a lot but is mathematically easier, but the second seems
                        to reflect more of what we want. The similarity scores, as James did
                        them and as I modified them, fit into the mean absolute difference
                        category. In Mike's categories, then, this implies that there is
                        likely one very big difference between Jordan's numbers and everyone
                        else (probably scoring average) -- that gets emphasized, making him
                        the most unique player. I'd like to take a stab at career similarity
                        scores using the approach I've outlined to see whether it id's Jordan
                        as most unique, too.
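
A minimal sketch of the contrast drawn above between Euclidean distance and a mean-absolute-difference measure. The stat lines are hypothetical (points, rebounds, assists, steals, blocks), and the function names are illustrative, not taken from any posted spreadsheet. The two comparison players are built so that their total absolute difference from the base player is the same: one differs by 10 in a single category, the other by 2 in all five.

```python
import math

def euclidean(a, b):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_abs_diff(a, b):
    """Average of the absolute differences across categories."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# Hypothetical stat lines: points, rebounds, assists, steals, blocks.
base = [20.0, 8.0, 4.0, 1.5, 1.0]
one_big_gap = [30.0, 8.0, 4.0, 1.5, 1.0]       # 10 apart in one category
many_small_gaps = [22.0, 10.0, 6.0, 3.5, 3.0]  # 2 apart in every category

for label, other in [("one big gap", one_big_gap),
                     ("many small gaps", many_small_gaps)]:
    print(label, round(euclidean(base, other), 3),
          round(mean_abs_diff(base, other), 3))
```

With these made-up numbers the mean absolute difference treats both players as equally distant from the base, while the Euclidean distance rates the single large gap as more than twice as far, which is the "weights big differences too much" effect.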

                        MikeG -- While I like the comparisons you did, there are 2 comments I
                        would make:

                        1. I'd like to see some non-standardized comparisons. I do like the
                        standardized because they make some sense, but I think
                        non-standardized will also tell a story.

                        2. You really need some comparison of shooting percentages and
                        turnovers. It really caught my eye with the Duncan-Kareem
                        comparison. I see some similarity between these two, but there are
                        big differences in offensive efficiency. Kareem was nearly
                        unstoppable offensively - my floor%'s and offensive efficiencies
                        reflect that. Duncan is very stoppable, his offensive rating and
                        floor percentage blending in to be about average. Kareem fell to
                        average offensively only in his last year. (I also don't think that
                        Kareem was the defensive force that Duncan is, but my memories are
                        biased by the Kareem post-'80, when he wasn't as good as he was when
                        younger.)

                        Dean Oliver
                        Journal of Basketball Studies


                      • msg_53@hotmail.com
                        Message 11 of 16 , Sep 16, 2001
                          Personally, I don't ever consider 'position' to be a quantifiable
                          statistic. Many forwards have been forced to play center; many
                          forwards are not clearly 'power' or 'small' forwards; many players
                          are not exclusively guards or forwards; many versatile guards do
                          plenty of scoring, passing, and rebounding.
                          The possible fragmenting of these lists is virtually infinite. An
                          assist from a center is exactly as important as an assist from a
                          guard. A rebounding guard, a center who gets steals as well as
                          blocks, all these things make a player unique, or at least
                          differentiate him from the norm.
                          The issue of 3-point shooting might be worth looking into. How one
                          goes about racking up one's scoring totals is of some interest. Then
                          again, it might invite breaking down points into dunks, layups, etc.
                          In the end, points are points. A player's scoring may come from
                          inside moves when he is young, and from outside shots later. The
                          contribution is still the same.
                          One thing these similarity indexes do reveal is that there are
                          some 'classic' profiles by position. Wilt, Kareem, Hakeem, Shaq,
                          Robinson, Ewing, Moses, and Gilmore all averaged 22-28 pts, 12-15 reb,
                          2-3 blocks. But the well-rounded centers seem to have enjoyed more
                          success.
                          The demands of one's position are somewhat situational. The best
                          players can usually do whatever is most needed.

                          --- In APBR_analysis@y..., harlanzo@y... wrote:
                          > It occurred to me that when comparing players through their
                          > statistics should we be weighting the comparisons so that some
                          > statistics are more important based on positions? For example, when
                          > comparing point guards the assist category might be more important
                          > for weighing similarity than rebound category. Conversely, do we
                          > really care whether two centers have similar assist numbers if their
                          > points, rebounds, and fg % are similar? I think this sounds somewhat
                          > right with some notable exceptions. The counter argument of course
                          > is that centers who pass well (a la Walton) or shoot 3s well
                          > (Laimbeer and Sikma) are unique and the similarity scores will help
                          > identify players with similar rare skill sets. (To digress, I wonder
                          > if Jason Kidd and some of the Darrell Walker early 90s seasons are
                          > comparable). I am beginning to babble but I think that the question
                          > I am asking is whether positional demands should change how we weight
                          > statistical categories when we try to apply similarity scores?
                        • msg_53@hotmail.com
                          Message 12 of 16 , Sep 16, 2001
                            --- In APBR_analysis@y..., deano@t... wrote:
                            >..... a reason not to use Euclidean distance -- it
                            > weights big differences too much. At least that is the subjective
                            > opinion a lot of times. It's the old argument between standard
                            > deviation and mean absolute difference -- the first weights big
                            > differences a lot but is mathematically easier, but the second seems
                            > to reflect more of what we want.

                            I operate under the assumption that points and rebounds are equally
                            important as contributions; so are steals and blocks, but almost
                            everyone gets fewer than 2-3 of these, so it seems fair to weigh them
                            less. Taking the standard deviation from the mean gives you the
                            burden of assigning a weight to the statistical category. I avoid
                            this by presuming that bigger numbers imply bigger weights. That
                            is, scoring is and should be more important than, say, steals.
                            (I did reduce the 'difference' factor by taking their square roots.)
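
A rough sketch of the square-root variant described earlier in the thread, E = (sqrt(a1)-sqrt(b1))^2 + (sqrt(a2)-sqrt(b2))^2 + ..., with no outer square root. The function name is illustrative. The two stat lines are the rounded standardized figures posted for Jordan and West (West's steals and blocks were marked as estimated), so the printed value may not reproduce the E figure posted with the list.

```python
import math

def sqrt_distance(a, b):
    """Sum of squared differences between the square roots of each stat.

    Taking square roots first shrinks the gap between abundant stats
    (points) and scarce ones (steals, blocks) before they are compared.
    """
    return sum((math.sqrt(x) - math.sqrt(y)) ** 2 for x, y in zip(a, b))

# Scoring, rebounds, assists, steals, blocks (rounded, as posted).
jordan = [33.5, 6.5, 5.1, 2.3, 0.9]
west = [25.1, 4.2, 6.0, 2.7, 0.9]

print(round(sqrt_distance(jordan, west), 3))
```

Because each category enters through its square root, a difference of a few points per game still counts for more than a difference of a few tenths of a steal, but by a much smaller margin than it would on the raw numbers.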

                            > The similarity scores, as James did
                            > them and as I modified them, fit into the mean absolute difference
                            > category. In Mike's categories, then, this implies that there is
                            > likely one very big difference between Jordan's numbers and everyone
                            > else (probably scoring average) -- that gets emphasized, making him
                            > the most unique player. I'd like to take a stab at career similarity
                            > scores using the approach I've outlined to see whether it id's Jordan
                            > as most unique, too.
                            >
                            > MikeG -- While I like the comparisons you did, there are 2 comments I
                            > would make:
                            >
                            > 1. I'd like to see some non-standardized comparisons. I do like the
                            > standardized because they make some sense, but I think
                            > non-standardized will also tell a story.

                            Dean, you could do raw averages, but players from the 60s would only
                            compare to players in the 60s. Actually, a great rebounder in the
                            90s would seem to compare to an average rebounder in the 60s, for
                            example.
                            I don't have a ready database of raw averages.

                            > 2. You really need some comparison of shooting percentages and
                            > turnovers. It really caught my eye with the Duncan-Kareem
                            > comparison. I see some similarity between these two, but there are
                            > big differences in offensive efficiency. Kareem was nearly
                            > unstoppable offensively - my floor%'s and offensive efficiencies
                            > reflect that. Duncan is very stoppable, his offensive rating and
                            > floor percentage blending in to be about average. Kareem fell to
                            > average offensively only in his last year. (I also don't think that
                            > Kareem was the defensive force that Duncan is, but my memories are
                            > biased by the Kareem post-'80, when he wasn't as good as he was when
                            > younger.)
                            >
                            > Dean Oliver
                            > Journal of Basketball Studies

                            Shooting percentages are part of what determines my standardized
                            scoring rate, along with game pace (defined as points allowed). I
                            only did career totals, so Kareem's incredibly long career has been
                            smoothed over, and his very dominant early seasons are not truly
                            reflected. Maybe Duncan has peaked, and his career averages really
                            won't rank close to Kareem's.
                            Further, Duncan's offensive numbers, in my system, get a big boost
                            from his being on a great defensive team. You have to agree his
                            offensive strength is way above average on his team. In other words,
                            the go-to guy on the championship Spurs is going to rate favorably to
                            the go-to guy on the champion Bucks from 30 years before, in my
                            system.

                            Mike Goodman
                          • Dean Oliver
                            Message 13 of 16 , Sep 17, 2001
                              --- In APBR_analysis@y..., msg_53@h... wrote:
                              > > 1. I'd like to see some non-standardized comparisons. I do like the
                              > > standardized because they make some sense, but I think
                              > > non-standardized will also tell a story.
                              >
                              > Dean, you could do raw averages, but players from the 60s would only
                              > compare to players in the 60s. Actually, a great rebounder in the
                              > 90s would seem to compare to an average rebounder in the 60s, for
                              > example.
                              > I don't have a ready database of raw averages.
                              >

                              I think this is what I was interested in. I was curious who from
                              today would fit in the '60's. Or, more interestingly, who from the
                              '70's might fit in today's game. Are West's raw #'s similar to
                              Iverson's or to Richmond's? What happens in baseball is that
                              outstanding players tend to be dissimilar to other players in their
                              era, but similar to outstanding players of other eras. I doubt
                              that this would happen in basketball, using raw #'s, because of the
                              style change. You seem to be saying the same thing.

                              (I didn't realize that you don't have a db of raw#'s.)

                              > > 2. You really need some comparison of shooting percentages and
                              > > turnovers. It really caught my eye with the Duncan-Kareem
                              > > comparison. I see some similarity between these two, but there are
                              > > big differences in offensive efficiency. Kareem was nearly
                              > > unstoppable offensively - my floor%'s and offensive efficiencies
                              > > reflect that. Duncan is very stoppable, his offensive rating and
                              > > floor percentage blending in to be about average. Kareem fell to
                              > > average offensively only in his last year. (I also don't think that
                              > > Kareem was the defensive force that Duncan is, but my memories are
                              > > biased by the Kareem post-'80, when he wasn't as good as he was when
                              > > younger.)
                              >
                              > Shooting percentages are part of what determines my standardized
                              > scoring rate, along with game pace (defined as points allowed). I
                              > only did career totals, so Kareem's incredibly long career has been
                              > smoothed over, and his very dominant early seasons are not truly
                              > reflected.

                              One of my personal quibbles with all the tendex-like rating systems
                              out there is that they do combine offensive with defensive
                              contributions. There is a big difference in my mind between Moses
                              Malone, who was an offensive force, and Hakeem Olajuwon, who has been
                              dominant defensively. Both were good in the other thing, but
                              dominant in just one. Kareem was dominant offensively (and probably
                              defensively) early on. Duncan has been dominant defensively, not
                              offensively. (Duncan appears to have more of the competitive fight
                              than Kareem, but, again, I missed the early Kareem.)

                              > Maybe Duncan has peaked, and his career averages really
                              > won't rank close to Kareem's.

                              I don't think I'd say that Duncan's peaked. He's been pretty
                              remarkably consistent since entering the league. Maybe it's only
                              remarkable that he stayed in school long enough to actually be ready
                              for the league when entering.

                              > Further, Duncan's offensive numbers, in my system, get a big boost
                              > from his being on a great defensive team. You have to agree his
                              > offensive strength is way above average on his team.

                              Depending on how you define "average", but, yeah, Duncan looks better
                              offensively than he really is because he plays on a great defensive
                              team. (He would make most teams better defensively, too.)

                              > Personally, I don't ever consider 'position' to be a quantifiable
                              > statistic.

                              James assigned numbers to positions for defensive purposes (a
                              shortstop is much more valuable to a defense than a 1st baseman, for
                              example). That might be necessary for some of the older guys because
                              defensive stats really don't exist in the '60's and early '70's. But
                              we can probably still assume that a center was the most important
                              defensive player back then, as he is now. This gets adequately
                              reflected in blocks, steals, and defensive boards, but you do need
                              those #'s.

                              > An assist from a center is exactly as important as an assist from a
                              > guard.

                              Only a minor point here -- this is not precisely true (though
                              probably true enough for government work). Assists from guards tend
                              to be more valuable. This is because they often have to make the
                              tougher pass than big men. The weight on an assist is proportional
                              to the expected FG% of the guy he passes to. Historically, big men
                              have had higher FG% than guards -- hence their assists are weighted
                              less. (The assists of the best shooting player on a team are less
                              valuable than the assists of the guys getting him the ball.) This
                              has changed with the 3 pt shot, but it's a conversion from FG% to
                              effective FG%...
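
A small sketch of the assist-weighting idea above, under stated assumptions: the weight is taken to be directly proportional to the recipient's expected shooting percentage, with a hypothetical scale factor of 1, and the shooting percentages are made up. The effective FG% function uses the standard definition, which credits each made 3-pointer as 1.5 field goals.

```python
def effective_fg_pct(fgm, fg3m, fga):
    """Standard effective FG%: (FGM + 0.5 * 3PM) / FGA."""
    return (fgm + 0.5 * fg3m) / fga

def assist_weight(recipient_fg_pct, scale=1.0):
    """Assist value proportional to the recipient's expected FG%.

    `scale` is a hypothetical proportionality constant.
    """
    return scale * recipient_fg_pct

# Hypothetical shooters: a big man finishing at .550, a guard at .450.
pass_to_big_man = assist_weight(0.550)  # pass received by the big man
pass_to_guard = assist_weight(0.450)    # pass received by the guard

print(pass_to_big_man, pass_to_guard)
print(effective_fg_pct(fgm=400, fg3m=100, fga=1000))
```

Once 3-point shooting enters, the same weight function could be fed effective FG% in place of FG%, which is the conversion mentioned at the end of the paragraph.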

                              Dean Oliver
                              Journal of Basketball Studies
                            • Mike Goodman
                              Message 14 of 16 , Sep 18, 2001
                                --- In APBR_analysis@y..., "Dean Oliver" <deano@t...> wrote:
                                > (I didn't realize that you don't have a db of raw#'s.)
                                >
                                My raw totals and per-game averages are contained in my 'season'
                                files, along with team totals and averages for that season. My
                                composite lists only have the 'standardized' rates. From those
                                rates, I can generate 'equivalent totals'. For 'average'
                                scoring/rebounding teams, these would be equal to raw season totals.

                                >
                                > One of my personal quibbles with all the tendex-like rating systems
                                > out there is that they do combine offensive with defensive
                                > contributions. There is a big difference in my mind between Moses
                                > Malone, who was an offensive force, and Hakeem Olajuwon, who has been
                                > dominant defensively. Both were good in the other thing, but
                                > dominant in just one. Kareem was dominant offensively (and probably
                                > defensively) early on. Duncan has been dominant defensively, not
                                > offensively. (Duncan appears to have more of the competitive fight
                                > than Kareem, but, again, I missed the early Kareem.)

                                I get your point, Dean, but your examples don't seem the clearest.
                                Olajuwon is better than Malone because he has all the offense Malone
                                had PLUS defense. Never seen the Dream shake?
                                Duncan has virtually all the offense Kareem had, averaged over their
                                careers, according to my numbers. Kareem did maintain a great
                                shooting pct., but Duncan plays in an era of universally-tough D.

                                > I don't think I'd say that Duncan's peaked. He's been pretty
                                > remarkably consistent since entering the league. Maybe it's only
                                > remarkable that he stayed in school long enough to actually be ready
                                > for the league when entering.

                                Some guys enter the league at full strength: Wilt, Oscar, Kareem,
                                Robinson, never improved beyond their first 3 years. Others start as
                                near-superstars, then several years along suddenly shift into true
                                superstar mode: Magic, Bird, Olajuwon, ...

                                >
                                > Depending on how you define "average", but, yeah, Duncan looks better
                                > offensively than he really is because he plays on a great defensive
                                > team. (He would make most teams better defensively, too.)

                                Don't know how a guy 'looks better than he really is', DeanO.

                                > Assists from guards tend
                                > to be more valuable. This is because they often have to make the
                                > tougher pass than big men. The weight on an assist is proportional
                                > to the expected FG% of the guy he passes to. Historically, big men
                                > have had higher FG% than guards -- hence their assists are weighted
                                > less. (The assists of the best shooting player on a team are less
                                > valuable than the assists of the guys getting him the ball.) This
                                > has changed with the 3 pt shot, but it's a conversion from FG% to
                                > effective FG%...
                                >
                                > Dean Oliver
                                > Journal of Basketball Studies

                                This is fun, splitting hairs!
                                If your center kicks out 3 nice passes to guards, who only hit one of
                                the 3 shots, the center only gets one assist.
                                The guard can make 3 nice passes inside, 2 of which may be converted,
                                so he gets 2 assists.
                                So an equally valid argument is that assists from guards
                                are 'easier', and assists from centers are 'undercounted'.
                                I say they are equal.
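The counting in that example can be written out as a small sketch. The pass outcomes below are just the ones from the example above, and `credited_assists` is a hypothetical helper name, not anything from an actual scoring system:

```python
def credited_assists(shot_results):
    """Count assists: a pass only earns one if the resulting shot is made."""
    return sum(1 for made in shot_results if made)


# Outcomes of the shots that followed each pass (True = shot made).
center_kickouts = [True, False, False]   # 3 passes out to guards, 1 converted
guard_entries = [True, True, False]      # 3 passes inside, 2 converted

center_assists = credited_assists(center_kickouts)
guard_assists = credited_assists(guard_entries)

print(center_assists, guard_assists)
```

Both players made three passes of similar quality; the box score only shows the difference in how often the receivers converted.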

                                Perhaps more to the issue, evaluate which players make those
                                practical passes which may or may not get them an assist, versus
                                those who will not give up the ball unless it gets them an assist. I
                                can't discern the 2 types from the statistics, but I know it when I
                                see it. (It might be partly discernible in that old assist/turnover
                                ratio.)


                                Mike Goodman
                              • Dean Oliver
                                Message 15 of 16 , Sep 18, 2001
                                  --- In APBR_analysis@y..., "Mike Goodman" <msg_53@h...> wrote:
                                  > > One of my personal quibbles with all the tendex-like rating systems
                                  > > out there is that they do combine offensive with defensive
                                  > > contributions. There is a big difference in my mind between Moses
                                  > > Malone, who was an offensive force, and Hakeem Olajuwon, who has been
                                  > > dominant defensively. Both were good in the other thing, but
                                  > > dominant in just one. Kareem was dominant offensively (and probably
                                  > > defensively) early on. Duncan has been dominant defensively, not
                                  > > offensively. (Duncan appears to have more of the competitive fight
                                  > > than Kareem, but, again, I missed the early Kareem.)
                                  >
                                  > I get your point, Dean, but your examples don't seem the clearest.
                                  > Olajuwon is better than Malone because he has all the offense Malone
                                  > had PLUS defense. Never seen the Dream shake?
                                  > Duncan has virtually all the offense Kareem had, averaged over their
                                  > careers, according to my numbers. Kareem did maintain a great
                                  > shooting pct., but Duncan plays in an era of universally-tough D.

                                  Olajuwon was very solid offensively (not stellar, like Kareem) -- I
                                  didn't mean to imply otherwise. Malone was just the epitome of a good
                                  offensive center who wasn't that good defensively. Rik Smits is
                                  another example of the poor defensive type who can score (not as well
                                  as Olajuwon/Moses). Olajuwon is very DISSIMILAR to these guys because
                                  he is much better defensively. Similarity is all I'm trying to
                                  capture, not quality.

                                  I looked at Duncan's offensive #'s last night and his offensive rating
                                  has been between about 104 and 108 since entering the league, when
                                  average offensive ratings have been between about 100 and 103. He's a
                                  little more efficient than average. My recollection is that Kareem's #'s
                                  were about 115 in the early '80s, when average was about 106-108 --
                                  relatively higher than Duncan's. Again, these two players just don't
                                  seem very SIMILAR to me. I would think of David Robinson as more
                                  similar to Kareem. Or possibly Olajuwon. Probably Wilt. Not
                                  Russell.
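A rough sketch of that comparison, using only the approximate figures quoted above. Taking the midpoint of each range is an assumption made here for illustration, and `midpoint` and `edge_over_league` are hypothetical helper names:

```python
def midpoint(low, high):
    """Midpoint of an approximate range (an assumption, for illustration only)."""
    return (low + high) / 2


def edge_over_league(player_rating, league_rating):
    """Points per 100 possessions above the league-average offensive rating."""
    return player_rating - league_rating


# Duncan: rating about 104-108, league average about 100-103.
duncan_edge = edge_over_league(midpoint(104, 108), midpoint(100, 103))

# Kareem, early '80s: rating about 115, league average about 106-108.
kareem_edge = edge_over_league(115, midpoint(106, 108))

print(duncan_edge, kareem_edge)
```

On these rough numbers Kareem sits further above his league average than Duncan does above his, which is the sense of "relatively higher" here.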

                                  > >
                                  > > Depending on how you define "average", but, yeah, Duncan looks better
                                  > > offensively than he really is because he plays on a great defensive
                                  > > team. (He would make most teams better defensively, too.)
                                  >
                                  > Don't know how a guy 'looks better than he really is', DeanO.
                                  >

                                  Another way of saying that the hype on Duncan has been a little
                                  extreme. Put him on the Hawks last year and, while he's better than
                                  Mutombo offensively, the team still wouldn't have scored much. They
                                  would have been pretty close to as good defensively as they were with
                                  Mutombo (or better), but they wouldn't be an offensive threat. I
                                  don't think Kareem ever played on a weak offensive team.

                                  > This is fun, splitting hairs!
                                  > If your center kicks out 3 nice passes to guards, who only hit one of
                                  > the 3 shots, the center only gets one assist.
                                  > The guard can make 3 nice passes inside, 2 of which may be converted,
                                  > so he gets 2 assists.
                                  > So an equally valid argument is that assists from guards
                                  > are 'easier', and assists from centers are 'undercounted'.
                                  > I say they are equal.
                                  >
                                  > Perhaps more to the issue, evaluate which players make those
                                  > practical passes which may or may not get them an assist, versus
                                  > those who will not give up the ball unless it gets them an assist.

                                  The goal is to identify when a good pass is made. Generally a better
                                  pass is one made to a better shooter. That's all I try to capture. I
                                  capture it in formulas with teammate FG%. For years, I didn't worry
                                  about it and it really didn't matter much. Now I've got more
                                  sophisticated calculation devices. I've actually found that this
                                  adjustment makes the most difference when evaluating different levels
                                  of basketball (high school, college, women's).
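A minimal sketch of the teammate-FG% idea, not the actual formulas: it assumes the weight on an assist is simply a constant times the receiver's shooting percentage. The `scale` constant, the function names, and the season lines are all made up for illustration; only the effective FG% definition (a made three counts as 1.5 field goals) is standard.

```python
def effective_fg_pct(fgm, fg3m, fga):
    """Effective FG%: a made three counts as 1.5 made field goals."""
    return (fgm + 0.5 * fg3m) / fga


def assist_weight(receiver_pct, scale=1.0):
    """Weight on one assist, proportional to the receiver's shooting pct.

    `scale` is a hypothetical constant; no value is given in the thread.
    """
    return scale * receiver_pct


# Made-up season lines: (FG made, 3-pt FG made, FG attempted).
center_pct = effective_fg_pct(400, 0, 720)    # big man, no threes
guard_pct = effective_fg_pct(350, 90, 800)    # guard, some threes

# The guard's assists mostly go to the center, and vice versa.
guard_assist_weight = assist_weight(center_pct)
center_assist_weight = assist_weight(guard_pct)

print(round(guard_assist_weight, 3), round(center_assist_weight, 3))
```

With these invented numbers the guard's assist is weighted more, because the player he passes to shoots better; using effective FG% rather than plain FG% narrows the gap when the guard hits threes.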

                                  Dean Oliver
                                  Journal of Basketball Studies