--- In APBR_analysis@y..., "Michael K. Tamada" <tamada@o...> wrote:
> > Interesting that you raise this example. It was the example I used in
> > proposing a PhD topic in environmental policy many years ago. How do you
> > assign/credit blame for things? I came up with a probabilistic model
> > that I'll explain briefly with the 2-person example -- the drunk and
> > the guy who got hit. On a normal night, let's say that there is a 0.01%
> > chance of getting hit by a drunk driver. In that case, the drunk is
> > 99.99% to blame for the accident. Let's say it's a night of a big win by
> > the home football team, so the chances of getting hit by a drunk are
> > now, say, 5%. Then the drunk is only 95% to blame for the accident. The
> Hmm, but even in the simple case when there are only two parties: those
> probabilities are the ones from the point of view of the victim. There's
In fact, you actually do have to include the probability of any person just
going out and driving (or being in a car) to be proper with the method.
> also the probability of driving around drunk and getting involved in an
> accident -- maybe say 10%, somewhat higher on big football nights. In
> this case this is a person most of us would agree was at fault, so maybe
> we can ignore that 10% and concentrate on the .01% or 5%. But for
> ordinary accidents, or events where the culpability is less clear, which
> probability do we use? From whose point of view? The .01% or the 10%?
You end up looking at the events that were involved. Did one person cross
a center line? Did one person fall asleep? Was one person driving
completely within the law (what are the odds of that)? If everyone speeds,
how much can we say that speeding actually contributes to accidents?
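
For what it's worth, the two-party arithmetic quoted above can be sketched in a few lines. This is only my reading of the quoted description (driver's blame = 1 minus the baseline chance of being hit that night); the function name `driver_blame_share` is made up for illustration and is not from the original proposal.

```python
def driver_blame_share(p_hit: float) -> float:
    """Share of blame assigned to the drunk driver.

    Assumption: blame is 1 minus the baseline probability that the
    victim gets hit by a drunk driver on that kind of night.
    """
    if not 0.0 <= p_hit <= 1.0:
        raise ValueError("p_hit must be a probability between 0 and 1")
    return 1.0 - p_hit


# The two scenarios from the quoted example.
normal_night = driver_blame_share(0.0001)   # 0.01% chance of being hit
football_night = driver_blame_share(0.05)   # 5% chance after a big win

print(f"normal night:   {normal_night:.2%}")
print(f"football night: {football_night:.2%}")
```

Note that the only input is the victim's probability, which is exactly the point of view question raised above.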
> Came out about 2-3 weeks ago, Rob Neyer mentioned in his espn.com column
> that he was reading it. I have not seen many reviews of it yet.
I saw a couple at Amazon. One review didn't like it. One did.
> > > So, when comparing players' careers, the long-lasting non-spectacular
> > > player will have too much of an advantage over the
> > > shorter-lasting-but-better player. This is precisely where value over
> > > replacement player comes in so handy.
> > Actually, James handled this. He made some comparison of DiMaggio to
> > Staub. The two have similar career numbers of win-shares, but DiMaggio had
> > higher numbers at every age of his career, missing time due to the war and
> > having a shorter career. So James did something like summing the
> > win-shares over 5 consecutive years or adding the top 3 season
> > values. It's a sensible but not unique way to handle it.
> Yes, that's one of my big problems with win-shares: because it so
> obviously over-rewards longevity, instead of excellence, James has to
> dream up those two ad hoc supplementary measures (5 best consecutive years
> and 3 peak years) to make up for it. And I'm not sure if he truly used
> all three measures in a formula or just looked at all three and made a
> subjective seat-of-the-pants judgement on what they "should" add up to.
> To his credit he did explicitly mention that he also had a final subjective
> factor, unquantifiable and subjective -- that part is fine, I think any
> ranking of players should include those subjective adjustments. But win
> shares is clearly inadequate as a quantitative measure because even before
> he reaches that subjective final adjustment, he has to make made-up,
> non-unique, ad hoc, out-of-thin-air adjustments such as his 5-consecutive
> and 3-peak measures.
Having been in consulting for so long, I guess I'm used to hearing about
problems with kluged solutions....
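
To make the two supplementary measures concrete, here is a rough sketch of "top 3 seasons" and "best 5 consecutive years" as described in the quoted text. I don't know how James actually combines them with the career total, so this computes only the two pieces; the function names and the season values are invented for illustration.

```python
def top_n_total(season_values, n=3):
    """Sum of the n best seasons, regardless of when they occurred."""
    return sum(sorted(season_values, reverse=True)[:n])


def best_consecutive_total(season_values, window=5):
    """Largest sum over any run of `window` consecutive seasons."""
    if len(season_values) <= window:
        return sum(season_values)
    return max(
        sum(season_values[i:i + window])
        for i in range(len(season_values) - window + 1)
    )


# Hypothetical career, one win-share value per season in order.
career = [10, 30, 25, 5, 40, 20, 15]

print("career total:      ", sum(career))
print("top 3 seasons:     ", top_n_total(career))
print("best 5 consecutive:", best_consecutive_total(career))
```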
> A truly valid quantitative measure, such as value over replacement, has
> the correct balance between longevity and excellence already built in.
What is this solution? You just say that the replacement value player
contributes, say, 10 win-shares per season (or it varies through time,
whatever), then subtract that off of every year that the players are in the
league before summing? I know James has done that before on other
studies. I wonder why he wouldn't have done it here.
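
Here is a minimal sketch of what I mean by subtracting a replacement level before summing. The 10 win-shares figure is just the placeholder from my question, and the two careers are made-up numbers chosen to have the same raw total; nothing here is James's actual method.

```python
def career_over_replacement(season_win_shares, replacement_level=10.0):
    """Sum of (season win shares - replacement level) over a career.

    Assumption: a constant replacement level per season; it could
    just as well vary by year.
    """
    return sum(ws - replacement_level for ws in season_win_shares)


# Two hypothetical careers with identical raw totals (300 win shares).
long_and_steady = [15] * 20   # 20 seasons at 15 win shares
short_and_great = [30] * 10   # 10 seasons at 30 win shares

print(sum(long_and_steady), sum(short_and_great))
print(career_over_replacement(long_and_steady))
print(career_over_replacement(short_and_great))
```

With the replacement level taken out, the shorter but better career comes out ahead even though the raw totals tie, which is the longevity-versus-excellence balance being discussed.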
> > > The bits of the formula for win shares that he reveals in the
> > > Abstract look rather weird, some stuff about counting marginal
> > > contributions (he does use the word "marginal") but he defines it in
> > > some weird way, something like only counting runs above half the mean,
> > > something bizarre like that.
> > I saw that quick explanation he did with teams. I filed it away to be
> > thought of later. I slept on it and I'm pretty sure I know what he is
> > doing. I think that counting runs above half the mean works out to be
> > something
> > close to a Taylor series approximation to his Pythagorean formula. That
> Hmm, I'm not seeing it. If we let x=our team's runs and y=other team's
> runs and p=probability that we win, then after plugging x and y into the
> Pythagorean formula, for a .500 team (i.e. x=y), I get dp/dx = 1/(2y).
> There's a 1/2 in there all right, but also a 1/y term. Well that was a
> continuous calculation, truthfully runs don't come in infinitesimals and I
> haven't tried to do the finite calculation.
Yeah, well sleeping on it gave me a hint. I can't say I've worked out the
math. A first-order Taylor expansion is like working on the margin of a more complex
function. That seems like what he's doing.
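
As a quick check on the dp/dx = 1/(2y) figure quoted above, here is a small numerical sketch. It assumes the usual exponent-2 Pythagorean formula, p = x^2 / (x^2 + y^2); the function names and the 700-run example are mine. It does not show that this is what James is doing with "half the mean", only that the derivative and the first-order approximation at x = y behave as stated.

```python
def pythagorean(x, y):
    """Pythagorean winning percentage with exponent 2 (assumed)."""
    return x**2 / (x**2 + y**2)


def numeric_slope(x, y, h=1e-3):
    """Central-difference estimate of dp/dx at (x, y)."""
    return (pythagorean(x + h, y) - pythagorean(x - h, y)) / (2 * h)


def linear_approx(x, y):
    """First-order Taylor expansion of p around x = y: 1/2 + (x - y)/(2y)."""
    return 0.5 + (x - y) / (2 * y)


runs_allowed = 700.0  # hypothetical .500 team
print("numeric dp/dx:", numeric_slope(runs_allowed, runs_allowed))
print("1/(2y):       ", 1 / (2 * runs_allowed))

# Ten extra runs scored: compare the exact formula with the linear version.
print("exact: ", pythagorean(710.0, runs_allowed))
print("linear:", linear_approx(710.0, runs_allowed))
```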
> Yes. One thing that I did read, in rec.sport.baseball, is that he
> deliberately chose a win share to represent about 1/3 of a win. Why not
> make a win share equal 1 win, and a fraction of a win share represent a
> fraction of a win? The readers' explanation of James' reasoning sounded
> good to me: if he did that, he'd have to report a lot of .7 or .3 or 2.2
> win share numbers, and he doesn't believe there's that much precision in
> his model or his calculations. By inflating win shares by a factor of
> three (so that one win share represents only 1/3 of a win), he can,
> instead of reporting .7 or .3 or 2.2, report numbers like 2, 1, and 7.
> One significant figure of accuracy, which is all that he feels he could
> legitimately report.
I thought of that and I also thought that he runs into the same thing I run
into -- duplicative methods that all generate some representation of
wins. I have a few different methods that look at the concept in different
ways. I call them all win-loss records of different types and it's
confusing. James already has win-loss records. This new method somewhat
replaces those others. Mine are at least all pretty different concepts.
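
The scaling argument quoted above is easy to see with the numbers from that paragraph. This is just a sketch of the rounding idea (one win share = 1/3 of a win, reported as a whole number); the constant and function name are mine, not anything from James.

```python
WIN_SHARES_PER_WIN = 3  # one win share represents about 1/3 of a win


def to_win_shares(wins):
    """Convert fractional wins to whole-number win shares."""
    return round(wins * WIN_SHARES_PER_WIN)


fractional_wins = [0.7, 0.3, 2.2]
print([to_win_shares(w) for w in fractional_wins])  # [2, 1, 7]
```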