Feb 25, 2004

1. A Kalman Filter can be used to estimate an offensive ability, a

defensive ability, and an overall rank at the same time -- something

that Sagarin does, Massey does, etc. for the BCS. It basically

estimates how many points an offense should score against some

defense, then readjusts based on the actual results of the game. It

does the same for the defense, then defines power by offense minus

defense or divided by or whatever. (What defines the method as a

Kalman filter is a certain use of different stats. My guess is that

none of these guys use a Kalman filter exactly, but they do

continuously estimate and re-estimate -- they use "filters.")
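
A rough sketch of that estimate-then-readjust loop, for anyone who wants to see it in code. Everything specific here is an assumption for illustration and not what Sagarin, Massey, or WINVAL actually do: the scoring model (points = league average + scorer's offense + opponent's defense, with defense measured as points allowed above average, so power is offense minus defense), the constants LEAGUE_AVG, OBS_VAR, and PRIOR_VAR, and the function names. It shows only the measurement update of a Kalman filter; a full filter would also let ratings drift between games.

```python
# Sketch of a Kalman measurement update for offense/defense ratings.
# Assumed model (not from the post):
#   points = LEAGUE_AVG + offense[scorer] + defense[opponent] + noise
# where defense is "points allowed above average" (lower is better).

LEAGUE_AVG = 95.0   # assumed average points per team per game
OBS_VAR = 100.0     # assumed variance of single-game scoring noise
PRIOR_VAR = 25.0    # assumed initial uncertainty in each rating


def make_state(teams):
    """All ratings start at zero with independent prior variance."""
    names = [f"{t}:off" for t in teams] + [f"{t}:def" for t in teams]
    index = {name: i for i, name in enumerate(names)}
    n = len(names)
    x = [0.0] * n
    P = [[PRIOR_VAR if i == j else 0.0 for j in range(n)] for i in range(n)]
    return index, x, P


def update(index, x, P, scorer, opponent, points):
    """Predict the scorer's points, then readjust both ratings in place."""
    n = len(x)
    h = [0.0] * n                      # observation row vector
    h[index[f"{scorer}:off"]] = 1.0
    h[index[f"{opponent}:def"]] = 1.0

    predicted = LEAGUE_AVG + sum(h[i] * x[i] for i in range(n))
    residual = points - predicted

    # P h^T, innovation variance, and Kalman gain
    Ph = [sum(P[i][j] * h[j] for j in range(n)) for i in range(n)]
    s = sum(h[i] * Ph[i] for i in range(n)) + OBS_VAR
    gain = [Ph[i] / s for i in range(n)]

    for i in range(n):
        x[i] += gain[i] * residual
    # P <- P - K h P (P is symmetric, so h P is the transpose of P h^T)
    for i in range(n):
        for j in range(n):
            P[i][j] -= gain[i] * Ph[j]
    return predicted, residual


def power(index, x, team):
    """Overall rating as offense minus defense (points allowed)."""
    return x[index[f"{team}:off"]] - x[index[f"{team}:def"]]


index, x, P = make_state(["A", "B"])
update(index, x, P, "A", "B", 105)   # A scores 105 on B
update(index, x, P, "B", "A", 90)    # B scores 90 on A
print(power(index, x, "A"), power(index, x, "B"))
```

The variances are what set how far a single game moves a rating: a larger OBS_VAR relative to the current uncertainty in P means smaller adjustments.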

I think something more like this was done because of some of the

things that were reported about WINVAL -- that they get offensive and

defensive rankings for all players, plus the overall number you have.

Whether it's most "efficient," (an economist's term) I dunno. I know

that there are references that describe the Kalman as "the optimal

linear filter." Basically, my sense is that both regression and

Kalman Filters use variances to optimize their estimates. The Kalman

does it assuming one model of interaction between the offense and the

defense, estimating both the offense and defense. Regression does it

without the interaction modeled, just viewing the point difference as

The Result and I don't think it can easily back out estimates of

offense and defense. There are advantages to both. I personally do

think that the model of interaction between offense and defense in

most filters is not right (doesn't adequately account for garbage

time), but estimating offensive and defensive ability is important.

(more points on a separate message, since this one is long enough and

the others are less technical...)

DeanO

Dean Oliver

Author, Basketball on Paper

http://www.basketballonpaper.com

"Oliver goes beyond stats to dissect what it takes to win. His breezy

style makes for enjoyable reading, but there are plenty of points of

wisdom as well. This book can be appreciated by fans, players,

coaches and executives, but more importantly it can be used as a text

book for all these groups. You are sure to learn something you didn't

know about basketball here." Pete Palmer, co-author, Hidden Game of

Baseball and Hidden Game of Football

--- In APBR_analysis@yahoogroups.com, "dan_t_rosenbaum"

<rosenbaum@u...> wrote:

> Whenever I bring up WINVAL, DeanO always brings up Kalman Filters.

> Kalman Filters are not heavily used in economics, so I had to do

> some checking of what they are and when they are most applicable.

>

> In situations with a series of estimates, lots of observations, and

> nearly continuous streams of data pouring in, it often is not

> feasible to completely re-estimate a series of equations (or perhaps

> even one equation) when new data comes in. In these situations, it

> would be ideal to have a technique that optimally combines old

> estimates and new data. My understanding is that this is what the

> Kalman Filter does. It is ideal for situations like navigational

> systems where data is plentiful and continuous, and there is a need

> for a constant updating of estimates.

>

> Estimating team ratings or even player ratings does not result in

> the data overload conditions where the Kalman Filter is ideal, and

> thus there really is no advantage to using it in cases where OLS

> (Ordinary Least Squares) would suffice. This is probably a bit of

> oversimplification, but it appears to me that a Kalman Filter is

> likely to be slightly less efficient than OLS, but much, much faster

> and that is why it is indeed optimal in many situations where data

> is plentiful and continuous and time is of the essence.

>

> But in this case, even in my WINVAL regressions that have 30,000

> observations and more than 300 variables, the regressions only take

> seconds to run, so there really is no need to use a Kalman Filter.

>

> In an e-mail, DeanO said "a full Kalman Filter looks at the full

> relationship across individuals. I am not sure you did that. In

> other words, the process I describe in the book, where every segment

> of every game is estimated then re-evaluated (a team of a, b, c, d,

> and e vs f, g, h, i, and j on team 2's home court should win 5

> minutes by 4 but loses by 2, lowers the ratings of all 5, with

> incremental replacements of each player). It's possible you did

> that -- it's just a bunch of matrix inversion. A BUNCH of

> inversion, since that matrix is pretty decently (and irregularly)

> populated and it's probably 10,000... oh, heck."

>

> Yes, the OLS regression that I run does account for all of this.

> After multiplying a 300x30,000 matrix by a 30,000x300 matrix, it

> inverts the resulting 300x300 matrix. Then it multiplies this times

> a 300x1 matrix that was the result of multiplying a 300x30,000

> matrix times 30,000x1 vector. And it does all of this in less than

> six seconds in SAS.

>

> The beta estimates are picking up the effect of adding a certain

> player (versus a replacement player) holding the quality of the

> other players on the floor constant.
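
For reference, the matrix steps in the quoted message (form X'X, invert it, multiply by X'y) are the OLS normal equations, beta = (X'X)^-1 X'y. Below is a toy-sized sketch in plain Python. The design matrix, the margins, and the helper names are all made up for illustration; this is not the WINVAL data or the SAS routine. It solves the linear system by elimination, which gives the same answer as multiplying by the explicit inverse.

```python
# Toy sketch of OLS via the normal equations: beta = (X'X)^-1 X'y.
# Hypothetical design: column 0 is a home-court intercept; columns 1-2 are
# +1 if that player is on the floor for the home side, -1 for the away side,
# 0 if off the floor. The replacement player is the omitted baseline.

def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]


def solve(a, b):
    """Solve a @ beta = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + [b[i][0]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X'X is singular; drop a reference column")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


# Five made-up stints (rows) and their point margins.
X = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
    [1, 0, 0],
    [1, -1, 1],
]
y = [[5.0], [1.0], [4.0], [2.0], [-2.0]]

Xt = transpose(X)
xtx = matmul(Xt, X)      # 3x3, the analogue of the 300x300 matrix
xty = matmul(Xt, y)      # 3x1, the analogue of the 300x1 matrix
beta = solve(xtx, xty)   # home-court effect, then each player's effect
print(beta)
```

Each player coefficient is read as the effect of that player relative to the omitted replacement player, holding the rest of the lineup fixed.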