4162 Re: (unknown)

  • Dean Oliver
    May 31, 2004
      --- In APBR_analysis@yahoogroups.com, igor eduardo küpfer
      <edkupfer@r...> wrote:
      > Okay, I ran the test again, this time using 03-04 results. Before I
      > show you what I got, let me address a couple of things.
      > Dean Oliver wrote:
      > > --- In APBR_analysis@yahoogroups.com, igor eduardo küpfer
      > > <edkupfer@r...> wrote:
      > >>>> I'm not quite sure what endogenous means. If it means being
      > >
      > > I should note that, not being an economist, I like throwing this word
      > > around without as great an appreciation or understanding for it as I
      > > should.
      > Hell, that's nothing. Once during the course of an argument with an
      > ex-girlfriend I used the word "heretofore." I still don't know what it
      > means.

      I've had those moments, often inspired by arguments with soon-to-be
      ex-girlfriends. What the hell is "vis-a-vis"?

      > >
      > >> You'll have to help me out here, as I don't know anything about
      > >> transforming data. Do you mean include Days^2 in addition to Days or
      > >> instead of Days? I did both, and here's how they turned out:
      > >
      > > Include both, which you did in the first set below. Looks like it got
      > > you significant on both days and days^2. And the signs are as
      > > expected. It suggests optimal rest at about 3 days, longer than the 2
      > > days we saw before. (Potentially important for the talk about rust vs
      > > rest, esp if the Lakers wrap up on M.)
      > Questions: I don't understand a couple of things about the squared
      > term. How did you know that squaring the Days variable would give a
      > better fit? And, just exactly how does it suggest the optimal 3 day
      > rest?

      I didn't _know_ it would give a better fit. I hoped it would because
      of what we were observing -- that there was an optimal number of days
      off. A linear Days term alone can only slope one way, so the only way
      to get an optimum out of a regression is to throw in higher order
      terms. Usually a squared term is plenty. It doesn't answer the bigger
      question of whether teams get rusty, though. It suggests an answer
      (another lesson in how to lie with statistics), but one that I
      wouldn't trust from this study.

      Look at the results of your regression. Take just the Days and Days^2
      coefficients and calculate the marginal net points those terms
      contribute for Days = 1, 2, 3, 4, etc. You'll see a max at 3.
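
      A minimal sketch of that calculation. The coefficients of the
      earlier regression are not quoted in this message, so this uses the
      03-04 Days and Days^2 coefficients from the table at the end of the
      post (an assumption about which numbers to plug in). Neither
      coefficient is significant in that sample, so treat the output as
      arithmetic, not a finding. The function and variable names are
      illustrative.

```python
# Coefficients on Days and Days^2, copied from the 03-04 regression
# quoted at the end of this message (not the earlier regression).
B_DAYS = 0.7221
B_DAYS_SQ = -0.1216


def days_contribution(days):
    """Net points contributed by the Days and Days^2 terms alone."""
    return B_DAYS * days + B_DAYS_SQ * days ** 2


# Evaluate the two terms for Days = 1..6 and find the best integer value.
contributions = {d: round(days_contribution(d), 3) for d in range(1, 7)}
best_days = max(contributions, key=contributions.get)

# A quadratic b1*x + b2*x^2 with b2 < 0 peaks at x = -b1 / (2*b2).
vertex = -B_DAYS / (2 * B_DAYS_SQ)

for d, pts in contributions.items():
    print(f"Days = {d}: {pts:+.3f} points")
print("best integer Days:", best_days)
print("continuous optimum:", round(vertex, 2))
```

      With these coefficients the peak also lands at Days = 3 (the
      continuous optimum is just under 3).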

      > > Let me also ask -- is Days = 0
      > > if a team plays back to back nights or is that Days = 1?
      > >
      > The latter. I am subtracting game dates from each other.

      So, with back-to-backs coded as Days = 1, the max at Days = 3 means
      2 days of rest is optimal.
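
      To make the coding explicit, here is a small sketch of the
      date-subtraction convention described above. The function names and
      the example dates are made up for illustration.

```python
from datetime import date


def days_between(previous_game, current_game):
    """Days as coded in the regression: the difference of the game dates."""
    return (current_game - previous_game).days


def rest_days(previous_game, current_game):
    """Full days off between games: one less than the date difference."""
    return days_between(previous_game, current_game) - 1


# Hypothetical dates: a back-to-back, then a game three days after the last.
back_to_back = (date(2004, 1, 10), date(2004, 1, 11))
three_apart = (date(2004, 1, 10), date(2004, 1, 13))

print(days_between(*back_to_back), rest_days(*back_to_back))
print(days_between(*three_apart), rest_days(*three_apart))
```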

      > > I'm sure there are other ways to manipulate things, but this looks
      > > like a pretty good thing. I'm saving it.
      > >
      > > Home is a binary 1/0 indicator for home/road, resp?
      > Yes.
      > Okay. Here are the results for 03-04. For the Matchup Probability, I
      > [used] the team records heading into the game. For example, for two
      > teams [playing] their first games of the season, I would use 0-0
      > records for each team in my probability calculation.

      I was curious to see how you handled the early games of the season,
      especially the times where one team was undefeated. It looks like you
      used Pythagorean projections, rather than real records anyway. That
      helps. But 0-0 usually requires some other assumption, like a
      Bayesian prior that carries through the first few games.
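
      One simple way to do that kind of prior, sketched under assumptions:
      pad every record with a fixed number of phantom games at .500, so a
      0-0 team starts at .500 and the padding fades as real games pile up.
      The function name, the 10-game weight and the .500 prior are all
      illustrative choices, not anything used in the study.

```python
def shrunk_win_pct(wins, losses, prior_games=10.0, prior_pct=0.5):
    """Winning percentage pulled toward prior_pct by prior_games phantom games.

    prior_games and prior_pct are illustrative assumptions; a larger
    prior_games makes early-season records count for less.
    """
    phantom_wins = prior_pct * prior_games
    return (wins + phantom_wins) / (wins + losses + prior_games)


print(shrunk_win_pct(0, 0))    # no games played: falls back to the prior
print(shrunk_win_pct(3, 0))    # undefeated after 3 games, but not 1.000
print(shrunk_win_pct(40, 10))  # later in the season the record dominates
```

      The shrunk percentages could then feed whatever matchup probability
      formula is already in use.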

      > Interestingly, this doesn't seem to affect the regression results
      > too much. The effect of Days between games is reduced in this
      > sample. Weird.

      Not sure what to make of that weakening of the Days effect. What was
      the R^2 of the previous version? We may have to improve the prior
      matchup probability to get back a reasonable estimate of the value
      of Days. If you just look at games beyond the first 20 in the
      season, does R^2 get better and does Days become more significant?

      > PtsDiff = - 13.6 + 7.31 Home + 0.000027 Distance + 18.1 WinProb
      >           + 0.722 Days - 0.122 Days^2
      >
      > Predictor       Coef    SE Coef       T      P
      > Constant     -13.582      1.173  -11.58  0.000
      > Home          7.3056     0.5010   14.58  0.000
      > Distance   0.0000269  0.0003734    0.07  0.943
      > WinProb       18.054      1.163   15.53  0.000
      > Days          0.7221     0.7202    1.00  0.316
      > Days^2       -0.1216     0.1138   -1.07  0.286
      >
      > S = 11.48   R-Sq = 16.7%   R-Sq(adj) = 16.6%
      >
      > Analysis of Variance
      > Source            DF      SS     MS      F      P
      > Regression         5   62072  12414  94.19  0.000
      > Residual Error  2343  308806    132
      > Total           2348  370877
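
      For anyone reading the quoted output, here is a short sketch of how
      its summary numbers follow from the table itself: T is Coef / SE
      Coef, R-Sq is the regression SS over the total SS, F is the ratio of
      the mean squares, and S is the square root of the residual mean
      square. All inputs are copied from the quoted output; the variable
      names are illustrative, and small mismatches come from the rounding
      in the printed values.

```python
# (Coef, SE Coef, reported T) copied from the quoted regression output.
rows = {
    "Constant": (-13.582, 1.173, -11.58),
    "Home": (7.3056, 0.5010, 14.58),
    "Distance": (0.0000269, 0.0003734, 0.07),
    "WinProb": (18.054, 1.163, 15.53),
    "Days": (0.7221, 0.7202, 1.00),
    "Days^2": (-0.1216, 0.1138, -1.07),
}

# T statistic is the coefficient divided by its standard error.
t_calc = {name: coef / se for name, (coef, se, _) in rows.items()}

# Analysis-of-variance figures from the same output.
ss_reg, ss_res, ss_tot = 62072, 308806, 370877
df_reg, df_res = 5, 2343

r_sq = ss_reg / ss_tot                          # share of variance explained
f_stat = (ss_reg / df_reg) / (ss_res / df_res)  # MS regression / MS residual
s = (ss_res / df_res) ** 0.5                    # residual standard error

for name, (_, _, t_reported) in rows.items():
    print(f"{name:9s} T = {t_calc[name]:7.2f} (reported {t_reported:.2f})")
print(f"R-Sq = {100 * r_sq:.1f}%  F = {f_stat:.2f}  S = {s:.2f}")
```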


      Dean Oliver
      Author, Basketball on Paper
      "Excellent writing. There are a lot of math guys who just rush from
      the numbers to the conclusion. . .they'll tell you that Shaq is a real
      good player but his team would win a couple more games a year if he
      could hit a free throw. Dean is more than that; he's really
      struggling to understand the actual problem, rather than the
      statistical after-image of it. I learn a lot by reading him." Bill
      James, author Baseball Abstract