I would leave out the MatchupP variable, since it is a lot like the
dependent variable. Including it probably increases R-squared a
lot, but probably doesn't do much else. (All in all, it is probably
pretty harmless, since it is unlikely to be correlated with your
other independent variables.)

Another option with your day variable is to enter it as a series of

dummy variables.

DAY0 - equals 1 if 0 days of rest, 0 otherwise

DAY1 - equals 1 if 1 day of rest, 0 otherwise

DAY2 - equals 1 if 2 days of rest, 0 otherwise

DAY3 - equals 1 if 3 days of rest, 0 otherwise

DAY4+ - equals 1 if 4 days or more of rest, 0 otherwise

Then run the regression leaving one of those variables out.

If, for example, you left DAY0 out of the regression, the DAY1

coefficient would give you the effect of playing on one day's rest

versus playing in a back-to-back.

The DAY2 coefficient would give you the effect of playing on two
days' rest versus playing in a back-to-back.

The DAY3 coefficient would give you the effect of playing on three
days' rest versus playing in a back-to-back.

The DAY4+ coefficient would give you the effect of playing on four or
more days' rest versus playing in a back-to-back.
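
For what it's worth, here is a minimal sketch of that dummy-variable setup in Python. The data are made up purely for illustration, the names rest_dummies and ols are my own, and I am assuming days of rest = Days - 1, since the quoted study codes back-to-backs as Days = 1. With DAY0 left out and no other predictors, the intercept is the back-to-back average and each dummy coefficient is the difference from it.

```python
def rest_dummies(days_rest):
    """Return [DAY1, DAY2, DAY3, DAY4+] for one game; DAY0 is the omitted baseline."""
    capped = min(days_rest, 4)  # four or more days of rest share one category
    return [1.0 if capped == k else 0.0 for k in (1, 2, 3, 4)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    n = len(rows[0])
    # Build the augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


# Hypothetical sample: (days of rest, point differential). Not real NBA data.
games = [(0, -4.0), (0, -2.0), (1, 1.0), (1, 3.0), (2, 2.0),
         (2, 4.0), (3, 3.0), (3, 5.0), (4, 2.0), (6, 6.0)]

X = [[1.0] + rest_dummies(d) for d, _ in games]  # intercept + four dummies
y = [diff for _, diff in games]
coefs = ols(X, y)

for name, b in zip(["Constant", "DAY1", "DAY2", "DAY3", "DAY4+"], coefs):
    print(f"{name:<9}{b:8.3f}")
```

In a real run you would add Dist, Home and so on as extra columns of X; the dummy coefficients would then be the rest effects holding those other variables constant.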

--- In APBR_analysis@yahoogroups.com, igor eduardo küpfer
<edkupfer@r...> wrote:
> Whoa! Lotta traffic at apbr_*. To add to the bustle, I post the
> following, an unfinished study of mine on the effects of team travel
> on winning.

>

begin to

> ***

>

> What are the factors in travel that affect team performance? To
> answer this question, I regressed the results of the 2000-01 season
> against four variables:
>
> Days - Number of days between games, e.g. back-to-back games = 1
> Dist - Distance in miles from location of previous game to next
>        game. [1]

> Home - Home court dummy, home court = 1, away game = 0
> MatchupP - Probability of team winning game [2]
>
> I used the points differential for each game as the response.
>
> I made the following assumption: that teams never return home between
> road games; that is, if the Spurs played in Seattle just before the
> All-Star break, and played in Miami just following the break, I used
> the travel distance between Seattle and Miami, even though the Spurs
> likely went back to San Antonio between games. Also, to remove the
> ambiguity of the amount of rest at the beginning of the season, I
> removed each team's first game of the season from my sample.

>
> The results show that only the Matchup Probability and Home/Away
> variables are significant at 5%. (Regression results appended to end
> of this post.) The Days Between Games variable is not significant
> (p = 0.121) but I think that may be an artefact of my sample, because
> I've seen other studies which show a significant relationship.

>
> ***
>
> [1 Distance approximated using the following method:
>
> Distance between cities = sqrt(x * x + y * y)
>
> where x = 69.1 * (lat2 - lat1)
> and y = 69.1 * (lon2 - lon1) * cos(lat1/57.3) ]
>
> [2 Probability of team win calculated using Bill James's log5 method:
>
> Pr(team win) = (A - A * B) / (A + B - 2 * A * B)
>
> where A and B = team A's and team B's winning percentage,
> respectively. For this study, I used Pythagorean winning percentages
> instead.]
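
The two footnote formulas are easy to write out in code. This is only a sketch of what the footnotes state: the function names are mine, and the city coordinates in the usage lines are rough values I am assuming for illustration. The 69.1 is miles per degree of latitude, and dividing by 57.3 converts degrees to radians, so this is a flat-map approximation that gets rougher over long hauls.

```python
import math


def approx_distance_miles(lat1, lon1, lat2, lon2):
    """Flat-map distance from footnote [1]; inputs are in decimal degrees."""
    x = 69.1 * (lat2 - lat1)
    y = 69.1 * (lon2 - lon1) * math.cos(lat1 / 57.3)
    return math.sqrt(x * x + y * y)


def log5(a, b):
    """Footnote [2]: chance that a team with win pct a beats a team with win pct b."""
    return (a - a * b) / (a + b - 2 * a * b)


# Rough coordinates, assumed for illustration only.
seattle = (47.61, -122.33)
miami = (25.76, -80.19)
print(round(approx_distance_miles(*seattle, *miami)), "miles, roughly")

# Two evenly matched teams should come out at 50/50.
print(log5(0.5, 0.5))
```

Note that the formula uses only the starting latitude in the cosine term, so the distance from A to B will not exactly equal the distance from B to A. In the study, a and b would be the Pythagorean winning percentages rather than actual ones.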

>
>
> REGRESSION OUTPUT
>
> The regression equation is
> PtsDiff = - 18.7 + 0.391 Days + 0.000051 Dist + 5.76 Home + 29.9 MatchupP

>

> Predictor        Coef    SE Coef       T      P
> Constant     -18.6980     0.8449  -22.13  0.000
> Days           0.3908     0.2518    1.55  0.121
> Dist        0.0000513  0.0003559    0.14  0.886
> Home           5.7589     0.4832   11.92  0.000
> MatchupP       29.914      1.135   26.35  0.000

>

> S = 11.11 R-Sq = 26.9% R-Sq(adj) = 26.7%
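
To see what the fitted equation implies for a single game, here is a small sketch that plugs values into the coefficients from the table above. The function name and the example game are my own; only the coefficients come from the quoted output.

```python
# Coefficients copied from the quoted regression output.
COEF = {
    "Constant": -18.6980,
    "Days": 0.3908,
    "Dist": 0.0000513,
    "Home": 5.7589,
    "MatchupP": 29.914,
}


def predicted_pts_diff(days, dist, home, matchup_p):
    """Predicted point differential from the fitted equation."""
    return (COEF["Constant"]
            + COEF["Days"] * days
            + COEF["Dist"] * dist
            + COEF["Home"] * home
            + COEF["MatchupP"] * matchup_p)


# Hypothetical game: even matchup, second night of a back-to-back, no travel.
at_home = predicted_pts_diff(days=1, dist=0.0, home=1, matchup_p=0.5)
on_road = predicted_pts_diff(days=1, dist=0.0, home=0, matchup_p=0.5)
print(f"home: {at_home:+.2f}  road: {on_road:+.2f}")
```

The gap between the two predictions is just the Home coefficient, about 5.8 points, while the Dist term barely moves the prediction even over a few thousand miles.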

>

> Analysis of Variance

>

> Source             DF      SS     MS       F      P
> Regression          4  106275  26569  215.13  0.000
> Residual Error   2344  289491    124
>   Lack of Fit    2239  278271    124    1.16  0.158
>   Pure Error      105   11220    107
> Total            2348  395767

>

> 2139 rows with no replicates
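
As a quick sanity check, the summary line (S, R-Sq, R-Sq(adj)) and the F statistic can be recomputed from the sums of squares and degrees of freedom in the ANOVA table. This is only a sketch using the standard textbook definitions; the variable names are mine, and the inputs are the rounded figures as printed.

```python
import math

# Figures copied from the quoted ANOVA table.
ss_regression, df_regression = 106275, 4
ss_residual, df_residual = 289491, 2344
ss_total, df_total = 395767, 2348

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

r_sq = ss_regression / ss_total                        # share of variance explained
r_sq_adj = 1 - ms_residual / (ss_total / df_total)     # penalised for predictors used
s = math.sqrt(ms_residual)                             # residual standard error
f_stat = ms_regression / ms_residual

print(f"S = {s:.2f}  R-Sq = {r_sq:.1%}  R-Sq(adj) = {r_sq_adj:.1%}  F = {f_stat:.2f}")
```

The recomputed values agree with the printed output, so the table looks internally consistent. An S of about 11 points per game is a reminder of how much single-game noise is left over after these four variables.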

>

> --

>

> ed