
repost of Re: GEOSTATS: heteroscedasticity in mult. lin. regression

  • brgray
    Message 1 of 1 , Jun 8, 2000
      For those who received an incomplete copy of my earlier post, here's a repost. I
      will appreciate all comments.
      Brian Gray

      Andrew: Let me recommend generalized linear mixed modeling (GLiMM). See Gotway
      and Stroup (1997), JABES 2: 157-178. Briefly, this may allow you to model a
      binomial outcome (or another member of the exponential family) that exhibits
      spatial covariance. I say "may" because you may need to face a few caveats and you
      may need to go through a little trial and error. Allison (1999, pp. 206-213), in
      SAS's Logistic Regression text, covers some concerns, as does the macro
      documentation (see below). The big deal, however, is that you should be able to
      avoid the outlier and transformation approaches you've used to date. If you have
      access to SAS, GLiMM can be performed using the GLIMMIX macro. The
      documentation is provided in the macro and in the PROC MIXED documentation (which
      GLIMMIX calls). The macro is available from www.sas.com and as a sample under
      STAT. Email or call me if you have questions. Brian Gray
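
A rough illustration of the fixed-effects part of the suggestion above: the sketch below fits a binomial model with a logit link by Newton-Raphson (equivalent to Fisher scoring for this link). It is not the GLIMMIX macro and it includes no spatial covariance at all. The function name `fit_logistic`, the single covariate, and the counts are hypothetical, and it assumes the response can be expressed as a number of "successes" out of a known number of trials per plot, which may not match the spruce-fir measurement.

```python
import math

def fit_logistic(x, successes, trials, iterations=25):
    """Fit logit(p) = b0 + b1*x to binomial counts by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Score vector (g0, g1) and information matrix [[h00, h01], [h01, h11]].
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ni in zip(x, successes, trials):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = ni * p * (1.0 - p)   # binomial variance of the count
            r = yi - ni * p          # observed minus expected count
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g, using the 2x2 inverse.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: a standardized covariate and counts out of 10 per plot.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
successes = [1, 2, 4, 6, 8]
trials = [10] * 5
b0, b1 = fit_logistic(x, successes, trials)
print(f"intercept={b0:.3f} slope={b1:.3f}")
```

The weight `n * p * (1 - p)` is the point of the exercise: the model lets the variance change with the fitted mean instead of requiring a transformation to stabilize it.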

      Bill Thayer wrote:

      > Hello,
      >
      > The beginning of your message was missing, would you mind resending it to
      > the list?
      >
      > At 09:35 AM 6/8/00 -0400, you wrote:
      > >and Stroup (1997), JABES 2: 157-178. Briefly, this may allow you to model a
      > >binomial outcome (or other members of the exponential family) that exhibits
      > >spatial covariance. I say "may" because you may need to face a few caveats
      > >and you may need to go through a little trial and error. Allison (1999, pp
      > >206-213) in SAS's Logistic Regression text covers some concerns--as does the
      > >macro documentation (see below). The big deal, however, is that you should be
      > >able to avoid the outlier and transformation approaches you've used to date.
      > >If you have access to SAS, GLiMM can be performed using the GLIMMIX macro. The
      > >documentation is provided in the macro and in the PROC MIXED documentation
      > >(which GLIMMIX calls). The macro is available from www.sas.com and as a
      > >sample under STAT. Email or call me if you have questions. Brian Gray
      > >
      > >NEFIA NEFIA wrote:
      > >
      > >> Hello list!
      > >>
      > >> I'm wondering if anyone might have some advice on this one: I'm basically
      > >> trying to model the spatial distribution of the importance of a certain
      > >> forest type using a combination of multiple linear regression and kriging
      > >> (~universal kriging). I have wall-to-wall coverages of the independent
      > >> variables: precipitation, slope, elevation, aspect, road density, distance
      > >> to roads, and various satellite variables (vegetation indices); I have ~700
      > >> spatially referenced field samples where there is a measurement of the
      > >> dependent variable: amount of spruce-fir forest type. I transformed all of
      > >> the variables, trying to make them as normal as possible.
      > >>
      > >> I've found, using stepwise linear regression, that the best model uses 4 of
      > >> the independent variables, and has an r^2 of about 0.45 (which I found kind of
      > >> exciting, given the amount of noise in this sort of thing). My intention is
      > >> to then krig the residuals of the estimates from this model, assuming they
      > >> exhibit spatial autocorrelation, which they do. Adding the estimates from
      > >> both procedures will hopefully yield me a "better" set of estimates than
      > >> either procedure alone.
      > >>
      > >> My worry, however, is that when I examine the residuals from my multiple
      > >> linear regression, I find that the plot of the residuals (y axis) vs. the
      > >> fitted values (x axis) indicates heteroscedasticity: the residuals are
      > >> concentrated around 0 at low values of x and spread out as x increases (a
      > >> megaphone form). They are normally distributed around 0, however, and do
      > >> not show any spatial pattern.
      > >>
      > >> I have transformed the heck out of everything, and I have tried in a rather
      > >> clumsy way to implement weighted least squares regression (in Minitab--the
      > >> online help is very weak on this!), with poor results (the residual plot
      > >> remains very much the same). I also removed outliers to the point where I
      > >> felt a little guilty, but without much impact (although Minitab still tells
      > >> me that there are lots of "unusual observations").
      > >>
      > >> One clue: the dependent variable has all sorts of zero values--there are
      > >> about 200 of the 700 that have a measurement of "0 spruce-fir" found at
      > >> plot. I removed the zero values, then ran the regression on the remaining
      > >> values, yielding a nearly normal distribution, but the residual plot did not
      > >> change much (still the megaphone shape). I looked at scatterplots of all of
      > >> the independents vs. the dependent, and saw a little evidence of nonconstant
      > >> variance in the x across values of y, but it didn't seem dramatic. I also
      > >> plotted the absolute values of the residuals vs. the independents, and
      > >> didn't see any crazy relationship in terms of nonhomogeneous variance.
      > >> The correlation coefficients of the independents vs. the dependent are all
      > >> between 0.3 and 0.5; the scatterplots are pretty fat...
      > >>
      > >> My next course of action is to try doing a principal components analysis on
      > >> the independent variables, and using a pc in the regression analysis. I was
      > >> also going to look into some sort of nonparametric regression. I'd really
      > >> like to just stick with the model I came up with, however...
      > >>
      > >> Does anyone have any good ideas? Should I worry about the
      > >> heteroscedasticity of the data, given my goals? From what I read, it seems
      > >> like one mostly worries about heteroscedasticity when considering confidence
      > >> intervals. However, I'd like my predictions not to be biased.
      > >>
      > >> Sorry if this is an inane question!
      > >>
      > >> I'll post any responses...
      > >>
      > >> Thank you,
      > >> Andrew
      > >>
      > >> ________________________________________________________________________
      > >> Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com
      > >>
      > >> --
      > >> *To post a message to the list, send it to ai-geostats@....
      > >> *As a general service to list users, please remember to post a summary
      > >> of any useful responses to your questions.
      > >> *To unsubscribe, send email to majordomo@... with no subject and
      > >> "unsubscribe ai-geostats" in the message body.
      > >> DO NOT SEND Subscribe/Unsubscribe requests to the list!
      > >
      > >--
      > >****************************************************************
      > >* Brian R. Gray
      > >* Department of Epidemiology and Biostatistics
      > >* School of Public Health
      > >* University of South Carolina
      > >* Columbia, SC 29208
      > >* phone (803) 777-1765; fax (803) 777-8769; email brgray@...
      > >****************************************************************
      > >
      > >
      >
      > **************************************************
      > William C. Thayer, P.E.
      >
      > Environmental Science Center
      > Syracuse Research Corporation
      > 6225 Running Ridge Road
      > North Syracuse, NY 13212-2510
      > phone: (315) 452-8424
      > fax: (315) 452-8090
      > email: thayer@...
      > **************************************************
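
The megaphone pattern described in the quoted question can also be checked numerically rather than by eye. The sketch below sorts observations by fitted value, splits them into equal-sized bins, and reports the residual standard deviation in each bin. The function name `residual_spread_by_fitted`, the choice of three bins, and the synthetic data are all assumptions made for illustration; none of this comes from the original analysis.

```python
import statistics

def residual_spread_by_fitted(fitted, residuals, n_bins=3):
    """Return the residual standard deviation within each bin of fitted values.

    Observations are sorted by fitted value and split into n_bins groups of
    (nearly) equal size, ordered from lowest to highest fitted value.
    """
    pairs = sorted(zip(fitted, residuals))
    size = len(pairs) // n_bins
    spreads = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any leftover observations.
        end = (b + 1) * size if b < n_bins - 1 else len(pairs)
        spreads.append(statistics.pstdev([r for _, r in pairs[start:end]]))
    return spreads

# Synthetic example: residual magnitude grows with the fitted value.
fitted = [float(i) for i in range(30)]
residuals = [(-1) ** i * 0.1 * (1 + i) for i in range(30)]
spreads = residual_spread_by_fitted(fitted, residuals)
print(spreads)
```

Spreads that rise steadily from the first bin to the last are consistent with variance that increases with the mean. The inverse of the squared bin spreads is one common, rough starting point for weights in a weighted least squares fit.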


