Loading ...
Sorry, an error occurred while loading the content.

GEOSTATS: r.e. heteroscedasticity of

Expand Messages
    Hello, Here s a post of the many helpful responses I received to a question about heteroscedasticity of my residuals from a regression analysis. What I tried
    Message 1 of 1 , Jun 22, 2000
    • 0 Attachment

      Here's a post of the many helpful responses I received to a question about
      heteroscedasticity of my residuals from a regression analysis. What I tried
      was giving all of my zero values a very tiny, near-zero number proportional
      to the mean value of all of the points in the neighborhood of the zero
      value. I then normal score transformed the data, performed my regression
      and the variography and kriging of the residuals, and then backtransformed
      my final surface of normal scores. My residual plots then did not indicate
      any bias, interestingly, and I couldn't really find any evidence that any of
      my results performed very poorly, as judged by a cross validation procedure.

      I'm currently trying to find some information about potential theoretical
      problems with using normal scored data in this sort of analysis. It seems
      that normal scored data are used in sequential gaussian conditional
      simulation in gslib�It also seems to me that normal scoring is kind of like
      a ranking procedure, not like the lognormal transformation, and thus the
      same problems associated with kriging of lognormal data do not necessarily
      exist. However, there are probably some other theoretical problems with
      using normal scored data that I am not aware of. If anyone has any pointers
      r.e. this, I'd be happy to hear them!

      Again thank you so much for your help! Original message, and then responses
      to follow:

      >I'm wondering if anyone might have some advice on this one: I'm basically
      >trying to model the spatial distribution of the importance of a certain
      >forest type using a combination of multiple linear regression and kriging
      >(~universal kriging). I have wall to wall coverages of the dependent
      >variables: precipitation, slope, elevation, aspect, road density, distance
      >to roads, and various satellite variables (vegetation indices); I have ~700
      >spatially referenced field samples where there is a measurement of the

      >dependent variable: amount of spruce-fir forest type. I transformed all
      >the variables, trying to make them as normal as possible.
      >I've found, using stepwise linear regression, that the best model uses 4 of
      >the dependent variables, and has an r^2 of about .45 (which I found kind of
      >exciting, given the amount of noise in this sort of thing). My intention
      >to then krig the residuals of the estimates from this model, assuming they
      >exhibit spatial autocorrelation, which they do. Adding the estimates from
      >both procedures will hopefully yield me a "better" set of estimates than
      >either procedure alone.
      >My worry, however, is that when I examine the residuals from my multiple
      >linear regression, I find that the plot of the residuals (y axis) vs. the
      >fitted value (x axis) indicate heteroscedasticity (they are more
      >concentrated around 0 at low values of x, and spread out as x increases (a
      >megaphone form)). They are normally distributed around 0, however, and do
      >not show any spatial pattern.
      >I have transformed the heck out of everything, and I have tried in a rather
      >clumsy way to implement weighted least squares regression (in Minitab--the
      >online help is very weak on this!), with poor results (the residual plot
      >remains very much the same). I also removed outliers to the point where I
      >felt a little guilty, but without much impact (although minitab still tells
      >me that there are lots of "unusual observations.....)
      >One clue: the dependent variable has all sorts of zero values--there are
      >about 200 of the 700 that have a measurement of "0 spruce-fir" found at
      >plot. I removed the zero values, then ran the regression on the remaining
      >values, yielding a nearly normal distribution, but the residual plot did
      >change much (still the megaphone shape). I looked at scatterplots of all
      >the independents vs. the dependent, and saw a little evidence of
      >variance in the x across values of y, but it didn't seem dramatic. I also
      >plotted the absolute values of the residuals vs. the independents, and
      >didn't see any crazy relationship in terms of non homogeneous variance..
      >The correlation coefficients of the independents vs. the dependent are all
      >between .3 and .5; the scatterplots are pretty fat...
      >My next course of action is to try doing a principal components analysis on
      >the independent variables, and using a pc in the regression analysis. I
      >also going to look into some sort of nonparametric regression.. I'd really
      >like to just stick with the model I came up with, however....

      >Does anyone have any good ideas? Should I worry about the
      >heteroscedasticity of the data, given my goals (from what I read, it seems
      >like one mostly worries about heteroscedasticity when considering
      >intervals...However, I'd like my predictions not to be biased...)
      >Sorry if this is an inane question!
      >I'll post any responses...
      >Thank you,

      andrew: Let me recommend generalized linear mixed modeling (GLiMM). See
      and Stroup (1997), JABES 2: 157-178. Briefly, this may allow you to model a
      binomial outcome (or other members of hte exponential family) that exhibits
      spatial covariance. I say "may" because you may need to face a few caveats
      you may need to go through a little trial and error. Allison (1999, pp
      in SAS's Logistic Regression text covers some concerns--as does the macro
      documentation (see below). The big deal, however, is that you should be
      able to
      avoid the outlier and transformation approaches you've used to date. If you
      have access to SAS, GLiMM can be performed using the GLIMMIX macro. The
      documentation is provided in the macro and in the PROC MIXED documentation
      (which GLIMMIX calls). The macro is available from www.sas.com and as a
      under STAT.

      Brian Gray


      to me the problem looks like functional misspecification and not like a
      distributional problem. have you tried squares or squesrroots of the driver
      variables and interaction terms? you can either try these things or do a
      test for functional specification or analyse the residual versus the driver
      vareiables. there is a very simple test for functional misspecification: the
      Ramsey test. you calculate the residual and the square it and cube it. add
      the squared and cubed residual to the regression model and see whether the r
      square increases significantly (f-test if you want). i typically do all
      these things iteratively until i am satisfied.

      i would not worry about weighted regression and stuff like that but i would
      develop the model manually instead of stepwise regression. typically one
      needs meny runs=model respecifications to get an understanding of the data.
      i find graphs or tables of lefthand side variables versus drivers always
      very usefull as a first step.

      Otto Hellwig


      A few observations

      1. You mention having transformed "everything" to obtain "normality", I
      would assume that ultimately you want to obtain interpolates for the
      original data (not the transformed data nor the residual data). Simply
      exponentiating will lead to problems. Consider the simpler problem of a
      constant mean (no auxillary variables) and a log transformation for the
      variable of interest. If you model a variogram from the transformed data,
      use the transformed data and this variogram to krige then you will have a
      multiplicative bias if you simply exponentiate the kriged values. Under an
      assumption of multivariate lognormality for the original you can compute
      the bias adjustment (see for example Journel, Math. Geology, circa 1980).
      For any other transformation than logarithmic and without the lognormality
      assumption it is not possible to compute the bias adjustment (even knowing
      the multivariate distribution type and the analytic form of the transform
      is usually not enough to actually do it).

      As an ad-hoc solution (described in Journel and Huigbrechts, 1978)
      for the
      log transform case, compute the mean of the original untransformed data,
      compute the
      mean of the kriged values. Take the quotient, this is your
      bias adjustment.

      Did you transform the original dependent variable data or the residuals? If
      the former then you have real problems, if the later then your problem is
      close at least to that described above.

      You mention universal kriging but don't seem to have used it. Having
      identified the 4 critical auxillary variables you might try UK using
      "external drift". This avoids having to go back and compute the value of
      the regression term at non-data locations.

      Although there are a lot of similarities between regression (multivariate)
      and kriging, the underlying assumptions are quite different. One should be
      a little careful mixing them.

      Donal Myers



      I agree with the post that suggested that you try other (nonlinear) forms
      of regression if your data is not naturally continuous and/or linear.
      Linearizing transforms are often the culprits when residuals are
      heteroskedastic. If you are working with continuous data, which you have
      transformed to obtain normal distributions then the advice below may

      Usually, parametric statistical tests like multiple regression are
      much more robust to non-normality than they are to heterogeniety of variance
      problems, so transforms are generally geared toward creating homoskedastic
      residuals rather than normal data. The fan-shaped or
      megaphone-shaped residual plot you describe suggests that you need to
      transform the data one more time using a log or square root transform.
      (Most stats books on linear regression describe this problem and suggest
      appropriate transforms based on shape of the residual plot.)

      Additionally, multiple regression tests lose power when independent
      variables are correlated, which is why PCA is used to remove correlations
      before doing multiple regression or alternatively, one variable from a set
      of correlated variables is used for the regression test. If the variables
      you are using are correlated, your r^2 may be even better than .45.

      It's also important to remember that step-wise regression may produce a
      "best fit" set of variables that is unique to the particular data set on
      which it is used, but the "best fit" set may not generalize to new data
      sets. Therefore it should generally be used only to give you a feel for
      the data sets and the variables, but you should try to base your published
      work on either established theory or personal experience (your own
      theories) about the variables involved (i.e., try to test a priori models
      rather than post hoc models).

      Joyce Witebsky


      I would suggest to assume Poisson-like distributions for the regression
      residual, which would lead you to fit Generalized Linear (or Additive)
      Models for the trend. This deals well with variances that are
      Proportional to the mean value.

      Residuals from such a regression are non-stationary by nature.
      them is possible, if you assume a non-stationary (mean dependent)
      covariogram, derived from a stationary correllogram. I wrote a paper (+
      software) for
      this for the geostats conference in Cape town, which is in PDF available on
      the web. Follow Conference paper Nr. 4 (Pebesma/Bio/Duin) in my publications
      Edzer Pebesma


      I've done something similar and used blocking or stratification of the
      residuals to deal with heteroscedasticity instead of transformation.
      Transformation is really not advisable because of having to then add the
      kriging variances in the back-transformation. See the pdf file on my
      webpage below.

      Yetta Jager
      WEBpage: http://www.esd.ornl.gov/~zij/


      In OLS regression, the assumption is that you have independent observations.
      Your description gives me the sense that your observations are highly
      autocorrelated. Your correlations are thus invalid. Consider strategies to
      get rid of the spatial autocorrelation. The PCA won't do it since your
      observations are
      not of equal worth.

      Art Getis



      I understand from your email that you are trying to map the
      "amount of spruce fir forest type using wal to wall coverages of
      a number of predictor variables.

      Instead of multiple regression, you might find the predictions
      provided by classification and regression trees (CART) more
      useful. For an example of CART and kriging applied to a
      geological data set, see http://www.isaaks.com, and click on
      EDA (exploratory data analysis). Here, the response variable is
      the amount of copper while the predictors are an assortment of
      geological descriptors.

      Ed Isaaks

      Get Your Private, Free E-mail from MSN Hotmail at http://www.hotmail.com

      *To post a message to the list, send it to ai-geostats@....
      *As a general service to list users, please remember to post a summary
      of any useful responses to your questions.
      *To unsubscribe, send email to majordomo@... with no subject and
      "unsubscribe ai-geostats" in the message body.
      DO NOT SEND Subscribe/Unsubscribe requests to the list!
    Your message has been successfully submitted and would be delivered to recipients shortly.