
Summary: Using SPSS or Statistica for mixed modeling

  • Will Hopkins
    Message 1 of 1 , Jul 28, 2006
      I had a few promising responses to my request for info about the use of SPSS or Statistica for mixed modeling, and I have followed them up and reached a decision. 

      First the responses...

      Frank Verducci <verducci@...>: I have not used the mixed model in Statistica, but the interface for version 6.0 is easy for students to learn.  I have used Statistica in the classroom for the past 10 years with great student success, and the price is reasonable for such a statistical package.  http://www.statsoft.com/

      Pete of Vegan Bodybuilding <pete@...>: Here's a group for statistica users (never been on it, but it might offer you some advice???)  http://groups.yahoo.com/group/statistica/

      Conrad c_earnest_57 <c_earnest_57@...>: Have you ever looked at JMP? It is produced by SAS but the user interface is much friendlier.  It is also nice b/c it works across PC and Mac platforms.  If I am not mistaken I believe that it also has the option for adding coding; however, the user interface is more SPSS'esque. http://www.jmp.com

      Now the follow-up...

      I spent several hours with the mixed model in SPSS version 14.  No good.  Apart from the limited functionality of the random effects, it would not do simple difference-in-the-changes or other customized estimates from the group*time fixed-effect interaction in a controlled trial.  Adding in another between-subject effect (sex*group*time) was right out of the question, so you can't estimate the difference in the effect of a treatment between males and females.  Incredible but true.  No wonder people don't get much further than statistical significance of the interaction.

      You can't download a free-trial version of Statistica without a lot of hassles, so I contacted the local agent and had a good hour with him and his statistician driving the latest version 7.  It has quite a good interface, but the random effects were even more limited than in SPSS, and there was the same limitation in dialing up customized estimates, at least in the mixed model.  There appeared to be more functionality for estimates in the usual general linear model, but that's not what I wanted.

      I downloaded SAS's JMP (free 30-day trial) and had a good shot at that.  I had already experienced great disappointment a year or so ago with the windows-menu version of the main SAS package, the so-called Enterprise Guide.  Incredible as it may seem for a package that is so expensive, the mixed model in the Guide platform was dysfunctional.  It may or may not still be, but it's out of the question because of the expense.  JMP is a LOT cheaper, and it is at least an honest attempt at a new jargon-free view of statistics.  But again, you can't dial up the customized estimates that you need routinely for controlled trials.  I really did try, guys.  There are two routes: via the parameter estimates for the model, and via least-squares means.  Well, the parameter estimates are impossibly complicated, because the modeling works properly only if you include all main effects and interactions less than the full sex*group*time interaction, but to combine all those parameters to get the difference between sexes in the difference between groups in the post-pre change is beyond my capabilities, so I can't expect other folks to do it.  The least-squares-means route was more straightforward, but when you combine the levels you want, an inappropriate constant divisor is introduced that you can't suppress.  For example, when you dial up post-pre for control-exptal, you get half the correct answer!  You actually see the 0.5 appear.  I was using data that I had generated with known effects and that I analyzed in the full SAS package, which gave the right answer.  Goodness knows what JMP would give if you wanted to dial up something like a post value minus the mean of two pre values for the exptal minus the control for females minus males.  JMP was more powerful than SPSS and Statistica for specifying random effects, but there was no option for different estimates in different groups.  
      I was hoping I might be able to access and tweak the command script in JMP to get the right estimates for fixed effects and more flexible random effects, but JMP's script is nothing like the Fortran/Basic of the main SAS package and it is untweakable, by me anyway.
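
For reference, the difference in the changes discussed above can be written in terms of the four group-by-time cell means. The symbol \mu and its subscripts are notation assumed here for illustration, and the sign convention (exptal minus control) is arbitrary:

```latex
\Delta = \left(\mu_{\text{exptal,post}} - \mu_{\text{exptal,pre}}\right)
       - \left(\mu_{\text{control,post}} - \mu_{\text{control,pre}}\right)
```

The contrast coefficients on the four cell means are +1, -1, -1, +1. If each coefficient were rescaled to +0.5 or -0.5, the result would be \Delta/2, which would be consistent with the halved answer and the visible 0.5 described above. Whether that is what JMP does internally is a guess, not something the post establishes.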

      And my decision...

      Well, this was all about two things.  First, I wanted to be able to estimate different error variances and other random effects in different groups.  These random effects do differ between groups (repeat after me: there is no such thing as a null effect), and the differences are informative.  Second, I wanted to introduce folks to covariates: to control for something that isn't equal between the groups and thereby avoid confounding, to account for individual differences (in non-experimental studies) and individual responses (in experiments), and thereby also to improve the precision of all the estimates. 

      I now think I can see how to do all these things reasonably well and conceptually simply with a spreadsheet.  It won't be mixed modeling as such, but for most data it will be as robust, and it should be less likely to result in folks choosing the wrong model and making the wrong adjustments for covariates.  I can also build in all the necessary square roots of variances and any transformations and back-transformations of everything.  And I can build in standardized magnitudes and chances of benefit and harm, something that the packages won't be doing anytime soon. 

      For those who understand this stuff enough to give me feedback (please do), here's the approach I am proposing with a spreadsheet.  These analyses always boil down to a comparison of the mean in two groups.  (For experiments or other repeated measurements on each subject it is the comparison of the mean of the change scores.)  So, first model the covariate(s) in each group separately, using simple or multiple linear regression.  Estimate the effect at some appropriate value of each covariate in each group, along with its standard error.  Then estimate the confidence limits of the difference of the estimated effects using the Satterthwaite approximation.  By analyzing the covariate separately in each group in this manner, you automatically allow for a different error variance in each group.  The difference between the error variances gives the individual differences or responses, controlling for the covariate.  The difference in the estimate of the effect of the covariate in the two groups represents the extent to which the covariate accounts for individual differences or responses in the mean between the groups.  The Satterthwaite approximation gives confidence limits for this estimate, too. 
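
The steps above can be sketched in Python rather than a spreadsheet. This is a minimal illustration under stated assumptions, not the author's spreadsheet: it handles a single covariate with simple linear regression, the function names (`fit_at`, `t_cdf`, `t_quantile`, `compare_groups`) are hypothetical, the default 95% confidence level is an assumption, and the t quantile is found numerically so that only the standard library is needed.

```python
import math

def fit_at(x, y, x0):
    """Simple linear regression of y on x within one group.
    Returns (predicted mean at covariate value x0, its standard error, residual df)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    df = n - 2
    sse = sum((yi - ybar - slope * (xi - xbar)) ** 2 for xi, yi in zip(x, y))
    s2 = sse / df  # this group's own error variance
    se = math.sqrt(s2 * (1 / n + (x0 - xbar) ** 2 / sxx))
    return ybar + slope * (x0 - xbar), se, df

def t_cdf(t, df, steps=2000):
    """CDF of Student's t for t >= 0, by Simpson's rule on the density."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = t / steps
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3

def t_quantile(p, df):
    """Quantile of Student's t for 0.5 < p < 1, by bisection on t_cdf (searches 0..100)."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def compare_groups(ctrl, expt, x0, level=0.95):
    """ctrl and expt are (x, y) pairs of lists, y being the (change) score.
    Returns the expt-minus-ctrl difference at covariate value x0, its SE,
    the Satterthwaite degrees of freedom, and the confidence limits."""
    m1, se1, df1 = fit_at(ctrl[0], ctrl[1], x0)
    m2, se2, df2 = fit_at(expt[0], expt[1], x0)
    diff = m2 - m1
    se = math.sqrt(se1 ** 2 + se2 ** 2)
    # Satterthwaite approximation to the df of the combined standard error
    df = (se1 ** 2 + se2 ** 2) ** 2 / (se1 ** 4 / df1 + se2 ** 4 / df2)
    t = t_quantile(0.5 + level / 2, df)
    return diff, se, df, (diff - t * se, diff + t * se)
```

Calling `compare_groups((x_ctrl, y_ctrl), (x_expt, y_expt), x0)` gives the difference between groups at the chosen covariate value. Because each group is fitted separately, each contributes its own error variance, which is the point of the approach described above.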

      Having to choose an "appropriate value" of the covariate forces you to think about exactly what the covariate is doing.  Stats packages control for a covariate by estimating its effect at its mean value over all subjects, the result being the so-called least-squares or marginal mean.  But that mean is often not appropriate.  For example, if you have more males than females in the sample and there is a substantial difference in the effect of the covariate in males and females, the mean effect adjusted for the covariate is at a value of the covariate closer to the males' mean value of the covariate, when really it should be at the mean of the males' and females' means.  In any case, you can also estimate the difference between the groups at different values of the covariate, which amounts to what we could call "controlling for different values of the covariate".
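
A small Python sketch of the point about unequal numbers of males and females. The function names and the covariate values are invented for illustration only:

```python
def pooled_mean(males, females):
    """Mean of the covariate over all subjects (the value a package would use)."""
    return (sum(males) + sum(females)) / (len(males) + len(females))

def mean_of_means(males, females):
    """Mean of the two sex-specific means, giving each sex equal weight."""
    return (sum(males) / len(males) + sum(females) / len(females)) / 2

# Hypothetical covariate values: six males (mean 80) and two females (mean 60).
males = [80, 82, 78, 80, 80, 80]
females = [60, 60]

print(pooled_mean(males, females))    # 75.0, pulled towards the males' mean
print(mean_of_means(males, females))  # 70.0, midway between the two means
```

With three times as many males as females, the pooled mean sits three-quarters of the way towards the males' mean, whereas the mean of the means sits midway between the sexes.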

      Meanwhile, if anyone out there is using another stats package that addresses some of these issues in a manner friendly for inexperienced users, please contact me.   I'd still rather use a package, and as far as I can see you have to use a package to include covariates in the generalized linear modeling that you need for dependent variables representing counts.

      Will

      Will G Hopkins, PhD FACSM
      Work +64 9 921 9793, Fax +64 9 921 9960
      Home +64 9 376 0198, Cell +64 27 427 2518
      Health Science/Sport and Recreation
      AUT University
      Private Bag 92006, Auckland 1020, New Zealand
      will@...
      Statistics: http://newstats.org
      Sportscience: http://sportsci.org
      ---------------------------------
      Be creative: break rules.
