[ai-geostats] Suggestions on geo-spatial modeling in insurance
I am working with a very interesting set of data on individual insurance claim settlements related to auto accidents. I was hoping that ai-geostats list participants might point me towards some ways to think about the model described below.
The data in my sample provide a number of covariates but, unfortunately, very limited geographic detail: for each accident recorded in the data set, the finest level of location information is the 5-digit zip code. The variable I wish to explain, the dollar value of the claim settlement, is strictly positive for this sample and shows very strong evidence of overdispersion (that is, the variance of the data increases with the mean). As a result I have been using a generalized linear model in which the response is assumed to follow a gamma distribution ("gamma regression"). To date I have not made any use of the available zip code information. The results without any geospatial effects have so far been quite encouraging. (Note: I have been using SAS for estimation.)
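For readers less familiar with gamma regression, here is a minimal sketch of the kind of model meant, written in plain Python rather than SAS purely for illustration. It assumes a log link and a single covariate; the function name fit_gamma_log_glm and the toy numbers are hypothetical and are not taken from the actual claims data or model.

```python
import math

def fit_gamma_log_glm(x, y, n_iter=25, tol=1e-10):
    """Fit log(E[y]) = b0 + b1 * x for a gamma response by IRLS.

    With the gamma variance function V(mu) = mu^2 and a log link, the
    IRLS weights are constant, so each step is an ordinary least-squares
    fit of the working response z on x.
    """
    if any(v <= 0 for v in y):
        raise ValueError("gamma regression needs strictly positive responses")
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    eta = [math.log(v) for v in y]  # starting values: linear predictor = log(y)
    b0 = b1 = 0.0
    for _ in range(n_iter):
        mu = [math.exp(e) for e in eta]
        # Working response for the log link: z = eta + (y - mu) / mu
        z = [e + (yi - mi) / mi for e, yi, mi in zip(eta, y, mu)]
        z_bar = sum(z) / n
        new_b1 = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
        new_b0 = z_bar - new_b1 * x_bar
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        eta = [b0 + b1 * xi for xi in x]
        if converged:
            break
    return b0, b1

# Hypothetical toy data: settlements that grow exponentially with one covariate.
toy_x = [0.0, 1.0, 2.0, 3.0, 4.0]
toy_y = [math.exp(1.0 + 0.5 * xi) for xi in toy_x]
b0, b1 = fit_gamma_log_glm(toy_x, toy_y)
```

The log link is only one common choice; it keeps fitted settlements positive and makes coefficients read as multiplicative effects on the mean claim.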
Now I would like to expand the gamma regression to include the available spatial characteristics. My ultimate aim is to explore the extent to which including information on the location of an accident, after controlling for all other factors, contributes to the explanatory and ultimately predictive power of the model.
One obvious way to do this is simply to create a series of binary (0/1) variables corresponding to all of the zip codes in my sample, and add this set to the current covariates. Another way would be to use some form of geographically weighted regression. I would be very interested in hearing opinions on other ways I could incorporate the available information on zip codes into the model. I apologize for the rather general nature of this inquiry.
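To make the first option concrete, the binary (0/1) zip code variables could be built roughly as follows. This is only a sketch in plain Python: the helper name zip_indicators and the example zip codes are hypothetical. One zip code is held out as a reference level, on the assumption that the model already has an intercept, since a full set of indicators would then be collinear with it.

```python
def zip_indicators(zip_codes, reference=None):
    """Return (columns, rows): one 0/1 column per zip code except the reference."""
    levels = sorted(set(zip_codes))
    if reference is None:
        reference = levels[0]  # assumption: first zip code in sorted order is the baseline
    if reference not in levels:
        raise ValueError("reference zip code does not occur in the data")
    columns = [z for z in levels if z != reference]
    # Each row holds a 1 in the column matching that claim's zip code, 0 elsewhere;
    # claims in the reference zip code get a row of all zeros.
    rows = [[1 if z == c else 0 for c in columns] for z in zip_codes]
    return columns, rows

# Hypothetical zip codes, kept as strings so leading zeros are preserved.
claim_zips = ["30301", "30305", "30301", "30309"]
columns, rows = zip_indicators(claim_zips)
```

The resulting columns would simply be appended to the existing covariates; each coefficient then measures a zip code's effect relative to the reference zip code.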