I am struggling with the following problem: I am trying to model the water
surface in southern Florida with data coming from 200 different water
stations. The area is divided by canals and levees, so there is no real
hydrological connection across the entire area. If I split the area by canals /
levees, there are 9 distinct smaller areas inside which I can assume
hydrological connection, but some of them have too little data for kriging,
for example. So I am trying deterministic methods to see whether I get
anything meaningful. I cannot set aside 10 or 20% of the data for
validation, because the data set is already too small for the
big area I am working on.
Later on I may have some data for comparing known data points with
predictions, but for now I don't have access to this second set
of data, and I have to generate some discussion, if not some results, for the
end-of-year report (as usual). So, suppose I have two interpolation
models, and the only "statistics" I have access to are the cross-validation
mean and root mean square errors. Comparing the cross-validation statistics
of model A and model B, I have: m(A) < m(B) and RMS(A) > RMS(B). Which model
should I expect to be better? If you want to work with numbers, let's
suppose that m(A) = 0.01, RMS(A) = 1.05, and m(B) = 0.04 and RMS(B) = 0.7.
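
To make the comparison concrete, here is a minimal sketch of how the two
statistics relate. It assumes that m is the plain mean of the
cross-validation errors and RMS is the root of their mean square, in which
case RMS^2 = m^2 + s^2, where s is the (population) standard deviation of
the errors. The function names are hypothetical and only for illustration;
this is arithmetic on the numbers above, not a model-selection rule.

```python
import math

def error_spread(mean_error, rms_error):
    """Standard deviation of the CV errors implied by RMS^2 = m^2 + s^2."""
    return math.sqrt(rms_error**2 - mean_error**2)

def bias_share(mean_error, rms_error):
    """Fraction of the mean squared error that comes from the squared mean error."""
    return mean_error**2 / rms_error**2

# Example numbers from the question: (mean error, RMS error) per model.
models = {"A": (0.01, 1.05), "B": (0.04, 0.7)}

for name, (m, rms) in models.items():
    print(name, "spread:", error_spread(m, rms), "bias share:", bias_share(m, rms))
```

Under these assumptions the squared mean error is a very small fraction of
the mean squared error for both models, so the gap between the two RMS
values comes almost entirely from the spread of the errors.
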
Any suggestions will be greatly appreciated.