Dear list,

A strange question.

I have to estimate the median of a certain vector Z. This vector grows with time; it is like a box that is being filled continuously. I can compute basic statistics (mean, median, min, max, number of values, standard deviation) but I can't preserve them. I mean that at any moment I can use the data to compute the statistics, but immediately afterwards I have to delete them. So my question is: how can I update the basic statistics in real time, knowing only the basic statistics of the previous step?

For example:

minimum value = min(actual_step) if min(actual_step) < min(prev_step), otherwise min(prev_step)

maximum value = max(actual_step) if max(actual_step) > max(prev_step), otherwise max(prev_step)

mean value = {[mean(prev_step)*#value(prev_step)]+sum(actual_step)}/[#value(prev_step)+#value(actual_step)]

but I can't imagine a similar algorithm for computing the median and the standard deviation...

Any idea?

Thank you

Simone

-----------------------------

Dr. Simone Sammartino

PhD student

- Geostatistical analyst

- G.I.S. mapping

I.A.M.C. - C.N.R.

Geomare-Sud section

Port of Naples - Naples

marenostrum@...

-----------------------------

On 16-Jun-05 Simone Sammartino wrote:
> Dear list,
>
> A strange question.
>
> I have to estimate the median of a certain vector Z. This
> vector grows with time; it is like a box that is being filled
> continuously.
> I can compute basic statistics (mean, median, min, max,
> number of values, standard deviation) but I can't preserve
> them. I mean that at any moment I can use the data to compute
> the statistics, but immediately afterwards I have to delete
> them. So my question is: how can I update the basic statistics
> in real time, knowing only the basic statistics of the
> previous step?
>
> For example:
>
> minimum value = min(actual_step) if
> min(actual_step) < min(prev_step), otherwise min(prev_step)
>
> maximum value = max(actual_step) if
> max(actual_step) > max(prev_step), otherwise max(prev_step)
>
> mean value = {[mean(prev_step)*#value(prev_step)] +
> sum(actual_step)}/[#value(prev_step)+#value(actual_step)]
>
> but I can't imagine a similar algorithm for computing the
> median and the standard deviation...
>
> Any idea?
>
> Thank you
>
> Simone

The standard deviation is easy: you need to construct the

sum of the data and the sum of squares of the data at the

previous step and at the current step.

That is:

Previous step:

sum(data) = mean(data) * number(data)

sum(data^2) = (number(data)-1) * SD(data)^2 + number(data) * mean(data)^2

Current step:

Add new X to sum(data), (new X)^2 to sum(data^2),

and 1 to number(data), then re-compute the SD:

SD(data)^2 = [sum(data^2) - sum(data)^2/number(data)] / (number(data)-1)
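
To make the bookkeeping concrete, here is a minimal sketch in Python of the update just described. The function name update_summary, its arguments and the example numbers are my own illustration, not anything from your setup, and I assume that at least two values have already been seen, so that the previous SD is defined.

```python
import math
import statistics


def update_summary(n, mean, sd, new_x):
    """Fold one new value into (n, mean, sd) using only the previous summary.

    Assumes n >= 2 so that the previous sample SD is defined.
    """
    # Previous step: recover the running sums from the stored summary.
    total = mean * n
    total_sq = (n - 1) * sd ** 2 + n * mean ** 2

    # Current step: add the new value to each running quantity.
    n += 1
    total += new_x
    total_sq += new_x ** 2

    # Re-compute the mean and the sample SD from the updated sums.
    new_mean = total / n
    variance = (total_sq - total ** 2 / n) / (n - 1)
    # Guard against a tiny negative variance caused by rounding error.
    new_sd = math.sqrt(max(variance, 0.0))
    return n, new_mean, new_sd


# Illustrative data only: summarise four values, then add a fifth.
data = [2.0, 4.0, 4.0, 5.0]
n, mean, sd = len(data), statistics.mean(data), statistics.stdev(data)
n, mean, sd = update_summary(n, mean, sd, 7.0)
print(n, mean, sd)
```

One caution: the sum-of-squares form can lose precision when the mean is large relative to the SD. If that matters for your data, Welford's one-pass update is a numerically more stable alternative that needs the same stored quantities.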

However, the median is impossible, from the information

you will have in hand. The reason is that, given the previous

median and the new X, you need to know how many of the previously

obtained data values lie between the previous median and the

new X (and the value of at least one of them). You do not have

this information.
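
A small made-up example may make this clearer. The two data sets below are my own invention, chosen only so that they share the same number of values, min, max, mean, median and SD. After the same new value is added to each, their medians differ, so no rule based on those summaries alone can give the right answer for both.

```python
import statistics


def summary(data):
    """Return (count, min, max, mean, median, sample SD) of a data set."""
    return (
        len(data),
        min(data),
        max(data),
        statistics.mean(data),
        statistics.median(data),
        statistics.stdev(data),
    )


# Two different (invented) data sets with identical summary statistics:
# both have sum 35 and sum of squares 255.
a = [0, 1, 4, 5, 7, 8, 10]
b = [0, 2, 3, 5, 6, 9, 10]
same_summary = summary(a) == summary(b)

# Add the same new value to each and compare the updated medians.
new_x = 7
median_a = statistics.median(a + [new_x])  # (5 + 7) / 2 = 6.0
median_b = statistics.median(b + [new_x])  # (5 + 6) / 2 = 5.5

print(same_summary, median_a, median_b)  # True 6.0 5.5
```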

As a more general comment, I am wondering why you are in this

situation. There is something apparently contradictory in your

description.

You say you can, at any time, obtain the summary statistics,

but you must then immediately delete them. But you are also

asking how you can update them from the summary statistics

of the previous step and the new data. But if you have deleted

them from the previous step, how can you use them later?

Furthermore, if you have the possibility at each step to perform

the kind of updating computation we are discussing, how is it

that you do not have the possibility simply to store the data,

or the sequence of summary statistics?

I am puzzled ... !

Best wishes,

Ted.

--------------------------------------------------------------------

E-Mail: (Ted Harding) <Ted.Harding@...>

Fax-to-email: +44 (0)870 094 0861

Date: 16-Jun-05 Time: 17:30:57

------------------------------ XFMail ------------------------------