
RE: [ai-geostats] Median computation in real time...

  • Ted Harding
    Message 1 of 2 , Jun 16, 2005
      On 16-Jun-05 Simone Sammartino wrote:
      > Dear list
      > a strange question
      > I've to estimate median value of a certain vector Z. Such
      > vector increases with time; it's like a box that is filled
      > any moment.
      > I can compute basic statistics (mean, median, min, max,
      > #value, std dev) but I can't preserve them. I mean that in
      > any moment I can use data to compute statistics but immediately
      > after I've to delete them....now, my question is...how to
      > update basic statistics in real time?...only knowing the basic
      > statistics of the previous step?...
      > For example:
      > minimum value = min(prev_step) - min(actual_step) if
      > min(actual_step) < min(prev_step)
      > maximum value = max(prev_step) + max(actual_step) if
      > max(actual_step) > max(prev_step)
      > mean value = {[mean(prev_step)*#value(prev_step)] +
      > sum(actual_step)}/[#value(prev_step)+#value(actual_step)]
      > but I can't imagine some similar algorithm for median and
      > standard deviation computation...
      > Any idea?
      > Thank you
      > Simone

      The standard deviation is easy: you need to construct the
      sum of the data and the sum of squares of the data at the
      previous step and at the current step.

      Therefore

      Previous step:

      sum(data) = mean(data) * number(data)

      sum(data^2) = (number(data)-1) * SD(data)^2
      + number(data) * mean(data)^2

      Current step:

      Add new X to sum(data), (new X)^2 to sum(data^2), and
      1 to number(data), then re-compute the SD:

      SD(data) = sqrt( (sum(data^2) - sum(data)^2 / number(data))
                       / (number(data) - 1) )
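
      The update above can be sketched in Python as follows. This is
      only an illustration, not code from the original exchange: the
      class name RunningStats and its methods are hypothetical, and it
      assumes the sample SD (divisor n-1). The running minimum and
      maximum are simply min(previous, new X) and max(previous, new X).

```python
import math


class RunningStats:
    """Hypothetical helper: keeps only summary values, never the raw data."""

    def __init__(self):
        self.n = 0            # number(data)
        self.total = 0.0      # sum(data)
        self.total_sq = 0.0   # sum(data^2)
        self.lo = math.inf    # running minimum
        self.hi = -math.inf   # running maximum

    @classmethod
    def from_summary(cls, n, mean, sd, lo, hi):
        """Rebuild the sums from the previous step's statistics."""
        rs = cls()
        rs.n = n
        rs.total = mean * n
        rs.total_sq = (n - 1) * sd ** 2 + n * mean ** 2
        rs.lo, rs.hi = lo, hi
        return rs

    def add(self, x):
        """Fold one new value X into the stored sums."""
        self.n += 1
        self.total += x
        self.total_sq += x * x
        self.lo = min(self.lo, x)
        self.hi = max(self.hi, x)

    def mean(self):
        return self.total / self.n

    def sd(self):
        """Sample SD (divisor n-1); undefined for fewer than 2 values."""
        if self.n < 2:
            return math.nan
        var = (self.total_sq - self.total ** 2 / self.n) / (self.n - 1)
        return math.sqrt(max(var, 0.0))  # guard against tiny negative rounding


# Usage: start from the previous step's statistics, then add the new X.
prev = RunningStats.from_summary(n=7, mean=31 / 7, sd=math.sqrt(16 / 7), lo=2, hi=7)
prev.add(9)
print(prev.n, prev.mean(), prev.sd(), prev.lo, prev.hi)
```

      One caveat: subtracting two large, nearly equal sums can lose
      precision when the mean is large relative to the SD, so an
      incremental scheme such as Welford's method may be preferable in
      that case.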


      However, updating the median is impossible from the information
      you will have in hand. The reason is that, given the previous
      median and the new X, you need to know how many of the previously
      obtained data values lie between the previous median and the
      new X (and the value of at least one of them). You do not have
      this information.
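
      A small check in Python illustrates the point. The two data sets
      below are made up for this example: they share the same number
      of values, mean, SD, min, max and median, yet adding the same
      new X gives different medians, so no formula based on those
      statistics alone can be right for both.

```python
import math
import statistics


def summary(data):
    """Return (count, mean, sample SD, min, max, median) for a data set."""
    return (len(data), statistics.mean(data), statistics.stdev(data),
            min(data), max(data), statistics.median(data))


# Two illustrative data sets with identical summary statistics.
a = [0, 2, 3, 5, 6, 9, 10]
b = [0, 1, 4, 5, 7, 8, 10]

same_summary = all(math.isclose(u, v) for u, v in zip(summary(a), summary(b)))

# Add the same new value to both and re-compute the median from the full data.
new_x = 8
median_a = statistics.median(a + [new_x])
median_b = statistics.median(b + [new_x])

print(same_summary, median_a, median_b)  # prints: True 5.5 6.0
```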

      As a more general comment, I am wondering why you are in this
      situation. There is something apparently contradictory in your
      description.

      You say you can, at any time, obtain the summary statistics,
      but you must then immediately delete them. But you are also
      asking how you can update them from the summary statistics
      of the previous step and the new data. But if you have deleted
      them from the previous step, how can you use them later?

      Furthermore, if you have the possibility at each step to perform
      the kind of updating computation we are discussing, how is it
      that you do not have the possibility simply to store the data,
      or the sequence of summary statistics?

      I am puzzled ... !

      Best wishes,
      Ted.


      --------------------------------------------------------------------
      E-Mail: (Ted Harding) <Ted.Harding@...>
      Fax-to-email: +44 (0)870 094 0861
      Date: 16-Jun-05 Time: 17:30:57
      ------------------------------ XFMail ------------------------------