
Re: [neat] Re: NEAT enhancements

  • Colin Green
    Message 1 of 17 , Sep 1, 2006
      Mike Woodhouse wrote:

      >--- In neat@yahoogroups.com, Colin Green <cgreen@...> wrote:
      >>Hi Mike, Emyr, All
      >>This definitely seems to be the most sound approach to take to me. It
      >>does introduce the problem of having to calculate (sample) the mean and
      >>standard deviation from a sample set of data which, as I was saying
      >>before, you should then avoid for training and testing purposes - which
      >>is a problem if, like me, you already have limited data.
      >Good point.

      I think I'm going to use a moving average and moving std. deviation. So
      there'll be an initial set of data used to sample the mean and standard
      deviation that I won't use to input into the ANN, but as I progress into
      the data for applying to the ANN I will update the moving variables as
      more data becomes available. Think of this as the sample size increasing
      and therefore the mean and std. deviation gradually getting closer to
      their 'true' values.
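A rough sketch of the expanding-sample idea described above, assuming the input is a plain list of daily percentage changes. The names `RunningStats` and `normalise_stream` are made up for illustration, and Welford's update is just one convenient way to keep the mean and std. deviation current without storing the whole history.

```python
import math


class RunningStats:
    """Expanding-sample mean and standard deviation (Welford's update)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Sample standard deviation; reported as 0.0 with fewer than two points.
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def normalise_stream(warmup, stream):
    """Seed the statistics from `warmup` (never fed to the ANN), then scale
    each later value using only the data seen before it."""
    stats = RunningStats()
    for x in warmup:
        stats.update(x)
    scaled = []
    for x in stream:
        sd = stats.std
        scaled.append((x - stats.mean) / sd if sd > 0 else 0.0)
        stats.update(x)  # the sample grows as more data becomes available
    return scaled
```

Each value is scaled before it is folded into the sample, so the network never sees an input normalised with information from that same day.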

      >>Currently I'm using the daily percentage change in price as an input
      >>signal as you suggested, but I use the raw figure as it's typically
      >>within the range -1 to 1 anyway (in fact more like +-0.1 of course),
      >I can't see anything wrong with that. The main thing is to get away
      >from input values that are scaled differently for different shares,
      >which is likely to confuse the dickens out of your networks.

      Indeed. Actually, what do you think of inputting the difference (in
      percentage terms) between the current price and a moving average, or a set
      of moving averages each defined over a different period? Potentially the
      ANN could recreate this sort of system given the price change each day,
      but if it is useful then why not supply it directly? I suppose the
      question arises of what moving average period to use. On the whole
      though I reckon it's a sound idea given that reversion to mean is a very
      real phenomenon in financial markets.
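For illustration, the moving-average input might be computed along these lines. This is only a sketch: `pct_from_moving_average` is a hypothetical helper, it uses a simple (unweighted) moving average that includes the current day's price, and the periods in the usage note are arbitrary examples rather than recommendations.

```python
def pct_from_moving_average(prices, period):
    """Fractional difference between each price and its simple moving average.

    Returns None for days that do not yet have `period` prices behind them.
    """
    result = []
    for i in range(len(prices)):
        if i + 1 < period:
            result.append(None)  # not enough history for this window yet
            continue
        window = prices[i + 1 - period : i + 1]
        average = sum(window) / period
        result.append((prices[i] - average) / average)
    return result
```

Calling it once per period, e.g. `[pct_from_moving_average(prices, p) for p in (5, 20, 100)]`, would give one input signal per moving average.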

      >I have always felt that having some idea of the distribution of data
      >points historically ought to lead to identifying the most appropriate
      >transformations. When you have a strange distribution, say one where
      >there are frequent small positive values and occasional large negative
      >ones, most "usual" transformations will compress the normal case into
      >a very small range which may in fact be removing information. I've
      >tried in the past applying a preprocessor that transforms inputs into
      >percentiles from a given historic range, so that the small normal
      >range is expanded. It might be better to apply some sort of continuous
      >function; this is an area where my knowledge needs some significant
      >increase, I'm afraid.
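One possible reading of the percentile preprocessor mentioned above, sketched under assumptions: `make_percentile_transform` is a hypothetical name, the historic range is a fixed list of past values, and ties are resolved with a mid-rank convention. The actual preprocessor may well have differed.

```python
from bisect import bisect_left, bisect_right


def make_percentile_transform(history):
    """Build a function mapping a value to its percentile (0..1) in `history`."""
    ordered = sorted(history)
    n = len(ordered)

    def transform(x):
        below = bisect_left(ordered, x)      # values strictly less than x
        at_or_below = bisect_right(ordered, x)  # values less than or equal to x
        # Mid-rank: a value tied with history entries sits halfway through them.
        return (below + at_or_below) / (2 * n)

    return transform
```

Because the output depends only on rank, a crowded band of small values is spread across most of the 0..1 range, while rare large values are squeezed towards the ends.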

      Perhaps you could use a moving sample set as mentioned above. I was actually
      thinking in that case (above) that the sample set would continue
      increasing forever, but actually traditional moving averages are
      calculated using a 'sliding window' sample set. Using such a technique, a
      long period of small movements would result in a 'moving' mean and std.
      dev. appropriate to the local (recent) pattern of price movements, but
      then the large spikes would be off the scale. Perhaps the answer is to
      have long and short term moving sample sets and to then have inputs for
      each of these sample sets? So you have today's price inputted in units of
      the short term std. dev. and the long term.
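A minimal sketch of the two-window idea, assuming the series is a list of daily changes. The function name `windowed_zscores` and the default window lengths are placeholders, and each value is scaled against the windows as they stood before that value arrived.

```python
from collections import deque
import statistics


def windowed_zscores(changes, short=5, long=20):
    """For each value, return (short-window z-score, long-window z-score).

    An entry is None until its window holds at least two earlier values.
    """
    short_win = deque(maxlen=short)  # old values drop off automatically
    long_win = deque(maxlen=long)
    rows = []
    for x in changes:
        row = []
        for win in (short_win, long_win):
            if len(win) < 2:
                row.append(None)
                continue
            mean = statistics.mean(win)
            sd = statistics.stdev(win)
            row.append((x - mean) / sd if sd > 0 else 0.0)
        rows.append(tuple(row))
        # Slide both windows forward only after scaling the current value.
        short_win.append(x)
        long_win.append(x)
    return rows
```

A spike after a quiet spell would then show up as a very large value on the short-window input and a more moderate one on the long-window input, giving the ANN both views.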

      >I have played around with the idea of extending the power of the input
      >node to include a genetically-coded input transformation. I haven't
      >managed to get beyond playing yet - there really do need to be at
      >least three or four hours in a day.

      Golden Rule: keep it simple!

