>--- In neat@yahoogroups.com, Colin Green <cgreen@...> wrote:

>>Hi Mike, Emyr, All
>>
>>This definitely seems to be the most sound approach to take to me. It
>>does introduce the problem of having to calculate (sample) the mean and
>>standard deviation from a sample set of data which, as I was saying
>>before, you should then avoid for training and testing purposes - which
>>is a problem if, like me, you already have limited data.
>
>Good point.

I think I'm going to use a moving average and moving std. deviation. So
there'll be an initial set of data used to sample the mean and standard
deviation that I won't use as input into the ANN, but as I progress into
the data for applying to the ANN I will update the moving variables as
more data becomes available. Think of this as the sample size increasing
and therefore the mean and std. deviation gradually getting closer to
their 'true' values.
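For what it's worth, the expanding-sample scheme described above can be sketched in a few lines of Python. Welford's online algorithm is my choice of method, not something from the thread, and the class and method names are hypothetical:

```python
class ExpandingNormalizer:
    """Expanding-sample mean and std. deviation, updated one value at a
    time via Welford's online algorithm."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        # Sample std. deviation; undefined until we have at least 2 points.
        return (self.m2 / (self.n - 1)) ** 0.5 if self.n > 1 else 0.0

    def normalize(self, x):
        s = self.std()
        return (x - self.mean) / s if s > 0 else 0.0
```

The intended use matches the post: seed the normalizer with an initial burn-in set that is never fed to the ANN, then keep calling `update()` as each new data point arrives, so the estimates drift toward their 'true' values.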

>>Currently I'm using the daily percentage change in price as an input
>>signal as you suggested, but I use the raw figure as it's typically
>>within the range -1 to 1 anyway (in fact more like +-0.1 of course),
>
>I can't see anything wrong with that. The main thing is to get away
>from input values that are scaled differently for different shares,
>which is likely to confuse the dickens out of your networks.

Indeed. Actually, what do you think of inputting the difference (in
percentage terms) between the current price and a moving average, or a
set of moving averages, each defined over a different period? Potentially
the ANN could recreate this sort of system given the price change each
day, but if it is useful then why not supply it directly? I suppose the
question arises of what moving average period to use. On the whole,
though, I reckon it's a sound idea given that reversion to the mean is a
very real phenomenon in financial markets.
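A minimal Python sketch of that idea, assuming simple moving averages and an illustrative set of periods (the thread deliberately leaves the choice of periods open):

```python
def ma_distance_inputs(prices, periods=(5, 20, 50)):
    """Percentage difference between the latest price and simple moving
    averages over several periods - one candidate ANN input per period.

    `periods` is illustrative only; the right values are an open question."""
    latest = prices[-1]
    inputs = []
    for p in periods:
        window = prices[-p:]          # last p prices
        ma = sum(window) / len(window)
        inputs.append((latest - ma) / ma)  # e.g. 0.05 = 5% above the MA
    return inputs
```

Because every value is a fractional distance from a moving average, the inputs are on a comparable scale across different shares, which addresses the scaling concern quoted above.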

>I have always felt that having some idea of the distribution of data
>points historically ought to lead to identifying the most appropriate
>transformations. When you have a strange distribution, say one where
>there are frequent small positive values and occasional large negative
>ones, most "usual" transformations will compress the normal case into
>a very small range which may in fact be removing information. I've
>tried in the past applying a preprocessor that transforms inputs into
>percentiles from a given historic range, so that the small normal
>range is expanded. It might be better to apply some sort of continuous
>function; this is an area where my knowledge needs some significant
>increase, I'm afraid.

Perhaps you could use a moving sample set as mentioned above. I was
actually thinking in that case (above) that the sample set would continue
increasing forever, but actually traditional moving averages are
calculated using a 'sliding window' sample set. Using such a technique, a
long period of small movements would result in a 'moving' mean and std.
dev. appropriate to the local (recent) pattern of price movements, but
then the large spikes would be off the scale. Perhaps the answer is to
have long and short term moving sample sets, and then to have inputs for
each of these sample sets? So you have today's price inputted in units of
both the short term std. dev. and the long term.
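As a rough Python sketch of that two-window idea (the window lengths are illustrative only, and the function names are mine):

```python
from collections import deque


def window_stats(window):
    """Mean and (population) std. deviation of a sliding-window sample set."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    return mean, var ** 0.5


# Hypothetical window lengths - the post leaves the choice open.
short_win, long_win = deque(maxlen=10), deque(maxlen=50)


def inputs_for(price_change):
    """Today's price change expressed in units of the short-term and
    long-term moving std. dev. - one ANN input per window."""
    short_win.append(price_change)
    long_win.append(price_change)
    out = []
    for win in (short_win, long_win):
        mean, std = window_stats(win)
        out.append((price_change - mean) / std if std > 0 else 0.0)
    return out
```

A spike that is off the scale of the short window still registers sensibly against the long window, and vice versa, which is the point of supplying both inputs.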

>I have played around with the idea of extending the power of the input
>node to include a genetically-coded input transformation. I haven't
>managed to get beyond playing yet - there really do need to be at
>least three or four hours in a day.

Golden Rule: keep it simple!
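Purely for illustration, the quoted idea of a genetically-coded input transformation might look something like the sketch below. The menu of transforms, the gene fields, and the mutation scheme are all my own assumptions, not anything from the thread:

```python
import math
import random

# Hypothetical menu of transforms an input-node gene could select from.
TRANSFORMS = {
    "identity": lambda x: x,
    "tanh": math.tanh,
    "signed_sqrt": lambda x: math.copysign(abs(x) ** 0.5, x),
}


class InputTransformGene:
    """A per-input gene: which transform to apply, and an input scale.
    Evolution would mutate both alongside the network's other genes."""

    def __init__(self, name="identity", scale=1.0):
        self.name, self.scale = name, scale

    def apply(self, x):
        return TRANSFORMS[self.name](self.scale * x)

    def mutate(self, rng=random):
        # Small chance of swapping the transform, else jitter the scale.
        if rng.random() < 0.1:
            self.name = rng.choice(list(TRANSFORMS))
        else:
            self.scale *= math.exp(rng.gauss(0.0, 0.1))
```

Whether the extra search-space complexity pays off is exactly the "keep it simple" trade-off raised above.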

Regards,

Colin