## Re: [GP] prerequisites for using GP

Expand Messages
• Hi Sergey, Given the limited size of your problem, my guess is that a standard GP system with arithmetic function nodes would quickly find a solution with very
Message 1 of 9 , Mar 30, 2009
Hi Sergey,

Given the limited size of your problem, my guess is that a standard GP
system with arithmetic function nodes would quickly find a solution
with very low error on the test set. However, it would likely display
serious overfitting to the data points at hand. That is, if you draw a
seventh data point from the same underlying distribution, there is a
large risk that it would not be adequately predicted by your evolved
expression tree (the error on the "validation set" would likely be
large).

This is something symbolic regression has in common with most
supervised learning algorithms, and unfortunately there is no simple
way around it. It just seems like you don't have enough data.

Julian

2009/3/30 Sergey Grechin <hq9000_vst@...>:
> Hello everybody!
>
> I hope this list is an appropriate place to ask my question.
>
> It's going to be rather lenthy mail since I need to put across some
>
> Imagine we have a system with N numerical inputs and 1 numerical output.
> Let's say that using GP I want to guess the implicit relation between
> inputs and output. The criterion for the better fitness will be the
> lesser summarized error of the candidate solution at K historical samples.
>
> Very classical statement.
>
> Then to be concrete let's say I have N=30 (number of inputs) and K=6
> (number of historical samples). As you see N is several times greater
> than K.
>
> Now, if I wanted to use linear regression (or any parametric model
> with about 30 parameters) - there would be now way to
> proceed since input data set is too small to resolve against 30
> unknown coefficients.
>
> But I can easily try to attack the problem with GP.
> Formally nothing prevents me from doing so.
>
> But if paramtric model fail to deal with this case -
> ****
>
> 1. what is so
> special about symbolic regression that it does allow us to do this?
>
> 2. is there any prerequisites for problem statement to be able to use
> GP?
>
> ****
>
> If we look at this as quantities of information - it's OK to derive
> values for 30 coefficients from 30 facts about the model (30 samples).
> But GP pretends to be able to find a formula interconnecting 30 inputs
> basing on 6 facts...
>
> I'm really sorry for not being able to describe my concerns more
> precisely but I think I've managed to conduct the feel of the
> problem.
>
> Thanks!
>
> Sergey
>
>

--
Julian Togelius
IDSIA
Galleria 2
6928 Manno-Lugano
Switzerland
julian@...
http://julian.togelius.com
http://www.idsia.ch/~togelius
+41-764-110679
+46-705-192088
• It sounds like any evolutionary search-based approach will overfit massively to such a small dataset. I would not even attempt this problem with GP.
Message 2 of 9 , Mar 30, 2009
It sounds like any evolutionary search-based approach will overfit
massively to such a small dataset.
I would not even attempt this problem with GP.

On Mon, Mar 30, 2009 at 5:22 AM, Sergey Grechin <hq9000_vst@...> wrote:
> Hello everybody!
>
> I hope this list is an appropriate place to ask my question.
>
> It's going to be rather lenthy mail since I need to put across some
>
> Imagine we have a system with N numerical inputs and 1 numerical output.
> Let's say that using GP I want to guess the implicit relation between
> inputs and output. The criterion for the better fitness will be the
> lesser summarized error of the candidate solution at K historical samples.
>
> Very classical statement.
>
> Then to be concrete let's say I have N=30 (number of inputs) and K=6
> (number of historical samples). As you see N is several times greater
> than K.
>
> Now, if I wanted to use linear regression (or any parametric model
> with about 30 parameters) - there would be now way to
> proceed since input data set is too small to resolve against 30
> unknown coefficients.
>
> But I can easily try to attack the problem with GP.
> Formally nothing prevents me from doing so.
>
> But if paramtric model fail to deal with this case -
> ****
>
> 1. what is so
> special about symbolic regression that it does allow us to do this?
>
> 2. is there any prerequisites for problem statement to be able to use
> GP?
>
> ****
>
> If we look at this as quantities of information - it's OK to derive
> values for 30 coefficients from 30 facts about the model (30 samples).
> But GP pretends to be able to find a formula interconnecting 30 inputs
> basing on 6 facts...
>
> I'm really sorry for not being able to describe my concerns more
> precisely but I think I've managed to conduct the feel of the
> problem.
>
> Thanks!
>
> Sergey
>
>
• I believe that there have been some successes using GP to analyze microarray data where the ratio of inputs to training data is similar although the scale is
Message 3 of 9 , Mar 30, 2009
I believe that there have been some successes using GP to analyze
microarray data where the ratio of inputs to training data is similar
although the scale is much larger (1000's to 10,000s of input
variables and only 100's of historical data points). You could look
at the techniques that have been used there. Although if you really
only have K=6 I think you're in trouble.

Terry Soule
Department of Computer Science
University of Idaho
Moscow, ID 83843
tsoule@...

On Mon, Mar 30, 2009 at 11:08 PM, Sergey Grechin <hq9000_vst@...> wrote:
> Hi,
>
> Thanks for opinion.
> Any idea where to find for reasonable approach to analyze such data?
> I have no option to get more observations.
>
> I think I can somehow fight this overfitting by harder constrains on
> complexity of the member of the population.
>
> What do you think?
>
> Sergey
>
>>It sounds like any evolutionary search-based approach will overfit
>>massively to such a small dataset.
>>I would not even attempt this problem with GP.
>
>
>
>
>
> ------------------------------------
>
>
>
>
>
• Hi, Thanks for opinion. Any idea where to find for reasonable approach to analyze such data? I have no option to get more observations. I think I can somehow
Message 4 of 9 , Mar 30, 2009
Hi,

Thanks for opinion.
Any idea where to find for reasonable approach to analyze such data?
I have no option to get more observations.

I think I can somehow fight this overfitting by harder constrains on
complexity of the member of the population.

What do you think?

Sergey

>It sounds like any evolutionary search-based approach will overfit
>massively to such a small dataset.
>I would not even attempt this problem with GP.
• On Tue, 31 Mar 2009, R. Muhammad Atif Azad wrote: I agree that your best hope is GP simplifying the functional mapping: it may make a large number of inputs
Message 5 of 9 , Mar 31, 2009

I agree that your best hope is GP simplifying the functional mapping: it may
make a large number of inputs redundant as well. One obvious benchmark may be
(with least squares regression) would be to keep the degree of the polynomial
at least less than the number of sample points (or else there is no data
compression achieved). To avoid overfitting in that case, you may also
consider ridge regression.

On Mon, 30 Mar 2009, Sergey Grechin wrote:

>>
>> Hi,
>>
>> Thanks for opinion.
>> Any idea where to find for reasonable approach to analyze such data?
>> I have no option to get more observations.
>>
>> I think I can somehow fight this overfitting by harder constrains on
>> complexity of the member of the population.
>>
>> What do you think?
>>
>> Sergey
>>
>> >It sounds like any evolutionary search-based approach will overfit
>> >massively to such a small dataset.
>> >I would not even attempt this problem with GP.
• Hi, I think if you just go to Google scholar and do a search on: microarrays genetic programming you should turn up a number of papers. Here s a few
Message 6 of 9 , Mar 31, 2009
Hi,

I think if you just go to Google scholar and do a search on:
microarrays genetic programming
you should turn up a number of papers. Here's a few citations:
@article{hong2006ccb,
title={{The classification of cancer based on DNA microarray data
that uses diverse ensemble genetic programming}},
author={Hong, J.H. and Cho, S.B.},
journal={Artificial Intelligence in Medicine},
volume={36},
number={1},
pages={43--58},
year={2006},
publisher={Elsevier}
}

@article{langdon2004gpm,
title={{Genetic programming for mining DNA chip data from cancer patients}},
author={Langdon, WB and Buxton, BF},
journal={Genetic Programming and Evolvable Machines},
volume={5},
number={3},
pages={251--257},
year={2004},
publisher={Springer}
}

Terry

On Tue, Mar 31, 2009 at 10:11 AM, Sergey Grechin <hq9000_vst@...> wrote:
> Hi Terry,
>
> Could you plase give a name of the paper or link or something that could
>
> Thanks
>
> Sergey
>
>> I believe that there have been some successes using GP to analyze
>> microarray data where the ratio of inputs to training data is similar
>> although the scale is much larger (1000's to 10,000s of input
>> variables and only 100's of historical data points). You could look
>> at the techniques that have been used there. Although if you really
>> only have K=6 I think you're in trouble.
>
>> Terry Soule
>> Department of Computer Science
>> University of Idaho
>> Moscow, ID 83843
>> tsoule@...
>
>> On Mon, Mar 30, 2009 at 11:08 PM, Sergey Grechin <hq9000_vst@...> wrote:
>>> Hi,
>>>
>>> Thanks for opinion.
>>> Any idea where to find for reasonable approach to analyze such data?
>>> I have no option to get more observations.
>>>
>>> I think I can somehow fight this overfitting by harder constrains on
>>> complexity of the member of the population.
>>>
>>> What do you think?
>>>
>>> Sergey
>>>
>>>>It sounds like any evolutionary search-based approach will overfit
>>>>massively to such a small dataset.
>>>>I would not even attempt this problem with GP.
>>>
>>>
>>>
>>>
>>>
>>> ------------------------------------
>>>
>>>
>>>
>>>
>>>
>>
>
>
>
> --
> Ñ óâàæåíèåì,
>  Sergey                          mailto:hq9000_vst@...
>
>
>
> ------------------------------------
>
>
>
>
>
• Hi Terry, Could you plase give a name of the paper or link or something that could help me to find more info on this research? Thanks Sergey ... -- Ñ
Message 7 of 9 , Mar 31, 2009
Hi Terry,

Could you plase give a name of the paper or link or something that could

Thanks

Sergey

> I believe that there have been some successes using GP to analyze
> microarray data where the ratio of inputs to training data is similar
> although the scale is much larger (1000's to 10,000s of input
> variables and only 100's of historical data points). You could look
> at the techniques that have been used there. Although if you really
> only have K=6 I think you're in trouble.

> Terry Soule
> Department of Computer Science
> University of Idaho
> Moscow, ID 83843
> tsoule@...

> On Mon, Mar 30, 2009 at 11:08 PM, Sergey Grechin <hq9000_vst@...> wrote:
>> Hi,
>>
>> Thanks for opinion.
>> Any idea where to find for reasonable approach to analyze such data?
>> I have no option to get more observations.
>>
>> I think I can somehow fight this overfitting by harder constrains on
>> complexity of the member of the population.
>>
>> What do you think?
>>
>> Sergey
>>
>>>It sounds like any evolutionary search-based approach will overfit
>>>massively to such a small dataset.
>>>I would not even attempt this problem with GP.
>>
>>
>>
>>
>>
>> ------------------------------------
>>
>>
>>
>>
>>
>

--
Ñ óâàæåíèåì,
Sergey mailto:hq9000_vst@...
• Salam mate, Can u give me a call at my number at 0323 444 9019? Best, Adil Raja ________________________________ From: R. Muhammad Atif Azad
Message 8 of 9 , Apr 1, 2009
Salam mate,
Can u give me a call at my number at 0323 444 9019?

Best,

________________________________
To: genetic_programming@yahoogroups.com
Sent: Tuesday, March 31, 2009 5:44:31 PM
Subject: Re: [GP] prerequisites for using GP

I agree that your best hope is GP simplifying the functional mapping: it may
make a large number of inputs redundant as well. One obvious benchmark may be
(with least squares regression) would be to keep the degree of the polynomial
at least less than the number of sample points (or else there is no data
compression achieved). To avoid overfitting in that case, you may also
consider ridge regression.

On Mon, 30 Mar 2009, Sergey Grechin wrote:

>>
>> Hi,
>>
>> Thanks for opinion.
>> Any idea where to find for reasonable approach to analyze such data?
>> I have no option to get more observations.
>>
>> I think I can somehow fight this overfitting by harder constrains on
>> complexity of the member of the population.
>>
>> What do you think?
>>
>> Sergey
>>
>> >It sounds like any evolutionary search-based approach will overfit
>> >massively to such a small dataset.
>> >I would not even attempt this problem with GP.

[Non-text portions of this message have been removed]
Your message has been successfully submitted and would be delivered to recipients shortly.