Re: [GP] Overfitting in GP

  • Howard Oakley
    Message 1 of 4, Aug 24, 2004
      On 24/8/04 4:20, Nitin Muttil wrote:

      > I am noticing that when the depth of the GP tree is
      > increased, the model performs better in training but worse on the
      > test data. I assume this is overfitting, something similar to
      > overfitting in neural networks when hidden layers/nodes are increased.
      >
      > 1) Is this actually overfitting?

      It sounds like it, Nitin, but you would need to look at the fits themselves
      to see whether it is.

      > If so, is there an optimal GP equation size,
      > or does it have to be fixed by trial and error?

      It is highly data-dependent, in my experience. One person's
      overfitting can be another person's invaluable information! This is
      no different from any other regression/forecasting problem, for which
      there is a very extensive literature.

      > 2) Can I get pointers to studies on finding optimal values of GP
      > parameters like equation size, population size, crossover and
      > mutation rates, etc.?

      In different applications, yes: see the GP bibliography, etc. But you
      are the judge of whether any of that will apply to your data.

      Assuming that your training dataset is noisy, I would suggest that
      your next step is to produce a noise-free training dataset; you may
      need to construct this by hand. Try training on that and then testing
      on the regular noisy data. Then vary tree depth and complexity,
      population size, etc., to see what effects those have. You will need
      not only to look at fitness and prediction error, but also to look at
      plots of the data. If your data has many outliers to which you do not
      want the model fitted, you may find it better to use an absolute
      deviation in the fitness function rather than the classical squared
      deviation, which of course tends to weight the fit towards the
      outliers. These issues and others are covered well in the GP and
      statistical literature.
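
      As a concrete illustration of that last point, here is a minimal
      sketch of the two error measures in Python; the function names and
      the use of plain lists are my own assumptions, not any particular
      GP package's API:

      def squared_deviation_fitness(predictions, targets):
          # Classical mean squared error: large residuals from outliers
          # dominate the total, pulling the fit towards them.
          return sum((p - t) ** 2
                     for p, t in zip(predictions, targets)) / len(targets)

      def absolute_deviation_fitness(predictions, targets):
          # Mean absolute error: each residual counts linearly, so
          # outliers influence the fit far less strongly.
          return sum(abs(p - t)
                     for p, t in zip(predictions, targets)) / len(targets)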

      I wish you success,
      Regards,
      Howard.

      Dr Howard Oakley
      The Works columnist for MacUser magazine (UK)
      http://www.macuser.co.uk/
      http://www.howardoakley.com/
    • Arjun Chandra
      Message 2 of 4, Sep 2, 2004
        Hi Nitin,

        Interesting problem. Yes, it seems like overfitting. Have you
        considered pruning, or even ensembles? You could also add a penalty
        term to the fitness function that penalises very deep trees.
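
        To sketch what I mean, assuming a minimising fitness (lower is
        better); the threshold, weight, and names here are illustrative
        choices, not any specific GP system's API:

        MAX_PREFERRED_DEPTH = 8  # illustrative threshold; tune per problem
        PENALTY_WEIGHT = 0.1     # illustrative weight; tune per problem

        def penalised_fitness(raw_error, tree_depth):
            # Add a cost proportional to how far the tree exceeds the
            # preferred depth; trees at or below the threshold pay nothing.
            excess = max(0, tree_depth - MAX_PREFERRED_DEPTH)
            return raw_error + PENALTY_WEIGHT * excess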

        Cheers,
        Arjun


        --- In genetic_programming@yahoogroups.com, "Nitin Muttil"
        <nitin.muttil@n...> wrote:
        > Dear GP list,
        >
        > I have been trying GP for harmful algal bloom (HAB) prediction.
        > To explain briefly, an HAB is an explosive growth of algae in
        > coastal waters, caused by the dumping of pollutants into those
        > waters. HABs can be toxic and thus may harm aquatic life and, in
        > some cases, even humans.
        >
        > I am evolving GP models using a training dataset and then testing
        > the models on unseen test data. I am noticing that when the depth
        > of the GP tree is increased, the model performs better in training
        > but worse on the test data. I assume this is overfitting,
        > something similar to overfitting in neural networks when hidden
        > layers/nodes are increased.
        >
        > My questions are:
        >
        > 1) Is this actually overfitting? If so, is there an optimal GP
        > equation size, or does it have to be fixed by trial and error?
        >
        > 2) Can I get pointers to studies on finding optimal values of GP
        > parameters like equation size, population size, crossover and
        > mutation rates, etc.?
        >
        > Thanks very much; any help would be highly appreciated.
        >
        > Best regards,
        > Nitin
      • Benjamin Scott
        Message 3 of 4, Sep 3, 2004
          Another thing you can do is keep about 15 different
          test datasets, then test the GPs against a randomly
          selected one each time. That way the evolved programs
          are far less likely to overfit any single test set.
          Remember: not every creature travels in the same shoes.
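
          Something along these lines, say in Python (the evaluate
          function and the list of held-out test sets are hypothetical
          stand-ins for your own code):

          import random

          def test_on_random_set(individual, test_sets, evaluate):
              # Each evaluation draws one held-out set at random, so no
              # individual gets selected for fitting one particular
              # test set.
              chosen = random.choice(test_sets)
              return evaluate(individual, chosen)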


