Re: technical question about custom number of sensors and output nodes

  • arman.schwarz
    Message 1 of 7 , Sep 5, 2011
      I also noticed that the Network::load_sensors function accepts a double pointer, so NEAT conceivably has no knowledge of the number of inputs that are available (unless a vector is passed to the function).

      So am I correct in understanding that the rtNEAT package will not try adding inputs beyond those specified in the input file? This would be unfortunate as I might have several hundred inputs and always using all of them in the genome start file would generate considerable computational (as well as human workload) overhead.

      However, contrary to this, you have previously mentioned in another post that FSNEAT is supported by rtNEAT, which does give me some hope. I'm sorry I've been somewhat unclear with my questions (I'm still in the process of getting my head around NEAT itself), but if the addition of sensors is supported, where does this take place in rtNEAT?

      Arman
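
The point about the pointer argument can be illustrated with a small sketch. It assumes, as Ken describes in the quoted replies below, that the number of values read is fixed by the network's own input list rather than by the caller's array. ToyNetwork and its members are hypothetical stand-ins for illustration, not rtNEAT's actual class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a network's input list; not rtNEAT's real class.
struct ToyNetwork {
    std::vector<double> inputs;  // one slot per sensor/bias node in the genome

    explicit ToyNetwork(std::size_t num_inputs) : inputs(num_inputs, 0.0) {}

    // Pointer overload: the array length is unknown here, so exactly
    // inputs.size() values are copied and the caller is trusted.
    void load_sensors(const double* values) {
        assert(values != nullptr);
        for (std::size_t i = 0; i < inputs.size(); ++i) {
            inputs[i] = values[i];
        }
    }

    // Vector overload: the length is known, so a mismatch can be rejected.
    bool load_sensors(const std::vector<double>& values) {
        if (values.size() != inputs.size()) {
            return false;
        }
        inputs = values;
        return true;
    }
};
```

In this sketch, a network built with three inputs that is handed a five-element array through the pointer overload silently ignores the last two values, which is the behaviour the question above is worried about.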

      --- In neat@yahoogroups.com, "arman.schwarz" <arman.schwarz@...> wrote:
      >
      > Thanks Ken,
      >
      > I tried to create a simple starting genome that simply connects the input to the output:
      >
      > genomestart 1
      > trait 1 1.0 0 0 0 0 0 0 0
      > node 1 1 1 1
      > node 1 0 1 2
      > gene 1 1 1 1.0 0 1 0 1
      > genomeend 1
      >
      > when I allow this to run it does a surprisingly good job at finding a solution, but it doesn't use the other inputs. That is to say, I end up with very complex networks that do a reasonable job of predicting the timeseries (but not as well as they could if they used all the inputs), but none of the nodes, other than the first, are ever input nodes.
      >
      > Supposing that I have 25 inputs, I am creating the following start genes file:
      >
      > genomestart 1
      > trait 1 1 0 0 0 0 0 0 0
      > node 1 0 1 1
      > node 2 0 1 1
      > node 3 0 1 1
      > node 4 0 1 1
      > node 5 0 1 1
      > node 6 0 1 1
      > node 7 0 1 1
      > node 8 0 1 1
      > node 9 0 1 1
      > node 10 0 1 1
      > node 11 0 1 1
      > node 12 0 1 1
      > node 13 0 1 1
      > node 14 0 1 1
      > node 15 0 1 1
      > node 16 0 1 1
      > node 17 0 1 1
      > node 18 0 1 1
      > node 19 0 1 1
      > node 20 0 1 1
      > node 21 0 1 1
      > node 22 0 1 1
      > node 23 0 1 1
      > node 24 0 1 1
      > node 25 0 1 1
      > node 26 0 0 2
      > node 27 0 0 3
      > node 28 0 0 0
      > gene 1 1 26 1 0 1 0 1
      > genomeend 1
      >
      > Will this ensure that NEAT makes use of any available inputs (assuming there are only 25)?
      >
      > Arman
      >
      > --- In neat@yahoogroups.com, "Ken" <kstanley@> wrote:
      > >
      > >
      > >
      > > Hi Arman, nodes.size() is not constant because nodes is a variable-length list, so it can grow or shrink. Asking for its size just gives the current number of nodes in the list. The "inlist" and "outlist" (lists of inputs and outputs internally generated by NEAT) work the same way, so they too can change. Therefore, in principle, I believe that your idea should be feasible. However, you need to make sure that, if you ever add new inputs, those new nodes have the right flags marking them as input nodes and sensors.
      > >
      > > There used to be a tutorial that came with NEAT that explained the file format, but it has gone out of date as NEAT was updated, so I believe unfortunately it's no longer included. Let me try to explain a little:
      > >
      > > Traits are reserved for special genetic information that nodes or connections can point to. They are not used in most NEAT experiments. I believe you just need one dummy trait in a genome file, but it won't be used.
      > >
      > > node 1 0 1 1
      > >
      > > The first number is the node ID #, the second is the trait pointer (which can be left at zero), the third says whether the node is NEURON/SENSOR (0,1), and the fourth is HIDDEN/INPUT/OUTPUT/BIAS (0,1,2,3). It's true that the third and fourth numbers are a little confusing because they seem to be about a similar issue, but that is how they are defined.
      > >
      > > gene 1 1 5 0.0 0 1 0 1
      > >
      > > The parameters are: trait # (not usually used), in_node id, out_node id, weight, is_recurrent flag, innovation_num, mutation_num, and enable flag.
      > >
      > > Note that mutation_num is generally set the same as the weight and does not have a real specific use in NEAT.
      > >
      > > As you may notice some parameters seem unnecessary or redundant, but I initially created this format before I was sure about everything that would be needed and the legacy parameters stuck around, which unfortunately can be confusing.
      > >
      > > ken
      > >
      > > --- In neat@yahoogroups.com, "arman.schwarz" <arman.schwarz@> wrote:
      > > >
      > > >
      > > >
      > > > Thanks Ken,
      > > >
      > > > I think I understand what you mean. So when a new genome is created, NEAT will search through the "nodes" vector to look for candidates. The nodes vector in turn is generated initially by what it finds in the call to this function during the initial creation of the population:
      > > >
      > > > Genome::Genome(int id, std::ifstream &iFile)
      > > >
      > > > Does this mean that nodes.size() is constant and initially constrained to what is given in the genome start file, or can it adapt depending on the size of the array passed to the "load_sensors" function?
      > > >
      > > > The reason I ask is because I have about 200 inputs, and I would like to create a start file which simply takes the first input as output, and tries to improve based on that, preferably allowing me to use any number of inputs with a single genome start file. Can I do that without destroying NEAT's ability to recognise the existence of all the sensors?
      > > >
      > > > I'm also somewhat confused about the format of those start files, is there any documentation for these files that I can learn from, or will I just need to look through the source code? I understand NEAT's concept of bias nodes, but not so much the idea of "traits", so it would be nice to know if any documentation does exist.
      > > >
      > > > Thanks again for your help.
      > > > Arman
      > > >
      > > > --- In neat@yahoogroups.com, "Ken" <kstanley@> wrote:
      > > > >
      > > > >
      > > > >
      > > > > Hi Arman, from what you wrote I think you understand how to pass in and read out the input and output arrays. I think the main question you are asking is how it knows how long those arrays are.
      > > > >
      > > > > The answer is that rtNEAT (or plain NEAT) C++ counts the number of inputs and number of outputs when it creates a neural network from a genome, which happens in the method
      > > > >
      > > > > Network *Genome::genesis(int id)
      > > > >
      > > > > When it creates the network, it creates separate lists of inputs, hidden nodes, and outputs, as you can see here:
      > > > >
      > > > > //Check for input or output designation of node
      > > > > if (((*curnode)->gen_node_label)==INPUT)
      > > > >     inlist.push_back(newnode);
      > > > > if (((*curnode)->gen_node_label)==BIAS)
      > > > >     inlist.push_back(newnode);
      > > > > if (((*curnode)->gen_node_label)==OUTPUT)
      > > > >     outlist.push_back(newnode);
      > > > >
      > > > > Then it knows how many inputs there are and how many outputs there are because it knows how long those lists are.
      > > > >
      > > > > Going back further in the chain, usually the first place this issue will ultimately be specified is in the starter genome file, which is usually the origin of the genome data structures (which are created from this file). Here is a pole balancing starter genome:
      > > > >
      > > > > genomestart 1
      > > > > trait 1 0.1 0 0 0 0 0 0 0
      > > > > node 1 0 1 1
      > > > > node 2 0 1 1
      > > > > node 3 0 1 1
      > > > > node 4 0 1 3
      > > > > node 5 0 0 2
      > > > > gene 1 1 5 0.0 0 1 0 1
      > > > > gene 1 2 5 0.0 0 2 0 1
      > > > > gene 1 3 5 0.0 0 3 0 1
      > > > > gene 1 4 5 0.0 0 4 0 1
      > > > > genomeend 1
      > > > >
      > > > > Here the numerical codes next to the nodes specify what type of nodes they are. In this case, there are 4 inputs (one is a bias) and 1 output. These codes are defined in nnode.h:
      > > > >
      > > > > enum nodetype {
      > > > >     NEURON = 0,
      > > > >     SENSOR = 1
      > > > > };
      > > > >
      > > > > enum nodeplace {
      > > > >     HIDDEN = 0,
      > > > >     INPUT = 1,
      > > > >     OUTPUT = 2,
      > > > >     BIAS = 3
      > > > > };
      > > > >
      > > > > So usually that is from where NEAT ultimately ends up knowing these counts. I hope that helps answer your question.
      > > > >
      > > > > ken
      > > > >
      > > > >
      > > > > --- In neat@yahoogroups.com, "arman.schwarz" <arman.schwarz@> wrote:
      > > > > >
      > > > > > hi everyone,
      > > > > > I am currently trying to apply the rtNEAT C++ package to time-series prediction, to compare its search performance to a fixed-size, evolving topology recurrent neural net I wrote.
      > > > > >
      > > > > > I have written a custom experiments.h/cpp file to cycle through the activate() function for every time step, after loading an array of ~75 sensor inputs.
      > > > > >
      > > > > > So while I understand how to load the inputs and retrieve the outputs, I don't quite understand how I am supposed to tell NEAT how many inputs and outputs there are. After all, the array I'm passing to load_sensors is nothing more than a pointer to a value.
      > > > > >
      > > > > > so where do I tell NEAT how many inputs I will give each net, and how many outputs I will be expecting? (in my RNN the outputs were simply the last x neurons, but the inputs do need to be defined).
      > > > > >
      > > > > > thanks in advance for the help!
      > > > > >
      > > > >
      > > >
      > >
      >
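
The node and gene line layouts described in the quoted replies above can be sketched as a small reader that counts inputs and outputs the way the quoted genesis snippet fills inlist and outlist (bias nodes count as inputs). This is only an illustration under those stated field orders; GenomeCounts and count_genome are hypothetical names, not part of the NEAT package.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Values copied from the nodeplace enum quoted in the thread.
enum NodePlace { HIDDEN = 0, INPUT = 1, OUTPUT = 2, BIAS = 3 };

// Hypothetical summary type for this sketch.
struct GenomeCounts {
    int inputs = 0;         // INPUT and BIAS nodes, mirroring the inlist logic
    int outputs = 0;        // OUTPUT nodes
    int hidden = 0;         // HIDDEN nodes
    int enabled_genes = 0;  // connection genes whose enable flag is nonzero
};

// Reads "node <id> <trait> <type> <place>" and
// "gene <trait> <in> <out> <weight> <recur> <innov> <mut> <enable>" lines;
// every other line (genomestart, trait, genomeend) is ignored.
inline GenomeCounts count_genome(std::istream& in) {
    GenomeCounts counts;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string tag;
        if (!(fields >> tag)) continue;  // skip blank lines
        if (tag == "node") {
            int id, trait, type, place;
            if (!(fields >> id >> trait >> type >> place)) continue;
            if (place == INPUT || place == BIAS) ++counts.inputs;
            else if (place == OUTPUT) ++counts.outputs;
            else if (place == HIDDEN) ++counts.hidden;
        } else if (tag == "gene") {
            int trait, in_node, out_node, recur, enable;
            double weight, innov, mut;
            if (!(fields >> trait >> in_node >> out_node >> weight
                         >> recur >> innov >> mut >> enable)) continue;
            if (enable != 0) ++counts.enabled_genes;
        }
    }
    return counts;
}
```

Running this over the pole balancing starter genome quoted above should report 4 inputs (three sensors plus the bias), 1 output, and 4 enabled genes, matching Ken's description.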
    • Ken
      Message 2 of 7 , Sep 10, 2011
        Arman, yes, NEAT will only ever use sensors that are included in the starter genome file. However, if those sensors are not connected to the network by a connection, then they will not actually cause any activation of the network. If NEAT later adds connections from such sensors, then they would start to be used by the network. So in other words, you can implement a kind of FS-NEAT by not connecting the sensors to the network.

        I believe the rtNEAT package has some population initialization options that allow a random subset of sensors (from the total set that are in the starter genome) to be connected into the network in each genome in the initial population.

        In any case, if you want several hundred inputs to be considered, they would indeed need to be included in the starter file, but they should not cause great computational overhead because they are not necessarily connected into the network itself. However, at the same time, I would also note that hundreds of inputs is a lot for NEAT to consider and it may take a long time for NEAT to figure out which ones to use.

        ken
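
Writing several hundred node lines by hand is tedious, so the FS-NEAT-style starter file suggested above could be generated instead. The sketch below follows the node and gene field orders used in the pole balancing example from this thread; the function name make_sparse_starter_genome, the trait values, and the choice to connect only the first sensor to a single output are assumptions made for illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds starter-genome text with num_sensors input nodes, one bias node and
// one output node, where only sensor 1 is connected to the output.
// All remaining sensors are listed but left unconnected.
inline std::string make_sparse_starter_genome(int num_sensors) {
    assert(num_sensors >= 1);
    const int bias_id = num_sensors + 1;
    const int output_id = num_sensors + 2;

    std::ostringstream out;
    out << "genomestart 1\n";
    out << "trait 1 0.1 0 0 0 0 0 0 0\n";        // single dummy trait
    for (int id = 1; id <= num_sensors; ++id) {
        out << "node " << id << " 0 1 1\n";      // SENSOR (1), INPUT (1)
    }
    out << "node " << bias_id << " 0 1 3\n";     // SENSOR (1), BIAS (3)
    out << "node " << output_id << " 0 0 2\n";   // NEURON (0), OUTPUT (2)
    // gene: trait, in_node, out_node, weight, recur, innovation, mutation, enable
    out << "gene 1 1 " << output_id << " 0.0 0 1 0 1\n";
    out << "genomeend 1\n";
    return out.str();
}
```

Writing the returned string to a file gives a starter genome in which every sensor is declared, so NEAT can later add connections from any of them, while the initial network stays small.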
