Neural Net Capabilities
- AIMA 2nd edition page 744 claims "In fact, with a single, sufficiently
large hidden layer, it is possible to represent any continuous function
of the inputs with arbitrary accuracy."
What is sufficiently large? Can anyone provide sample code or
pseudocode for a network that would learn the function Z = X*Y for
random inputs X and Y in the range 0..100? Training values would be
exact to the limits of machine precision. Output Z should be accurate
to within 0.01 or 0.1% of the function value, whichever is greater.
It appears to me that the neural units do fine at computing sums of
functions of X or Y, but not products. Have I missed something?
- I'm still interested in Neural Net Capabilities. See post 253.
Can anyone provide a constructive proof of the claim from page 744 of
AIMA 2nd edition, "In fact, with a single, sufficiently large hidden
layer, it is possible to represent any continuous function of the
inputs with arbitrary accuracy"? Or, alternatively, a net that is able
to learn z = x*y over the range -10 < x, y < 10 with an accuracy of
abs(z - x*y) < 0.1? Or, more difficult, a net that computes x and y
from r and theta, where x = r * cos(theta) and y = r * sin(theta)?
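One way to see that products are within reach of a single hidden layer
is the identity x*y = ((x+y)^2 - (x-y)^2) / 4: each hidden unit only
needs a weighted sum of the inputs, and the output unit only needs a
weighted sum of hidden units. The sketch below is a hand-built net, not
a learned one, and it assumes ReLU hidden units and a linear output
unit (neither is specified in the question). The names
build_product_net and forward are made up for this illustration. It
interpolates t^2 piecewise linearly with knot spacing h, which keeps
the product error at or below h^2/16.

```python
import numpy as np

def build_product_net(half_range=10.0, h=1.0):
    """Weights of a one-hidden-layer ReLU net for x*y on [-half_range, half_range]^2.

    Assumption: ReLU hidden units and a linear output unit.
    """
    lo = -2.0 * half_range                 # x+y and x-y both lie in [lo, -lo]
    n = int(round(-2.0 * lo / h))          # number of linear pieces
    knots = lo + h * np.arange(n)          # left endpoints t_0 .. t_{n-1}
    # s(t) = t_0^2 + m_0*relu(t - t_0) + 2h * sum_{k>=1} relu(t - t_k)
    # interpolates t^2 at the knots, with 0 <= s(t) - t^2 <= h^2 / 4.
    coef = np.full(n, 2.0 * h)
    coef[0] = 2.0 * lo + h                 # m_0, the slope of the first piece
    W = np.vstack([np.tile([1.0, 1.0], (n, 1)),    # units that see x + y
                   np.tile([1.0, -1.0], (n, 1))])  # units that see x - y
    b = np.concatenate([-knots, -knots])   # hidden biases
    v = np.concatenate([coef, -coef]) / 4.0  # output weights; t_0^2 terms cancel
    return W, b, v

def forward(W, b, v, x, y):
    inputs = np.stack([x, y], axis=-1)           # shape (..., 2)
    hidden = np.maximum(0.0, inputs @ W.T + b)   # ReLU hidden layer
    return hidden @ v                            # linear output unit

W, b, v = build_product_net()
rng = np.random.default_rng(0)
x = rng.uniform(-10.0, 10.0, 20_000)
y = rng.uniform(-10.0, 10.0, 20_000)
max_err = np.max(np.abs(forward(W, b, v, x, y) - x * y))
print(len(v), "hidden units, max sampled error:", max_err)
```

With h = 1 this uses 80 hidden units and the bound h^2/16 = 0.0625 is
already under the requested 0.1. Halving h doubles the number of units
and cuts the bound by a factor of four. Whether gradient-based training
would find weights like these is a separate question.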
I doubt the solution will involve any learning at all. I guess instead
that it will show how to construct the net directly (or show that
3-layer nets are Turing complete and rely on the problem's answer
being Turing-computable). It's an interesting question, though, whether
it is possible to learn any such approximation.
I'm not sure if the general solution looks like the following. Sorry
that I don't use more standard notation. I don't quite have the hang of
(I'm somewhat unsure of the math in the next paragraph.)
Assume that each of the input values x1, x2, ..., xn is drawn from a
closed, bounded interval I1, I2, ..., In, each Ii a subinterval of R.
Assume also that you want the output to be within some epsilon > 0 of
the value of some everywhere continuous, n-ary function
f: R x R x ... x R --> R. Since f is continuous on a closed, bounded
domain, it is uniformly continuous there, so it's possible to divide
the entire domain (I1 x I2 x ... x In) into finitely many sections C1,
C2, ..., Ck such that for each section Ci there exists some Li such
that f is within epsilon of Li everywhere within the section. (Note
each section Ci is the Cartesian product of n subintervals.)
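To get a feel for how many sections this takes, here is a rough sketch
for George's f(x, y) = x*y on -10 < x, y < 10. It assumes something
stronger than continuity, namely a known bound on the partial
derivatives (10 for this f on this domain), and it picks Li as the
value of f at the centre of section Ci. The helper names
cells_per_axis and piecewise_constant are invented for this example.

```python
import math
import numpy as np

def cells_per_axis(lo, hi, grad_bound, n_inputs, eps):
    """Sections per axis so f stays within eps of its value at each section centre.

    Assumption: every partial derivative of f is bounded by grad_bound, so
    inside a cube of side d, |f - f(centre)| <= n_inputs * grad_bound * d / 2.
    """
    return math.ceil((hi - lo) * n_inputs * grad_bound / (2.0 * eps))

def piecewise_constant(f, lo, hi, m):
    """Approximate a 2-input f by the constant L_i = f(centre) on each of m*m sections."""
    side = (hi - lo) / m
    def centre(t):
        index = np.clip(np.floor((t - lo) / side), 0, m - 1)
        return lo + (index + 0.5) * side
    return lambda x, y: f(centre(x), centre(y))

f = lambda x, y: x * y

# Sections needed for the accuracy George asked for (eps = 0.1).
m_fine = cells_per_axis(-10.0, 10.0, grad_bound=10.0, n_inputs=2, eps=0.1)
print("eps = 0.1:", m_fine, "per axis,", m_fine ** 2, "sections in total")

# Numerical check at a coarser accuracy (eps = 1.0) to keep it cheap.
m = cells_per_axis(-10.0, 10.0, grad_bound=10.0, n_inputs=2, eps=1.0)
approx = piecewise_constant(f, -10.0, 10.0, m)
rng = np.random.default_rng(0)
x = rng.uniform(-10.0, 10.0, 100_000)
y = rng.uniform(-10.0, 10.0, 100_000)
worst = np.max(np.abs(approx(x, y) - f(x, y)))
print("eps = 1.0:", m, "per axis, worst sampled error", worst)
```

Under these assumptions eps = 0.1 needs on the order of 2000 sections
per axis, so millions of sections in total. That gives one concrete
(and pessimistic) reading of "sufficiently large" for a piecewise
constant construction.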
To build a 3-layer net to approximate f within epsilon: (I assume that
each neuron may have its own threshold level. If the weighted sum of
the inputs to the neuron is less than this level, it outputs zero;
otherwise, its output is its activation value.)
1) For each of the sections, Ci, with corresponding output value, Li,
add an output node labeled Fi, with activation value Li and threshold
2) For each input variable xi, add an input node labeled Xi. Take the
subintervals corresponding to variable xi in the sections C1, C2, ...,
Ck. Without loss of generality, assume that none of the subintervals
overlaps another. Let ci be the lower bound of the interval
corresponding to xi in Ci. Add a hidden unit labeled Hi with threshold
ci and activation value 1. Add a connection with weight 1 from Xi to
Hi. Add a connection with weight 1 from Hi to Fi.
3) For each hidden unit Hi with threshold ci, add a connection to any
output node Fj if the threshold, cj, of the corresponding hidden unit
Hj is less than ci. Make the weight of this connection -1.
(We could've made all of the neurons in the above have the same
threshold, but the description was complicated enough already.)
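Here is a small sketch of how I read steps 1) to 3) for a single input
variable. Several things in it are assumptions rather than part of the
description above: the output nodes use a threshold of 0.5, Li is taken
as f at the midpoint of section Ci, the sections have equal width, and
the net's answer is read off as the sum of the output nodes (only one
of them fires for any input). The function names are made up.

```python
import numpy as np

def build_net(f, lo, hi, k):
    """Split [lo, hi] into k equal sections; return lower bounds c_i and values L_i."""
    width = (hi - lo) / k
    c = lo + width * np.arange(k)      # c_i: hidden-unit thresholds
    L = f(c + width / 2.0)             # L_i: assumed to be f at the section midpoint
    return c, L

def run_net(c, L, x, out_threshold=0.5):
    """Evaluate the threshold net on a scalar input x (out_threshold is an assumption)."""
    # Hidden unit H_i: weight 1 from the input, threshold c_i, activation value 1.
    H = (x >= c).astype(float)
    # Output node F_j: +1 from H_j, -1 from every H_i whose threshold c_i > c_j.
    inhibit = (c[None, :] > c[:, None]).astype(float)   # inhibit[j, i] = 1 if c_i > c_j
    net_input = H - inhibit @ H
    # F_j outputs its activation value L_j when its input reaches the threshold.
    F = np.where(net_input >= out_threshold, L, 0.0)
    return F.sum()                     # exactly one F_j is nonzero for lo <= x < hi

square = lambda t: t * t
c, L = build_net(square, -10.0, 10.0, k=400)
rng = np.random.default_rng(0)
samples = rng.uniform(-10.0, 10.0, 200)
worst = max(abs(run_net(c, L, x) - square(x)) for x in samples)
print("worst sampled error:", worst)
```

With 400 sections of width 0.05 the error for t^2 on [-10, 10] stays
at or below 20 * 0.025 = 0.5, since the slope of t^2 is at most 20
there. A tighter epsilon just means more sections, and so more hidden
and output nodes.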
I'd also be interested in seeing the solution to the problem posed by
George if someone has it. Thanks.
On Dec 20, 2003, at 9:30 AM, georgel360 wrote:
> I'm still interested in Neural Net Capabilities. See post 253
> Can any one provide a constructive proof of the claim from page 744 of
> AIMA 2nd edition, "In fact, with a single, sufficiently large hidden
> layer, it is possible to represent any continuous function of the
> inputs with arbitrary accuracy" or alternatively a net that is able to
> learn z = x*y over the range -10 < x, y <10 with an accuracy of
> abs(z-xy)<0.1? or more difficultly compute x,y from r and theta where
> x is r * cos(theta) and y is r * sin(theta)?