## Re: [arcondev] MIST = fog?

Message 1 of 9, Nov 22, 2000
>Hi, I've just been reading the MIST paper, and I suspect that it may be
>fundamentally flawed... <snip>

>Let's assume that you can somehow obtain a distinctive probability
>distribution from intelligent responses, from the stream of bits, - what is
>to stop you artificially making a bit generator that has the the same
>probability distribution? <snip>

the probability distribution is NOT distinctive, which is the whole point! it
is .5/.5... and it is very easy to make an artificial system that generates
this probability distribution: it's called a coin!

for the record:

A Minimum Intelligent Signal Test is simply the maximum abstraction of the
turing test. it is able to statistically classify an unknown system as
either human, random or evasive and it works as follows:

1: collect and validate a pool (corpus) of binary items (i call them
mindpixels now) that require human intelligence/experience/consciousness to
respond to and that have a stable response. ("water is wet" has a stable
human response of true - "i am male" does not have a stable response - 1/2
say true and 1/2 say false) you will note, this is the whole point of the
mindpixel project - to build the largest corpus of MIST items possible.

2: from your pool, draw a test of at least 20 random items (central limit
theorem) such that 50% have a human response of true, and 50% have a human
response of false.

3: present these items to the unknown system in random order (note that the
probability distribution is .5/.5 - 1/2 true and 1/2 false)

4: calculate the probability that the system is human, random or evasive.
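steps 2 and 3 can be sketched in a few lines of Python - a minimal sketch, assuming hypothetical true/false item pools rather than real mindpixels:

```python
import random

def draw_mist(corpus_true, corpus_false, n=20):
    """Draw a balanced MIST: half items whose human response is true,
    half whose human response is false, presented in random order."""
    assert n % 2 == 0 and n >= 20  # at least 20 items, per the central limit argument
    test = random.sample(corpus_true, n // 2) + random.sample(corpus_false, n // 2)
    random.shuffle(test)  # step 3: random presentation order
    return test

# hypothetical stand-in items, not from the real corpus
trues = ["water is wet", "fire is hot"] * 10
falses = ["ice is hot", "rocks are soft"] * 10
test = draw_mist(trues, falses)
```

the draw guarantees the .5/.5 distribution that makes the test uninformative to anything that cannot model human responses.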

here are a few examples with a test of twenty items
- you can calculate these probabilities yourself here:
http://vassun.vassar.edu/~lowry/zbinom.html
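if that link is dead, the same exact two-tailed binomial values can be computed directly; here is a minimal Python sketch of my own (the online calculator may differ in the last digit for some k):

```python
from math import comb

def two_tailed_binom(n, k, p=0.5):
    """Exact two-tailed binomial probability of a result at least as
    extreme as k matches out of n, under chance level p."""
    upper = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))  # P(X >= k)
    lower = sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))     # P(X <= k)
    return min(1.0, 2 * min(upper, lower))

for k in (10, 11, 14, 15, 20):
    print(f"k={k}: p={two_tailed_binom(20, k):.4f}")
```

running it reproduces the five cases below.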

case 1:
-------
n=20 (number of items in test)
k=10 (number of times the system's response matched the human response)
p=.5 (probability that a random item from the test is true)
q=.5 (probability that a random item from the test is false)

two-tailed p=1.0 the probability that the system is random is 1.0 or 100% -
this is the result you would expect if you flipped a coin for each item.

case 2:
-------
everything is the same except, k=11

p=.83 there is an 83% chance that the system is random - pretty good bet it
is not human!

case 3:
------
k=14

p=.12 there is now just a 12% chance this is a random system.

case 4:
------
k=15

p=.04 whoa! big difference! in fact a scientific difference. the probability
that this is a random system (coin) is now less than .05. this is the normal
scientific publication standard. if you had an artificial system that got
15/20 items correct every time you presented it with 20 items randomly
selected from a very large pool, you could say scientifically that it is
statistically human.

case 5:
-------
k=20

p<.000002 - in fact the probability that a random system responds to
all 20 items correctly is 1 in 1,048,576, the exact same probability that you
flip a fair coin heads (or tails) 20 times in a row. the answer is exactly
the same for k=0 - simply because it takes a human to answer every item
incorrectly too (to be evasive) - a coin isn't smart enough to be completely
dumb!

try this:

n=200,000 (this is the number of validated mindpixels i have right now)
k=100,450

p=.04!

100,450 is the minimum score out of 200,000 that science will allow you to
say is statistically significant - 450 correct items above background is the
minimum intelligent signal for a 200,000 item test.
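the 200,000-item figure can be checked with the normal approximation to the binomial (my own sketch; at this n the exact test gives essentially the same numbers):

```python
from math import erfc, sqrt

def approx_two_tailed(n, k):
    """Normal approximation to the two-tailed binomial p-value at chance .5."""
    z = abs(k - n / 2) / sqrt(n / 4)   # standard deviations above background
    return erfc(z / sqrt(2))

print(approx_two_tailed(200_000, 100_450))   # below the .05 publication standard
print(approx_two_tailed(200_000, 100_400))   # not significant - inside the noise
```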

hope this clears up what a MIST is and how it works.

chris.

_____________________________________________________________________________________
Message 2 of 9, Nov 23, 2000
Thanks Chris, I found it a lot clearer put like this. However, I still have
major doubts - let's say you've got your corpus system with 10^10
stable questions/responses and you use this to test candidate X. Now (after
a large number of question/responses) you find that candidate X has a
statistically significant set of replies. What if candidate X is just
another (perhaps identical) corpus system - would you say this had passed
the Turing Test? I would have thought (correct me if I'm wrong) that there
was more of a 'conversational' element to the Turing Test, where a human can
over time detect chains of thought in the candidate's responses, whereas the
test you describe could be passed by a machine with a comparable number of
questions/responses hard coded. If the Turing Test *does* reduce to a system
as you describe (I can't ever remember seeing it strictly defined - maybe
you have a good link?), then I would suggest that it isn't a very good test
of intelligence.

Cheers,
Danny.


Artificial Consciousness Development List Recommended Reading:
==============================================================
Alan Turing: The Enigma (May 2000 Edition)
Andrew Hodges
Amazon \$14.36
http://www.amazon.com/exec/obidos/ASIN/0802775802/mindpixecorporat
==============================================================
The Symbolic Species : The Co-Evolution of Language and the Brain
Terrence W. Deacon
Amazon \$12.76
http://www.amazon.com/exec/obidos/ASIN/0393317544/mindpixecorporat
==============================================================
Understanding Language Understanding : Computational Models of Reading
Ashwin Ram and Kenneth Moorman (eds.)
Amazon \$50.00
http://www.amazon.com/exec/obidos/ASIN/0262181924/mindpixecorporat
==============================================================
Foundations of Statistical Natural Language Processing
Christopher Manning, Hinrich Schutze
Amazon \$60.00
http://www.amazon.com/exec/obidos/ASIN/0262133601/mindpixecorporat
==============================================================
Rethinking Innateness : A Connectionist Perspective on Development
Jeffrey L. Elman et al
Amazon \$20.00
http://www.amazon.com/exec/obidos/ASIN/026255030X/mindpixecorporat
==============================================================
Exercises in Rethinking Innateness : A Handbook for Connectionist
Simulations
Jeffrey L. Elman et al
Amazon \$45.00
http://www.amazon.com/exec/obidos/ASIN/0262661055/mindpixecorporat
==============================================================

I strongly recommend the above books to everyone on this list.
(Note: The Amazon commission goes directly to fund the
MindPixel/ai@home Project) Chris McKinstry, Arcondev Moderator
http://www.mindpixel.com/chris
Message 3 of 9, Nov 23, 2000
>Thanks Chris, I found it a lot clearer put like this. However, I still
>have major doubts <snip>
>
>Cheers,
>Danny.

the problem with the turing test is its conversational style - it is
difficult to decide if a response is intelligent or not - the judgement is
subjective, non-scientific and varies from person to person. and yes, it's
possible that the corpus is just testing a copy of itself (you can cheat at
any test if you have all the answers), and this possibility would have to be
eliminated after the test - you wouldn't be able to publish the results
unless you could prove that you didn't cheat by using a copy of the corpus
(see below).

the greatest advantage we get by doing it the MIST way over the turing test
way is that we can build a large corpus of items with known valid answers
and build an automated test - a test which we can apply against an evolving
artificial entity billions and even trillions of times until it learns to
get the responses right. you simply can't do this with a turing test because
you need live people.

so, if i have a billion item MIST, and i pick a random 900 million items
from the corpus and test a population of artificial evolving entities (say
some sort of neural net specified by a genetic algorithm) - in the first
generation they will all be statistically random, but if the population is
big enough (again 20 or more - central limit theorem) there will be some that
perform, by CHANCE, just a little better than all the others. we breed the
best ones together and discard the rest.

now we repeat the process, over and over. over time, the performance
will get better and better and at some point you will discover an entity
that performs better than the .05 level - it's not as intelligent as a
human, but it is not random - it does know something about being a human in a
very general way.
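the breed-and-discard loop above can be sketched as a toy genetic algorithm - everything here (population size, mutation rate, a 200-bit stand-in for the corpus) is my own illustrative choice, not from the project:

```python
import random

random.seed(0)
ANSWERS = [random.randint(0, 1) for _ in range(200)]   # stand-in for validated human responses

def score(entity):
    """Number of items where the entity's response matches the human response."""
    return sum(e == a for e, a in zip(entity, ANSWERS))

def breed(a, b, mutation=0.01):
    """Uniform crossover of two parents plus a rare bit-flip mutation."""
    return [bit ^ (random.random() < mutation)
            for bit in (random.choice(pair) for pair in zip(a, b))]

# generation 1: a statistically random population
pop = [[random.randint(0, 1) for _ in ANSWERS] for _ in range(40)]
first_best = max(score(e) for e in pop)

for generation in range(60):
    pop.sort(key=score, reverse=True)
    elite = pop[:10]                                   # keep the best, discard the rest
    pop = elite + [breed(random.choice(elite), random.choice(elite)) for _ in range(30)]

final_best = max(score(e) for e in pop)
```

note that a toy like this only memorizes the training answers - which is exactly why the held-back generalization test described below matters.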

ok, so after billions and maybe even trillions of individual tests (try that
with a turing test) you've got something that seems to have learned something
from the corpus. but how do we know that it didn't just memorize a part of
the corpus? well, that's why we held back 100 million items from the 1
billion - as a generalization test. if we present the 100 million items to
the entity which we can prove it has NEVER seen and if it performs at the
same .05 level then we can say with scientific certainty that we have
detected minimum statistical consciousness in an artificial entity.
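the held-back test amounts to a provably disjoint train/holdout split plus a significance check on the holdout score alone - a sketch, with the scale shrunk from a billion items to keep it runnable:

```python
import random
from math import erfc, sqrt

random.seed(42)
corpus = list(range(1_000_000))        # item ids standing in for the full MIST corpus
random.shuffle(corpus)
train, holdout = corpus[:900_000], corpus[900_000:]   # 90/10 split, holdout never shown

# the entity must be provably naive to every holdout item
assert not set(train) & set(holdout)

def significant(n_items, n_correct, alpha=0.05):
    """True if the holdout score is distinguishable from a coin at level alpha
    (normal approximation to the two-tailed binomial test)."""
    z = abs(n_correct - n_items / 2) / sqrt(n_items / 4)
    return erfc(z / sqrt(2)) < alpha

print(significant(100_000, 50_400))    # 400 items above background on the holdout set
```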

this would mean that the internal structure of the entity has evolved
important structure in common with humans and learning should now be much
quicker. if we keep training we should get to maximum statistical
consciousness in short order.

hope this helps.

chris

_____________________________________________________________________________________
Message 4 of 9, Nov 23, 2000
On Thu, Nov 23, 2000 at 07:51:05PM +0000, Christopher McKinstry wrote:
[...]
> the problem with the turing test is its conversational style - it is
> difficult to decide if a response is intelligent or not - the judgement is
> subjective, non-scientific and varies from person to person. and yes, it's
[...]

On the other hand it's the only widely accepted way of testing for
intelligence and has strong intuitive appeal. Nor is it as arbitrary
as it appears at first sight - null tests are common in physics (the
Wheatstone bridge for example) and this is simply a null test (a test
for "equality") with the only thing we know to be intelligent -
ourselves.

> ok, so after billions and maybe even trillions of individual tests (try that
> with a turing test) you got something that seems to have learned something

Because it is not possible to train via a Turing test is an advantage,
not a criticism. It helps reduce the chance of making the error that
you make below...

> from the corpus. but how do we know that it didn't just memorize a part of
> the corpus? well, that's why we held back 100 million items from the 1
> billion - as a generalization test. if we present the 100 million items to
> the entity which we can prove it has NEVER seen and if it performs at the
> same .05 level then we can say with scientific certianty that we have

This is no more "intelligent" than a neural net classifying letters is
intelligent. Of course, you can argue that a neural net does show
some limited intelligence, but it is hardly the kind of intelligence
that would pass a Turing test and so is not comparable with human
intelligence.

Two further points:

- first you can say that "human intelligence" is not the target. But
then why not argue that computers are already intelligent, just in a
different way from humans? The answer, of course, is that the only
solid ground on which to build a test is human intelligence, because
that is the only intelligence of which we have intuitive knowledge -
which brings us back to the Turing test.

- second, you can argue that it is simply a matter of degree, and that
your system will be much more complex, allowing a more complex
intelligence to emerge. But there does not appear to be any way within
the system to track objects within conversation, as the original
poster first observed, so it clearly can be distinguished from human
intelligence. So no matter how complex the intelligence that does
emerge, it is clearly limited (in at least one trivial way). You try
to avoid this by abusing the Turing test and trying to choose your own
targets. Hardly objective.

How is this system any more intelligent than a natural language
interface to an encyclopedia? And we can already do that!

Andrew

--
http://www.andrewcooke.free-online.co.uk/index.html
Message 5 of 9, Nov 23, 2000
> this would mean that the internal structure of the entity has evolved
> important structure in common with humans and learning should now be much
> quicker. if we keep training we should get to maximum statistical
> consciousness in short order.

The information would be human, but I don't think that implies
humanlike thought processes. For example, if the mindpixels were
concerned with an entirely different domain, but were still
consistent within that domain (for simplicity, perhaps the negation
of every current mindpixel), then the evolved intelligence would use
the same analytical processes as this proposed one, therefore
must be considered equally conscious, and yet would fail a human-
administered MIST. Which is what you'd expect. It fails a MIST
and apparently doesn't know shit, so why would its reasoning be
human? And so would you call something with identical reasoning
human just because it passed a MIST? Or to rephrase that
question: is the quality of mindly information the sole distinguishing
factor between different conscious species?

Anyway, big danger of confusion here. (Too late for me, I fear.) I'm
certain that statistical consciousness should be recoined
statistical humanity, at the very least.

Bob
Message 6 of 9, Nov 23, 2000
<snip>

>How is this system any more intelligent than a natural language
>interface to an encyclopedia?  And we can already do that!
>

all i am is a natural language interface to the encyclopedia of me... my
intelligence comes from the fact that my encyclopedia is capable of answering
questions to which it has no direct answer. it has, instead, an active model
of self and environment that allows it to generalize based on massive
amounts of common sense.

it's not just the raw facts, but how they are interconnected that makes us
what we are.

now, i've said this before, only about a million times, but i'll say it
again. the process of training a statistical learning system with mindpixels
is hypertomography. it is the high dimensional analog of normal tomography
that gives us CT and MRI images. the math is EXACTLY the same - just in more
dimensions. in the same way that a collection of MRI images corresponds to
your brain, a hypertomograph will correspond to your mind.

most people have a lot of trouble visualizing high dimensional images,
myself included. instead, one would usually use a lower dimensional
projection, such as a self-organizing map, cluster, or tree diagram. if you
want to see what a hypertomograph of the human mind would look like, take a
look at elman's 'finding structure in time' - there you will see a tree
built from mindpixel-like statements in an artificial language (he used an
artificial language because at the time this research was done (1990) there
was no such thing as the mindpixel corpus).

elman's neural net discovered lexical class - concept clusters just from the
statistics of the artificial language.

you can expect the same kind of thing from training a neural network using
mindpixels. a map of the average human mind at the conceptual level. but in
this case, the map is the territory. in the same way i could leave out an
x-ray sample in a CT scan, and reconstruct what it should have been using
all the samples i did take, i can figure out what the response to an unknown
mindpixel should be based on all the mindpixels i do know. and i don't have
to invent any new math to do it. all i need is a VERY large sample of
mindpixels.

the key to understanding this is to see a mindpixel not as a statement of
fact, but as the analog of a tomographic x-ray sample.

when you make an x-ray CT scan, you take millions of samples from different
positions around the object being imaged. you end up with a giant database
that looks something like this:

angle, translation, intensity
-------------------
0,0,.8
.
.
.
10,5,.3
.
.
.
359,20,.9

where the angle and translation are the physical coordinates of the x-ray
emitter head, and the intensity is the value of the x-ray intensity in a ray
passing through the object from those coordinates.

mindpixel, response
--------------
a...aaaa, .5
.
.
.
are you alive?, 1.0
.
.
.
do you believe in god?, .5
.
.
.
zoos are for keeping pizzas warm.,0
.
.
.
z...zzzz,.5

don't see the mindpixel as human information, see it only as a coordinate in
a very high dimensional space (each character is a dimension), and the
response as the analog of an x-ray intensity.
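read literally, that turns each validated item into a (coordinate, intensity) pair; a minimal sketch of such an encoding (the fixed width and zero-padding are my own assumptions, not part of the original scheme):

```python
def to_coordinate(mindpixel, width=64):
    """Map a mindpixel string to a point in a high dimensional space:
    one dimension per character position, zero-padded to a fixed width."""
    codes = [ord(c) for c in mindpixel.lower()[:width]]
    return codes + [0] * (width - len(codes))

# (coordinate, intensity) pairs, analogous to (angle, translation, intensity)
samples = [
    (to_coordinate("are you alive?"), 1.0),
    (to_coordinate("do you believe in god?"), 0.5),
    (to_coordinate("zoos are for keeping pizzas warm."), 0.0),
]
```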

as in the CT scan, the key to the quality of the reconstructed image is the
size and distribution of the sample it was built from.

chris
Message 7 of 9, Nov 24, 2000
Hi Chris,
Thanks for some more grand responses!

first, on the tomography point :

> it's not just the raw facts, but how they are interconnected that makes us
> what we are.

Agreed, but they are dynamically interconnected - even if you did lots of
tomographs over time, it would be very difficult to model the physical brain
based on this data, and recreate the *processes* that caused the patterns in
the data.
I reckon you *will* get useful conceptual maps (and a good natural language
interface, as Andrew mentioned), though I think the mindpixel idea will
probably fall short of intelligence because it is static - you yourself talk
of stable values, but one of the characteristics of intelligence is the
ability to learn - this system will cease to learn once the values are stable!
It would be interesting to destabilize such a system though - if for example
you cut out a fair-sized chunk of its knowledge, I would say you were getting
somewhere if it autonomously recreated that area.

I quite like Andrew's Wheatstone bridge, and some of his arguments about the
TT, but I have to query "... the only thing we know to be intelligent -
ourselves." Do I read this as meaning I can only safely assume that I
personally have intelligence? Or that it can be assumed that observation is
enough to suggest (most!) people have this quality called intelligence? In
either case, how do you deal with non-human biological intelligence - say it
doesn't exist, as we are only defining intelligence as human intelligence? Artificial
intelligence is then also a non-starter. This definition of intelligence is
a tricky area, maybe it's a sign of intelligence that we can argue about the
subject without having much more than individual interpretations of the
term...

...........
Chris :> ok, so after billions and maybe even trillions of individual tests
(try that
> with a turing test) you got something that seems to have learned something

Andrew : Because it is not possible to train via a Turing test is an
advantage, not a criticism.
.........

Isn't that in a way what the mindpixel project is about - educating a
machine using the Turing Test stretched thin?

In any case, why shouldn't the responses be as long as the questions? Ok,
simplifying the response to one bit will undoubtedly make pre-processing
easier, but it discards information which would be just as important to the
model as the string given for the question. Why not go for 'True Colour'
mindpixels, and take advantage of more parallel information acquisition?

Cheers,
Danny.
Message 8 of 9, Nov 26, 2000
At 08:52 PM 2000-11-23 -0400, you wrote:

> >How is this system any more intelligent than a natural language
> >interface to an encyclopedia? And we can already do that!
> >
>
>all i am is a natural language interface to the encyclopedia of me...

Of course you know that statement is simply opinion. I would argue that
there is much more that makes up the "mind" that cannot be qualified in
terms of mindpixels as they are currently defined. For example, emotion,
instinct, and dreams are all part of what makes our mind what it is, yet
none of these things can truly be expressed in language. There is always
that element of emotion that is "more than words could ever say", there's
the reaction beyond explanation (instinct).

>my
>intelligence comes from the fact that my encyclopedia is capable of answering
>questions to which it has no direct answer. it has instead, an active model
>of self and environment that allows it to generalize based on massive
>amounts of common sense.

... and other things. If this world ran on common sense, it would be a
much better place than it is.

>it's not just the raw facts, but how they are interconnected that makes us
>what we are.

No, it's the facts and experiences and how they connect that makes up our
total experience and our knowledge base. It's the other factors that make
us what we are.

>now, i've said this before, only about a million times, but i'll say it
>again. the process of training a statistical learning system with mindpixels
>is hypertomography. it is the high dimensional analog of normal tomography
>that gives us CT and MRI images. the math is EXACTLY the same - just in more
>dimensions. in the same way that a collection of MRI images corresponds to
>your brain, a hypertomograph will correspond to your mind.

I don't agree. The result of the statistical regression (I think I can use
"regression" here... haven't done stats for some time) will correspond to
the part of my mind that is rationally presented through a language
interface. There will be essential parts of me missing from that mapping.

In other words, MIST very accurately tests whether or not a system has a
knowledge level similar to that of a (?many) human(s). But there is
something the Turing Test does that MIST does not: MIST has no method of
testing whether the system can carry a train of thought, respond in an
appropriate manner when confronted with an opinion that differs from its
own (whether that be an emotional outburst or a rational argument), or
develop a complex solution to a problem; all of which I believe to be part
of an intelligence test.

In fact, I would be more likely to accept a being as intelligent if it was
able to demonstrate the three things I've listed above without any sort of
knowledge than the other way around.

A knowledge base without a driving force is simply a knowledge base... and
as Andrew said, we can already make knowledge bases with NL interfaces.

Eric