
A serious logical error in chapter 2.

  • aydogan_ozgur
    Message 1 of 5 , Jun 21, 2008
      In the Performance Measures section of Chapter 2, Mr. Norvig refutes
      the goal of counting the number of times a square is cleaned by
      claiming that, in that case, the cleaner can successively clean and
      DUMP the dirt on the same square, which is clearly worthless.

      However, there is a nontrivial glitch in this proposition. The actions
      of the cleaner are:

      to move right,
      to move left,
      to clean or
      to do nothing.

      In other words, there is no DUMP action, which Mr. Norvig uses to
      refute the proposition. So, I think, he can't just make up another
      action to refute it. I hope I am not exaggerating the case?
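
      In code, the quoted action set can be written down directly (the identifier names here are my own, not the book's):

      ```python
      # The vacuum agent's action set as listed in the chapter (names mine).
      AGENT_ACTIONS = {"MoveRight", "MoveLeft", "Suck", "NoOp"}

      # The point of the objection: the action used in the counter-argument
      # is not a member of this set.
      print("Dump" in AGENT_ACTIONS)   # → False
      ```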

      Thanks for reading.
    • Nima Talebi
      Message 2 of 5 , Jun 21, 2008
        I think you are right that `dump' is not in the agent's action list, as you stated, and in this environment at least it serves no purpose.

        I think (please correct me if I'm wrong) that Dr. Norvig mentions a `dump' action only as a means of driving a point home: if the agent were allowed to be the designer of its own performance measure, it could easily make life easy for itself and, as an *example*, enter a loop of clean-->dump-->clean-->dump... and satisfy itself that it is performing extremely well, while in reality it's doing nothing.

        Here is another example I've come across which may help...

        There was a robotics experiment (reinforcement learning) where a robot had a ball-mouse attached behind it and was `rewarded' for how much the ball rolled *forward*, then allowed to experiment in a room... hopefully learning to move as fast as possible by itself.

        It was left to explore overnight, and in the morning it was found sitting at a spot in the room where there was a lump on the ground... and what the agent was found doing was this...

        The best way to draw a picture is this: imagine a dog rubbing its behind on the carpet as if to scratch it.

        In effect, that is what it had learned to do. It was (incorrectly) being rewarded because the series of motions it had learned and was carrying out was indeed honoring (cheating) the reward system:
         * the ball was rolling forward faster than ever before.

        This is analogous to the vacuum cleaner taking a dump and sucking it back up. I hope I've answered your question? =)
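        The clean-dump loop can be made concrete. Here is a hypothetical sketch (my own, not from the book) of a one-square world where the performance measure naively counts successful Suck actions, and a "Dump" action is imagined, exactly as Norvig does, purely to show how the measure can be gamed:

        ```python
        # One-square vacuum world. The naive measure counts cleanings;
        # the better measure counts time steps the square stays clean.

        def run(policy, steps=10):
            """Return (sucks_counted, steps_spent_clean) for a policy(t) -> action."""
            dirty = True
            sucks = 0          # naive measure: number of successful cleanings
            clean_steps = 0    # what the designer actually wants: a clean floor
            for t in range(steps):
                a = policy(t)
                if a == "Suck" and dirty:
                    dirty = False
                    sucks += 1
                elif a == "Dump":          # the imagined extra action
                    dirty = True
                if not dirty:
                    clean_steps += 1
            return sucks, clean_steps

        cheat = lambda t: "Suck" if t % 2 == 0 else "Dump"   # clean-dump loop
        honest = lambda t: "Suck" if t == 0 else "NoOp"      # clean once, rest
        print(run(cheat), run(honest))
        ```

        Under the naive count the cheater dominates; under the clean-floor count the honest policy dominates.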

        Nima

      • ozgur aydogan
        Message 3 of 5 , Jun 22, 2008
          Hi, thanks for the answer, but frankly, I know what Mr. Norvig wants to point out with that exceptional action of the cleaner, and I see your example as well. But there is a little difference between yours and the one I am mentioning: your robot only carried out actions you had stated at the beginning of your experiment. It moved in some direction [that must be the only action you defined for it?] looking for the maximum of the goal you had described. Thus, it only carried out the actions you wanted it to do [and that scratching-dog-like position was a result of your goal and action definitions].

          If you'll excuse me, I want to illustrate my question:

          Firstly, let's forget about the agent and just focus on its actions, to abstract the example. Here are the actions:

          A, B, C and D; these are mutually exclusive and none of them has side effects [like DUMPing]. How would you get an extra action [say, E] out of these four? You can't, unless you implied it [unlike your robotic example, which only carried out the actions you described initially]. Although this looks like a trivial issue, it lacks information and makes a bad introductory example. Lastly, if I made up an additional entry in a truth table during an exam, my teacher would definitely fail me :S

          Thanks again.





        • daashmashty
          Message 4 of 5 , Jun 23, 2008
            Your argument is that dumping is not in the action list, but that's not the point
            being made here.

            The key that is being conveyed here is this...

            "As a general rule, it is better to design performance measures according to what one
            actually wants in the environment, rather than according to how one thinks the agent
            should behave"

            ----

            The earlier example I gave illustrates the flaw well: rather than measuring "How
            much has the agent travelled?", they were measuring "How much has the ball rolled?"...
            and the robot discovered an honest solution for making the ball roll faster, which was
            not what the designers were looking for.

            Analogously, here we're saying that the agent could find a similar `flaw' (from the
            designer's perspective) in the reward system, and continually dump and suck. This is just
            an observation... we are hypothesizing, as in "Imagine if..."
            ...imagine *if* the robot could *dump*; then you'd see that this suck-dump loop would be
            a possibility, and to keep that from happening, we need to be careful about how the
            reward system is set up.
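
            As a sketch of that last point (my own illustration, not the book's code): a measure that pays for the clean *state* at each time step, with a small cost per action, removes the incentive to loop.

            ```python
            # Hypothetical sketch: reward the state of the world per time step
            # (with a mild cost for acting) instead of rewarding the cleaning
            # action itself, in a one-square world that starts dirty.

            def score(actions, reward_clean=1.0, action_cost=0.1):
                """Score an action sequence under a state-based measure."""
                dirty = True
                total = 0.0
                for a in actions:
                    if a == "Suck" and dirty:
                        dirty = False
                    elif a == "Dump":          # the imagined extra action
                        dirty = True
                    if a != "NoOp":
                        total -= action_cost   # small cost for acting at all
                    if not dirty:
                        total += reward_clean  # paid for the *state*, every step
                return total

            loop = ["Suck", "Dump"] * 5        # suck-dump-suck-dump...
            once = ["Suck"] + ["NoOp"] * 9     # clean once, then rest
            print(score(loop), score(once))
            ```

            Under this measure the suck-dump loop scores strictly worse than cleaning once and stopping, so the `flaw' disappears.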

            I hope that clarifies it.

            Nima

          • ozgur aydogan
            Message 5 of 5 , Jun 24, 2008

              Hi Nima, again. I think you are mixing up the point I am emphasizing and the one Mr. Norvig makes. I definitely agree with Mr. Norvig's performance-measurement criterion:

              "As a general rule, it is better to design performance measures according to what one
              actually wants in the environment, rather than according to how one thinks the agent
              should behave"

              I don't think I said anything against this criterion. However, let's delve into it.

              I want to change the environment; yes, this is my goal.

              Q - And what do I have to change the environment?
              A - I have my agent.

              Q - What does my agent have to change the environment?
              A - It has actions to change the environment.

              Q - What actions does my agent have to change the environment?
              A - It has A, B, C and D.

              Q - Can my agent have other actions to change the environment than the ones I specified above? For example, an action E.
              A - No, it can't.

              Q - So, should I specify my goal according to my actions, so that the environment will change in the way that maximizes my expectation/goal?
              A - Yes, exactly.

              Q - But wait, there might be some side effects stemming from those 4 actions.
              A - In that case, you should have specified them among your actions.

              As can be seen, I didn't claim anything against that performance-measure criterion, so giving it as an example to refute my claim is kind of off topic.

              [[ In the earlier example I gave, the flaw was a good example, as rather than looking at "How
              much has the agent travelled?", they were looking at "How much has the ball rolled"... and ]]

              You should consider your agent's previously specified actions, not their human-point-of-view implications; in either case [say it travelled, or rolled the ball] your agent was *MOVING* [definition: to excite the engine with some current] in some direction to maximize your goal [*MOVE* forward as much as possible]. In this example, as I said before, your agent only carried out the actions you stated beforehand: it *MOVED* and found a lump to maximize its goal. However, if it had carried out another action, such as flying or bouncing [which you hadn't specified before the experiment], that would be analogous to my example.

              This example of Mr. Norvig's is similar to teaching someone to play football with these rules:

              Actions:
              You can take the ball anywhere you want except past the border lines.
              You will use ONLY your feet to move the ball.

              Goal:
              Put the ball into the opposing team's goal as often as possible, do not let them do the same, and do not hurt them.

              Result:
              Players use ONLY their feet - as expected - and you punish them because they don't use their HEADs to score, which you didn't include in the action list. This is ridiculous; the teacher should have explained the rules better. It is his own fault, not the players'.



