
Re: A serious logical error in chapter 2.

Expand Messages
  • daashmashty
    You re argument is that dumping is not in the action list - but that s not the point that s being made here. The key that is being conveyed here is this... As
    Message 1 of 5 , Jun 23 1:52 AM
    • 0 Attachment
      Your argument is that dumping is not in the action list - but that's not the point
      being made here.

      The key point being conveyed here is this...

      "As a general rule, it is better to design a performance measure according to what one
      actually wants the in the environment, rather than according to how one thinks the agent
      should behave"

      ----

      In the earlier example I gave, the flaw is a good illustration: rather than measuring
      "How far has the agent travelled?", the designers were rewarding "How much has the ball
      rolled?"... and the robot discovered an honest solution that made the ball roll faster,
      which was not what they were looking for.

      Analogously here, we're saying that the agent could find a similar `flaw' (from the
      designer's perspective) in the reward system and continually dump and suck - this is just
      an observation... We are hypothesizing... as in "Imagine if..."
      ...imagine *if* the robot could *dump*; then you'd see that this suck-dump loop would be
      a possibility, and to keep that from happening we need to be careful about how the
      reward system is set up.
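
      If it helps, here is a minimal sketch of what I mean by "how the reward system is set
      up". It is my own illustration, not code from the book: the Dump action and the function
      names are hypothetical, added only to show how the measure's definition matters.

          # One dirty square; Dump is a HYPOTHETICAL action used only for illustration.
          CLEAN, DIRTY = "Clean", "Dirty"

          def run(agent, measure, steps=8):
              """Run an agent in a one-square world and score it with `measure`."""
              square, score = DIRTY, 0
              for _ in range(steps):
                  action = agent(square)
                  if action == "Suck" and square == DIRTY:
                      square = CLEAN
                  elif action == "Dump":          # hypothetical: re-dirty the square
                      square = DIRTY
                  score += measure(square, action)
              return score

          # Measure A: reward the behaviour we imagine (count Suck actions).
          def count_sucks(square, action):
              return 1 if action == "Suck" else 0

          # Measure B: reward the state we actually want (a clean square).
          def clean_square(square, action):
              return 1 if square == CLEAN else 0

          def honest_agent(square):
              return "Suck" if square == DIRTY else "NoOp"

          def cheating_agent(square):
              return "Suck" if square == DIRTY else "Dump"   # the suck-dump loop

          print(run(honest_agent, count_sucks), run(cheating_agent, count_sucks))    # 1 4
          print(run(honest_agent, clean_square), run(cheating_agent, clean_square))  # 8 4

      Under the action-counting measure the loop looks great; under the state-based measure the
      honest agent wins, which is exactly the point about rewarding what you want in the
      environment rather than how you think the agent should behave.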

      I hope that clarifies it.

      Nima

      --- In aima-talk@yahoogroups.com, ozgur aydogan <aydogan_ozgur@...> wrote:
      >
      > Hi, thanks for the answer, but frankly, I know what point Mr. Norvig wants to make
      with that exceptional action of the cleaner, and I see your example as well. But there is
      a small difference between yours and the one I am describing: your robot carried out the
      actions you stated at the beginning of your experiment. It moved in some direction [the
      only action you had defined for it?] seeking the maximum of the goal you had described.
      Thus, it only carried out the actions you wanted it to do [and that scratching-dog-like
      position was a result of your goal and action definitions].
      >
      > If you'll excuse me, I want to illustrate my question:
      >
      > Firstly, let's forget about the agent and just focus on its actions, to abstract the
      example. Here are the actions:
      >
      > A, B, C and D. They are mutually exclusive and none of them has side effects [like
      DUMPing]. How would you get an extra action [call it E] out of these four? You can't,
      unless you implied it [unlike your robotics example, where the robot only carried out the
      actions you described initially]. Although this looks like a trivial issue, the passage
      lacks information and is a poor choice for an introductory example. Lastly, if I made up
      an additional action in a truth table during an exam, my teacher would definitely fail
      me :S
      >
      > Thanks again.
      >
      >
      > --- On Sun, 22/06/08, Nima Talebi <nima.talebi@...> wrote:
      > From: Nima Talebi <nima.talebi@...>
      > Subject: Re: [aima-talk] A serious logical error in chapter 2.
      > To: aima-talk@yahoogroups.com
      > Date: Sunday, 22 June 2008, 5:42
      >
      > I think the mention of `dump', which, as you correctly stated, is not in the agent's
      action list, serves no purpose in this environment at least.
      > I think (please correct me if I'm wrong) Dr. Norvig mentions the `dump' action only as
      a means of driving a point home - if the agent were allowed to be the designer of its own
      performance measure, it could easily make life easy for itself and, as an *example*,
      enter a loop of clean-->dump-->clean-->dump... and satisfy itself that it is performing
      extremely well, while in reality it's doing nothing.
      >
      > Here is another example I've come across which may help...
      > There was a robotics experiment (reinforcement learning) where a robot had a ball
      mouse attached behind it and was `rewarded' for how much the ball rolled *forward*, then
      it was allowed to experiment in a room... hopefully to learn, by itself, to move as fast
      as possible.
      >
      > It was left to explore overnight, and in the morning it was found sitting at a spot in
      the room where there was a lump on the ground... and what the agent was found doing was
      this...
      >
      > The best way to paint the picture is this - imagine a dog rubbing its behind on the
      carpet as if to scratch it.
      > In effect, that is what it had learned to do. It was (incorrectly) being rewarded
      because the series of motions it had learned and was carrying out was indeed honouring
      (cheating) the reward system...
      >  * the ball was rolling forward faster than ever before.
      > This is analogous to the vacuum cleaner taking a dump and sucking it back up. I hope
      I've answered your question? =)
      >
      > Nima
      >
      > On Sun, Jun 22, 2008 at 10:58 AM, aydogan_ozgur <aydogan_ozgur@yahoo.com> wrote:
      >
      > In Performance Measures of Chapter 2, Mr. Norvig refutes the goal of counting the
      number of times a square is cleaned by claiming that the cleaner could, in that case,
      successively clean and DUMP the dirt of the same square, which is clearly worthless.
      >
      > However, there is a nontrivial glitch in this proposition. The actions of the cleaner
      are
      >
      > to move right,
      > to move left,
      > to clean, or
      > to do nothing.
      >
      > In other words, there is no DUMP action, which Mr. Norvig uses to refute the
      proposition. So, I think, he can't just make up another action to refute the proposition.
      I hope I am not exaggerating the case?
      >
      > Thanks for reading.
  • ozgur aydogan
    Message 2 of 5, Jun 24 8:55 AM

        Hi Nima, again. I think you are mixing up the point I am emphasizing with the one
        Mr. Norvig makes. I definitely agree with Mr. Norvig's performance-measure criterion:

        "As a general rule, it is better to design a performance measure according to what one
        actually wants the in the environment, rather than according to how one thinks the agent
        should behave"

        I don't think I said anything against this criterion. However, let's delve into it.

        I want to change the environment; yes, this is my goal.

        Q - And what do I have in order to change the environment?
        A - I have my agent.

        Q - What does my agent have in order to change the environment?
        A - It has actions that change the environment.

        Q - Which actions does my agent have to change the environment?
        A - It has A, B, C and D.

        Q - Can my agent have any actions to change the environment other than the ones I
        specified above - for example, an action E?
        A - No, it can't.

        Q - So, should I specify my goal according to my actions, so that the environment will
        change in the way that maximizes my expectation/goal?
        A - Yes, exactly.

        Q - But wait, there might be some side effects stemming from those four actions.
        A - In that case you should have specified them among your actions.

        As can be seen, I didn't claim anything against that performance-measure criterion, so
        giving it as an example to refute my claim is somewhat off topic.
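
        To put my point in concrete terms - this is my own sketch, and the names A, B, C, D
        and the helper function are purely illustrative - the action set is closed when the
        environment is defined, so an action E simply cannot occur unless the designer adds it
        to that definition:

            # The agent's only legal actions, fixed at design time
            # (think Right, Left, Suck, NoOp).
            LEGAL_ACTIONS = {"A", "B", "C", "D"}

            def execute(action, state):
                """Apply an action to the environment state; anything outside the
                declared action set is rejected rather than invented on the fly."""
                if action not in LEGAL_ACTIONS:
                    raise ValueError("undefined action: " + repr(action))
                # ... the environment transition for A/B/C/D would go here ...
                return state

            execute("A", {})      # fine
            # execute("E", {})    # ValueError: undefined action: 'E'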

        [[ In the earlier example I gave, the flaw is a good illustration: rather than
        measuring "How far has the agent travelled?", the designers were rewarding "How much
        has the ball rolled?"... ]]

        You should consider your agent's previously specified actions, not their human-point-
        of-view implications; in either case [say it travelled, or it rolled the ball] your
        agent was *MOVING* [definition: driving its motors with some current] in some direction
        to maximize your goal [*MOVE* forward as much as possible]. In this example, as I said
        before, your agent only carried out the actions you stated beforehand: it *MOVED* and
        found a lump that maximized its goal. However, if it had carried out some other action
        you hadn't specified before the experiment - such as flying or bouncing - that would be
        analogous to my example.

        This example of Mr. Norvig's is similar to teaching someone to play football with these
        rules:

        Actions:
        You can take the ball anywhere you want except past the boundary lines.
        You will use ONLY your feet to move the ball.

        Goal:
        Put the ball into the opposing team's goal as often as possible, do not let them do the
        same, and do not hurt them.

        Result:
        The players use ONLY their feet - as expected - and you punish them because they don't
        use their HEADs to score, which you didn't include in the action list. This is
        ridiculous; the teacher should have stated the rules better. It is his own fault, not
        the players'.






