Learning Real-Time A* pseudo code problem

  • Ivan Villanueva
    Jan 5, 2005
      I'm trying to implement the Learning Real-Time A* (LRTA*) agent on
      page 128 of the international edition.

      Following the iteration in figure 4.22, I think there is a mistake:
      in line (b), when the agent has moved from the first state to the
      second, the updated value of the first state should be 2 and not 3.
      From that state there are two possible moves, left and right, and the
      agent knows the outcome of the move to the right but not the outcome
      of the move to the left. Its updated value should therefore be
      (according to the LRTA*-Cost function) its heuristic, and not its
      heuristic + 1, as printed in the book.
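
      To make the arithmetic concrete, here is a minimal Python sketch of
      the update as I read it. The state names "A" and "B", the dictionaries
      and the function name lrta_cost are my own illustration, not the
      book's code; I assume a step cost of 1 and an estimate of 2 for both
      states.

```python
def lrta_cost(s, s_next, H, h, step_cost=1):
    """Cost estimate of one action from s, as read from the LRTA*-Cost function.

    If the outcome s_next is still unknown (None), fall back to h(s);
    otherwise use the step cost plus the current estimate H[s_next].
    """
    if s_next is None:
        return h[s]
    return step_cost + H[s_next]


# Hypothetical values: both states start with estimate 2 (assumption).
h = {"A": 2, "B": 2}
H = dict(h)

# The agent moved right from A to B, so only that outcome is known.
result = {("A", "right"): "B"}

costs = {
    action: lrta_cost("A", result.get(("A", action)), H, h)
    for action in ("left", "right")
}
updated_value = min(costs.values())
print(costs, updated_value)
```

      Under these assumptions the untried move to the left costs h(A) = 2,
      the known move to the right costs 1 + H[B] = 3, and the minimum is 2.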

      Furthermore, if I'm right, the algorithm won't work because it
      could just loop between the two states with the heuristic value 2.

      I think one possible solution is to change the line:
      a <-- an action b in ACTIONS(ss) ...
      to:
      a <-- a not-yet-tried action, or, if all possible actions have been
      tried before, an action b in ACTIONS(ss) ...
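
      A rough Python sketch of this proposed selection rule, only to show
      what I mean. The names choose_action, result and H are hypothetical,
      and I again assume a step cost of 1:

```python
def choose_action(s, actions, result, H, step_cost=1):
    """Proposed rule: prefer an untried action, else the cheapest known one.

    result maps (state, action) to the observed successor state;
    H maps states to their current cost estimates.
    """
    untried = [a for a in actions if (s, a) not in result]
    if untried:
        return untried[0]
    # All outcomes are known: minimise step cost + estimate of the successor.
    return min(actions, key=lambda a: step_cost + H[result[(s, a)]])


# Hypothetical example: from A only the move to the right has been tried.
H = {"A": 2, "B": 2, "C": 5}
first = choose_action("A", ["left", "right"], {("A", "right"): "B"}, H)

# Once both outcomes are known, the cheaper successor wins.
known = {("A", "left"): "C", ("A", "right"): "B"}
second = choose_action("A", ["left", "right"], known, H)
print(first, second)
```

      With this rule the agent tries the move to the left before it goes
      back to comparing estimates.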

      Am I right?

      Iván Villanueva.