
TD-Update Reward calculation

  • flamesplash <flamesplash@yahoo.com>
    Message 1 of 1 , Jan 6, 2003
      In AIMA c1995

      Figure 20.6 makes use of REWARD; however, I don't see how this table
      is calculated or updated when 20.6 is used in 20.2.

      Additionally, in 20.6, is the U[] (utility) of all terminal states
      simply their utility? The Running-Average calculation doesn't look
      like it will ever actually change if the reward stays the same, and I
      don't really see how the reward can change.
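
The running-average concern can be checked with a minimal Python sketch. It assumes the usual incremental-mean formula and a TD(0)-style step, and it treats the reward as a value that is simply given with each observed state. The names `running_average` and `td_update`, the state labels, and the numbers are hypothetical illustrations, not the book's pseudocode.

```python
def running_average(old_mean, new_value, n):
    """Incremental mean after the n-th sample (n >= 1)."""
    return old_mean + (new_value - old_mean) / n


def td_update(U, i, j, reward_i, alpha):
    """One TD(0)-style step: move U[i] toward reward_i + U[j]."""
    U[i] = U[i] + alpha * (reward_i + U[j] - U[i])
    return U


# Terminal state whose observed reward is always 1.0 (assumed value).
u_terminal = 0.0
history = []
for n in range(1, 4):
    u_terminal = running_average(u_terminal, 1.0, n)
    history.append(u_terminal)

# Non-terminal state "a" followed by terminal state "t";
# the step reward -0.04 and alpha 0.5 are illustrative only.
U = {"a": 0.0, "t": u_terminal}
td_update(U, "a", "t", -0.04, 0.5)
print(history, U)
```

Under these assumptions, the terminal estimate equals the reward after the first visit and does not move afterwards, so a constant reward does give a constant average. The estimates that change over time are those of non-terminal states, which are pulled toward the reward plus the successor's current estimate.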

      Any clarifications would be appreciated.

      -shane