Loading ...
Sorry, an error occurred while loading the content.

13alpha-beta code in chapter 5, p132

Expand Messages
  • Wheeler Ruml
    Oct 4, 2001
      (First off, I'd like to say that I love this textbook and use it all
      the time, for teaching and to get up to research on topics outside my
      specialty. Hurrah for Stuart and Peter!)

      I'm not sure that this counts as a bug, but I think that it's
      confusing that the code in figure 5.8 doesn't actually pass up minimax
      values. In some cases, alpha and beta values are passed back up the
      tree, creating some confusion. Take the following example:

      / \
      / \
      n2 n3
      | / \
      0 / \
      n4 ...

      Node n1 is a max node. We init alpha to -inf and beta to inf. At
      node n2, this game reaches a terminal state with evaluation = 0. This
      value is passed up to n1, where it is used to update alpha (n1 being a
      max node) to 0. A call on n3 results in a call on n4 results in a
      call on n5. This is a terminal state, so its value of -3 is passed
      back to n4. n4 is a max node which had inherited an alpha of 0 and a
      beta of inf. According to the code in the book, the alpha value at n4
      is updated to the max of the current value 0 and the child value -3.
      Alpha remains 0. Furthermore, alpha is passed back on exit from n4.
      At n3 (which had been called with alpha = 0 and beta = inf), we are
      updating beta to the min of its current value inf and its child's
      value, which is 0. So beta becomes 0, which is = to alpha, so we
      prune away any other children on n3 and return 0 to the call from n1.
      Note that both n2 and n3 now have value 0 from the perspective of n1!
      I consider this a bug, since n3 should have a minimax value of -3 or
      less. We pruned correctly, but the wrong value was passed back. The
      strategy taken by the code in the book only works if we break ties at
      the top level by preferring the leftmost action involved in the tie.
      The solution is to be careful to always pass back the minimax value,
      and not alpha or beta. See the revised pseudo-code below:

      Max-value (state, alpha, beta):
      when depth-cutoff (state), return SEF(state)
      value <-- -inf
      for each child of state
      value <-- max(value, Min-value (child, alpha, beta)
      when value > or = beta, return value
      return value

      Min-value (state, alpha, beta):
      when depth-cutoff (state), return SEF(state)
      value <-- inf
      for each child of state
      value <-- min(value, Max-value (child, alpha, beta)
      when value < or = alpha, return value
      return value

      The code in the book seems to be sticking close to the Knuth and Moore
      1975 AIJ paper, computing the tree in K&M's fig 3. I haven't used a
      fine-toothed comb, but I don't see K&M mentioning the tie issue
      either. It only comes up when one programs an actual player, but I
      would like to suggest that the 2nd edition make this a little more
      explicit, if not adopt the revised pseudo-code above.

      (Credit goes to Courtney O'Keefe and Jeff Shneidman who brought this
      case to my attention.)

      Thanks again for a wonderful book!

      Wheeler Ruml
      Instructor for CS 182, Harvard University
      Wheeler Ruml, http://www.eecs.harvard.edu/~ruml/, 617-495-2081 voice
      ruml@..., Maxwell Dworkin Lab 219 617-496-1066 fax
    • Show all 2 messages in this topic