
FCS Lecture Summary - Finalized (Rev. 4)

  • Shlomi Fish
    Message 1 of 3 , Nov 2, 2001
      Enjoy, and flame away!

      Regards,

      Shlomi Fish

      Freecell's Rules:
      -----------------

      * There are 8 stacks of cards, 4 freecells, and 4 foundations. The game
      is played with one deck.

      * At the beginning of play, the cards are dealt to the stacks (4*7+4*6).
      The faces of all the cards are visible at the beginning of play.

      * A freecell can hold one _single_ card.

      * The foundations are built from ace to king and the object of the
      game is to move all the cards to the foundations.

      * A card can be placed on top of a card that is of the opposite colour,
      and is higher in rank by 1, assuming that the latter is at the top of a
      stack.

      * An empty stack may be filled with any card.

      * An atomic move consists of:
      - moving a card from a stack to a freecell.
      - moving a card from a freecell to a parent card on a stack.
      - moving a card from the top of a stack on top of a parent card on a
      different stack
      - moving a card from the top of a stack or from a freecell to the
      foundations.

      Strategies:
      -----------

      * A sequence of more than one card can be moved by moving the top cards
      to the freecells or to unoccupied stacks. The latter is commonly called
      a "supermove".

      * Sometimes, it is useful to move cards to the freecells, so the card
      below them can serve as a basis for a sequence.

      The Freecell Solver 0.2 Architecture:
      -------------------------------------

      * A DFS scan:

      Solve-State(state, prev_states, ret)
          if (state == empty_state)
              Push(ret, state)
              return SOLVED
          for each move possible on state
              new_state <- state
              apply(new_state, move)
              if (new_state in prev_states)
                  ; Do nothing
              else
                  add new_state to prev_states
                  if (Solve-State(new_state, prev_states, ret) == SOLVED)
                      Push(ret, state)
                      return SOLVED
          return UNSOLVED

      Freecell-Solver(init_state)
          prev_states <- {init_state}
          ret <- a new empty stack
          if (Solve-State(init_state, prev_states, ret) == SOLVED)
              print "State was solved"
              while (state <- Pop(ret))
                  print state
          else
              print "I could not solve this board"


      * Several tests to derive the subsequent moves - some examples below:
      I called them "multi-moves" to indicate that they may include one or
      more moves or "supermoves". By coding them, I tried to emulate the
      kinds of moves a human player would try.

      Examples:
      ---------

      * Put top stack cards in the foundations.
      * Put freecell cards in the foundations.
      * Put non-top stack cards in the foundations (by moving the cards
      above them to freecells or vacant stacks).
      * Move stack cards to different stacks.
      * Move sequences of cards onto free stacks.
      * Etc.

      I later improved some of these moves and added an extra one.

      Due to the heuristic nature of these multi-moves, it is possible that
      some boards are solvable, yet cannot be solved by FCS.

      * Once the program finds a solution it outputs the intermediate boards to
      the screen.

      * Each state was represented as a relatively large data structure containing
      other data structures.
      - A card was { short, short}
      - A stack was { int, card[] },
      - A state was { stack[] }.

      * The State collection was implemented as a sorted vector of whole state data
      structures.
      - It had a sort margin, so not every new element would require moving
      many states aside.
      - After several elements were collected, the array and its sort margin
      were qsort()ed.

      The State Collection Representation Improvements:
      -------------------------------------------------

      B-search based merge to add the sort margin instead of qsort:
      -------------------------------------------------------------

      * The sort margin is kept outside the main array.

      * It is added to the main array by using a binary search-based merge.

      - It was preferred over a plain linear merge because the main array
      usually holds many more elements than the margin, so a linear merge
      would spend most of its time on comparisons.

      * The merge was done from end to start, so there was no need to
      allocate extra space.

      Pointers to structs instead of a vector of whole structs:
      ---------------------------------------------------------

      * I converted the vector into a vector of pointers to structs, instead
      of a vector of whole structs.

      * This gave a great speed improvement, because only pointers were
      manipulated, not entire structs, which are expensive to move around
      in memory.

      A balanced binary tree:
      -----------------------

      * I don't really know how to program my own balanced binary tree
      implementation, but I found some on the Net:

      1. libavl - contains implementations of AVL and Red-Black trees.
      2. libredblack - contains an implementation of a Red-Black tree.
      3. glib - contains an implementation of a balanced binary tree.

      There are others, but I decided to stick to those three.

      * By using a balanced binary tree I managed to increase the brute speed
      by 33% (and the net speed by a factor of 2 or so).

      A Hash Table:
      -------------

      Description: What is a Hash Table?
      ----------------------------------

      * A hash table is a vector of linked lists. Each linked list contains
      a set of keys (or key-value pairs).

      * The index of the list in which a key is placed is determined by
      a hash function. Usually, the hash function generates an integer
      in a very large range, and the index is taken as that integer modulo
      the number of chains in the hash table.
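
      A minimal sketch of such a chained hash table, assuming unsigned long
      keys. The hash function here is an arbitrary multiplicative stand-in,
      not one of the functions FCS actually tried:

```c
#include <stdlib.h>
#include <assert.h>

/* A vector of singly linked lists; the chain index is the (large) hash
 * value modulo the number of chains. */
#define NUM_CHAINS 16

typedef struct chain_node {
    unsigned long key;
    struct chain_node *next;
} chain_node;

typedef struct {
    chain_node *chains[NUM_CHAINS];
} hash_table;

/* A stand-in hash function (Knuth-style multiplicative hash). */
static unsigned long hash_func(unsigned long key)
{
    return key * 2654435761UL;
}

/* Returns 1 if the key was already present, 0 if it was inserted. */
static int hash_insert(hash_table *t, unsigned long key)
{
    unsigned long idx = hash_func(key) % NUM_CHAINS;
    chain_node *node;
    for (node = t->chains[idx]; node != NULL; node = node->next)
        if (node->key == key)
            return 1;
    node = malloc(sizeof(*node));
    node->key = key;
    node->next = t->chains[idx];   /* prepend to the chain */
    t->chains[idx] = node;
    return 0;
}
```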

      Rehashing:
      ----------

      Sometimes, when a hash table becomes too crowded, the number of chains
      in it is increased, and all of its existing elements are moved to
      their new places.


      * I first tried using glib's hash implementation with a 4-byte-wide XOR
      as the hash function. This gave very poor results.

      * Then I tried using the MD5 hash function, and got excellent results,
      similar to what I encountered when using a balanced binary tree.

      * I coded my own stripped-down hash implementation in my attempt to figure
      out what went wrong with using a hash. It turned out to be a bit faster
      than glib's hash.

      Hash Optimizations:
      -------------------

      * Storing the hash values along with the keys. That way, the hash
      values can be first compared before comparing the entire keys.

      * Moving elements that were hit to the start of their chains.

      * When re-hashing, using the same malloc'ed elements, only re-linking
      them to their new siblings.
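
      Two of these optimizations can be sketched on a single chain as
      follows. This is a hypothetical reconstruction (the node layout and
      names are illustrative): each node stores the full pre-modulo hash
      value next to its key, so a cheap integer comparison filters out most
      nodes before the expensive full-key memcmp, and a node that is found
      is moved to the front of its chain.

```c
#include <string.h>
#include <stddef.h>
#include <assert.h>

typedef struct hash_node {
    unsigned long hash_value;   /* stored pre-modulo hash value */
    const void *key;
    size_t key_len;
    struct hash_node *next;
} hash_node;

static hash_node *chain_lookup(hash_node **chain_head,
                               unsigned long hash_value,
                               const void *key, size_t key_len)
{
    hash_node *prev = NULL;
    hash_node *node = *chain_head;
    while (node != NULL) {
        /* Compare the stored hash values first; memcmp only on a match. */
        if (node->hash_value == hash_value
            && node->key_len == key_len
            && memcmp(node->key, key, key_len) == 0) {
            if (prev != NULL) {            /* move the hit to the front */
                prev->next = node->next;
                node->next = *chain_head;
                *chain_head = node;
            }
            return node;
        }
        prev = node;
        node = node->next;
    }
    return NULL;
}
```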

      The State Representation:
      -------------------------

      Reducing the Data Type Width:
      -----------------------------

      * At first FCS represented cards as a struct of two shorts where one was
      the rank and the other the suit.

      * Having seen that this consumed quite a lot of memory, I decided to
      try to reduce the size. I made a card a single char whose lower 4 bits
      were the rank and whose next two bits were the suit.

      * Surprisingly, this made the game even faster.

      * I asked about this anomaly on a mailing list of mine and reached the
      conclusion that it happened because there were fewer cache misses
      this way.
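
      The one-byte encoding can be sketched like this. The exact bit layout
      and macro names FCS used may differ; this just demonstrates the
      technique (low 4 bits for the rank, 1 = ace through 13 = king, and the
      next two bits for the suit, 0-3):

```c
#include <assert.h>

typedef unsigned char fcs_card;

/* Pack rank and suit into one byte, and extract them again. */
#define CARD_MAKE(rank, suit)  ((fcs_card)(((suit) << 4) | (rank)))
#define CARD_RANK(card)        ((card) & 0x0F)
#define CARD_SUIT(card)        (((card) >> 4) & 0x03)
```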

      Pointers to stacks instead of a vector of stacks:
      -------------------------------------------------

      * I later encountered two variants of Freecell (dubbed Der Katzenschwanz
      and Die Schlange) whose stacks could be extremely long (roughly as
      long as the number of cards in two decks).

      * This made representing the states as a direct vector of constant-sized
      stacks very wasteful.

      * What I did was store pointers for the stacks in each state, and store
      the stacks in their separate cache.

      * If two states had an identical stack, this stack would be stored in
      memory only once.

      * With this technique, I managed to scale the number of states on a 48 MB
      + 1 GB swap machine to 1,000,000 and more.
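
      The stack-sharing idea can be sketched with a naive intern table. The
      names here are hypothetical and the linear-scan cache is only a
      stand-in; the real solver would use its state-collection machinery to
      look stacks up:

```c
#include <string.h>
#include <stddef.h>
#include <assert.h>

#define MAX_STACK_LEN 104       /* two decks, for the long variants */
#define CACHE_CAP 1024

typedef struct {
    int len;
    unsigned char cards[MAX_STACK_LEN];
} fcs_stack;

static fcs_stack cache[CACHE_CAP];
static int cache_len = 0;

/* Return the canonical cached copy of a stack, adding it if it is new.
 * States then store these pointers, so two states with an identical stack
 * share one copy of it in memory. */
static fcs_stack *intern_stack(const fcs_stack *s)
{
    int i;
    for (i = 0; i < cache_len; i++)
        if (cache[i].len == s->len
            && memcmp(cache[i].cards, s->cards, (size_t)s->len) == 0)
            return &cache[i];
    cache[cache_len] = *s;
    return &cache[cache_len++];
}
```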

      Remembering the Original Stack and Freecell Locations:
      ------------------------------------------------------

      * To avoid the possibility of having two states, which differ only in
      the order of their stacks or freecells, FCS 0.2 sorted the stacks in
      a canonical order, after every multi-move.

      * It displayed them this way, too, making the solution harder to follow.

      * In order to solve this problem, I wanted to keep the original order
      somehow.

      * If I had stuck to the original order and sorted the stacks before
      every comparison, it would have cost dearly, because every insert
      requires more than one comparison.

      * The solution: "Indexes out of Bounds of Main Data Structure"<tm> -
      anti-patent pending technology.

      * The original indexes of the stacks and freecells are kept outside the
      main data structure:

      struct state
      {
          /* ... */
      };

      struct state_with_locations
      {
          struct state s;
          int fc_locs[4];
          int stack_locs[8];
      };

      * A comparison is made only on the first sizeof(struct state) bytes.
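
      A sketch of the comparison, with a stand-in body for struct state:
      because the location arrays live after the state body, comparing only
      the first sizeof(struct state) bytes ignores them entirely.

```c
#include <string.h>
#include <assert.h>

struct state
{
    unsigned char cards[52];    /* stand-in for the real state body */
};

struct state_with_locations
{
    struct state s;
    int fc_locs[4];
    int stack_locs[8];
};

/* Two states compare equal even if their recorded locations differ. */
static int state_compare(const struct state_with_locations *a,
                         const struct state_with_locations *b)
{
    return memcmp(&a->s, &b->s, sizeof(struct state));
}
```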

      Scans:
      ------

      The original Hard-DFS scan:
      ---------------------------

      * The original scan, which I dubbed Hard-DFS, used procedural
      recursion to recurse into deeper states.

      * I wanted it to be preemptible: able to be stopped in the middle
      of a run and resumed from the position it had reached.

      * That required back-tracking out of all the recursion levels.

      A*:
      ---

      * A* is a scan that uses a priority queue to determine which state to
      visit next (instead of the stack of DFS or the queue of BFS).

      * To implement it I had to override the recursion and instead enqueue only
      the states I wanted to check in the priority queue.

      Soft-DFS rev. 1:
      ----------------

      * When I ran the program on Win32, I discovered that it sometimes ran
      out of stack.

      * I later discovered that I could tell the linker to increase the stack
      size of the program. Still, I wanted a solution that would not depend
      on such operating-system idiosyncrasies.

      * I created a version of DFS that ran on its own dedicated stack.

      * This version ran the test function until it reached a new state;
      then the test function exited, and the scan would recurse into the
      new state.

      * If the state of origin was reached again, the test function would run
      from the beginning.

      Soft-DFS rev. 2:
      ----------------

      * The test function placed all the states it could reach in their own
      list in the DFS recursion stack.

      * Each time, one of those states was taken, checked to see whether it
      had been reached before, and if not, recursed into.

      * I first considered not allowing a higher depth to recurse into states
      that were already queued in a lower depth, but discovered it greatly
      increased the time in which any given board was solved.

      * Such a scan has the advantage that it can be stopped and resumed in
      O(1) complexity.

      * Note: programming the main function of the scan proved to be very
      problematic, and there were many fine points that I missed in the
      first few versions. That was despite the fact that I first wrote
      pseudocode for the function.
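
      The Soft-DFS idea can be sketched on a toy state graph. This is a
      hypothetical reconstruction, not FCS code: the recursion is replaced
      by an explicit stack of (state, next-derived-state) records, so the
      scan can be stopped after any step and resumed in O(1) just by keeping
      that stack around.

```c
#include <string.h>
#include <assert.h>

#define MAX_DEPTH 16

typedef struct {
    int state;      /* node we are expanding */
    int next_child; /* index of the next derived state to try */
} dfs_frame;

typedef struct {
    dfs_frame stack[MAX_DEPTH];
    int depth;
    int visited[8];             /* toy graph: at most 8 states */
} dfs_context;

/* Toy state graph: children[s] lists the states derivable from s,
 * terminated by -1.  It stands in for the game's state graph. */
static const int children[8][3] = {
    {1, 2, -1}, {3, -1}, {4, 5, -1}, {-1}, {-1}, {6, -1}, {7, -1}, {-1}
};

static void dfs_start(dfs_context *c, int start)
{
    memset(c, 0, sizeof(*c));
    c->stack[0].state = start;
    c->depth = 1;
    c->visited[start] = 1;
}

/* Expand up to max_steps new states; returns 1 = target found,
 * 0 = search exhausted, -1 = suspended (call again to resume). */
static int dfs_run(dfs_context *c, int target, int max_steps)
{
    int steps = 0;
    while (c->depth > 0) {
        dfs_frame *top = &c->stack[c->depth - 1];
        if (top->state == target)
            return 1;
        int child = children[top->state][top->next_child];
        if (child == -1) {          /* no derived states left: backtrack */
            c->depth--;
            continue;
        }
        top->next_child++;
        if (c->visited[child])      /* already reached before: skip */
            continue;
        c->visited[child] = 1;
        c->stack[c->depth].state = child;
        c->stack[c->depth].next_child = 0;
        c->depth++;
        if (++steps == max_steps)
            return -1;              /* preempted; state kept inside c */
    }
    return 0;
}
```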

      The BFS optimization scan:
      --------------------------

      * I discovered that in some cases DFS produced very long solutions
      (take PySol board No. 980662, for example). Plus, it is quite obvious
      to a human player that the solver is goofing around and that the
      solution is sub-optimal.

      * To avoid this, I coded a Breadth-First Search (BFS) scan that ran
      only on the states in the original solution (a normal BFS scan takes
      way too long to be effective).

      * This dramatically reduced the length of the solution in many such
      cases.

      Move Stacks:
      ------------

      * In the beginning, the user had to deduce what had happened between two
      subsequent states.

      * That was:
      A. Inconvenient.
      B. Even harder for a Freecell implementation to do on its own.

      * I created the concept of move stacks, which each test loaded with the moves
      that it had performed.

      * Those stacks were later collected into one stack, to get the total move
      stack from the initial board to the final solution.
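
      The move-stack idea can be sketched as follows; the field names are
      illustrative, not the real FCS types. Each test records the atomic
      moves it performs in its own small stack, and the per-test stacks are
      later concatenated into one stack covering the whole solution:

```c
#include <string.h>
#include <assert.h>

#define MAX_MOVES 64

typedef struct {
    int src, dest;              /* e.g. stack/freecell indexes */
} fcs_move;

typedef struct {
    fcs_move moves[MAX_MOVES];
    int len;
} move_stack;

/* Record one move performed by a test. */
static void move_stack_push(move_stack *ms, int src, int dest)
{
    ms->moves[ms->len].src = src;
    ms->moves[ms->len].dest = dest;
    ms->len++;
}

/* Append all of src's moves onto dest, preserving their order, to build
 * the total move stack from the initial board to the solution. */
static void move_stack_append(move_stack *dest, const move_stack *src)
{
    int i;
    for (i = 0; i < src->len; i++)
        dest->moves[dest->len++] = src->moves[i];
}
```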


      TODO:
      -----

      * Add some statistics.

      * Fix spelling and grammatical errors.



      ----------------------------------------------------------------------
      Shlomi Fish shlomif@...
      Home Page: http://t2.technion.ac.il/~shlomif/
      Home E-mail: shlomif@...

      If:
      1. A is A
      2. A is not not-A
      does it imply that
      1. B is B
      2. B is not not-B
    • mulix
      Message 2 of 3 , Nov 3, 2001
        On Sat, 3 Nov 2001, Shlomi Fish wrote:

        [ Snipped the rules, strategies and the FCS 0.2 architecture ]

        > I later improved some of these moves and added an extra one.
        >
        > Due to their multi-moves heuristic nature, it is possible that some boards
        > are solvable, yet cannot be solved by FCS.

        this is an interesting point. did you consider that there may be a
        brute force method which might be able to solve all boards? is such a
        brute force method feasible, in terms of time/space complexity?

        [ Snipped the state data structures and the collection improvements ]

        > Description: What is a Hash Table?
        > ----------------------------------
        >
        > * A hash table is a vector of linked lists. Each linked list contains
        > a set of keys (or key-value pairs).

        that's not what a hash table is. that's what an *implementation* of a
        hash table might look like.

        you might be interested in checking out the lecture notes for the
        technion's cs department's 'data structures one' class. they have very
        good lectures on avl, b-trees and hash tables. (note - last year they
        also had quite a few small but annoying mistakes in the slides - i hope
        they are better this year). the slides are here:
        http://www.cs.technion.ac.il/~dang/courseDS/

        > * The index of the list in which the key is placed is determined by
        > a hash function. Usually, the hash function generates an integer
        > in a very large range, and the index is taken as its modulo with the
        > number of elements in the hash table.

        again, that's one specific implementation, not necessarily the best.

        > Rehashing:
        > ----------
        >
        > Sometimes, when a hash becomes too crowded, the number of elements
        > in it is increased and all of its existing elements are placed in
        > their new place.
        >
        >
        > * I first tried using glib's hash implementation with a 4-byte wide XOR as
        > the hash function. This generated very awful results.
        >
        > * Then I tried using the MD5 hash function, and got excellent results,
        > similar to what I encountered when using a balanced binary tree.
        >
        > * I coded my own stripped-down hash implementation in my attempt to figure
        > out what went wrong with using a hash. It turned out to be a bit faster
        > than glib's hash.

        this is not surprising: you replaced a general version with a private
        version tailored to your needs.

        >
        > Hash Optimizations:
        > -------------------
        >
        > * Storing the hash values along with the keys. That way, the hash
        > values can be first compared before comparing the entire keys.

        that's a very nice trick, but why did you need to store the hash value
        explicitly? isn't the hash value the index of the cell in the array? (i
        suspect you use 'hash value' here to refer to the very large number,
        before you did the modulo, correct?)

        > * Moving elements that were hit to the start of their chains.

        why?

        > * When re-hashing, using the same malloc'ed elements, only re-linking
        > them to their new siblings.

        of course.

        you might wish to explain in a few words how you *profiled* your code,
        before optimizing it (i hope you profiled it...)

        [snipped the rest as i didn't have any comments]

        --
        mulix

        http://www.pointer.co.il/~mulix/
        http://syscalltrack.sf.net/
      • Shlomi Fish
        Message 3 of 3 , Nov 3, 2001
          On Sat, 3 Nov 2001, mulix wrote:

          > > I later improved some of these moves and added an extra one.
          > >
          > > Due to their multi-moves heuristic nature, it is possible that some boards
          > > are solvable, yet cannot be solved by FCS.
          >
          > this is an interesting point. did you consider that there may be a brute
          > force method which might be able to solve all boards? is such a brute
          > force method feasible, in terms of time/space complexity?
          >

          Several people wrote atomic moves-based Freecell solvers, and they
          do not work too badly. I decided to use multi-moves in the first
          place because I thought they would yield better results.

          > > The State Collection Representation Improvements:
          > > -------------------------------------------------
          > >
          [ Snipped the B-search, sort-margin and balanced binary tree stuff ]
          > > A Hash Table:
          > > -------------
          > >
          > > Description: What is a Hash Table?
          > > ----------------------------------
          > >
          > > * A hash table is a vector of linked lists. Each linked list contains
          > > a set of keys (or key-value pairs).
          >
          > that's not what a hash table is. that's what an *implementation* of a
          > hash table might look like.
          >

          Well, AFAIK a hash is an implementation of the dictionary abstract
          data type (ADT). Naturally, hashes come in many forms and sizes. I
          realize it is also possible to use open addressing for a hash.
          However, I do not want to cover the entire theory of hashing in the
          lecture.

          > you might be interested in checking out the lecture notes for the
          > technion's cs department's 'data structures one' class. they have very
          > good lectures on avl, b trees and hash tables. (note - last year they
          > also had quite a few small but annoying mistakes in the slides - i hope
          > they are better this year). the slides are here:
          > http://www.cs.technion.ac.il/~dang/courseDS/
          >

          I took the EE equivalent course (or at least half of the two equivalent EE
          courses).

          > > * The index of the list in which the key is placed is determined by
          > > a hash function. Usually, the hash function generates an integer
          > > in a very large range, and the index is taken as its modulo with the
          > > number of elements in the hash table.
          >
          > again, that's one specific implementation, not necessarily the best.
          >

          I'm not saying it is.

          > > Rehashing:
          > > ----------
          > >
          > > Sometimes, when a hash becomes too crowded, the number of elements
          > > in it is increased and all of its existing elements are placed in
          > > their new place.
          > >
          > >
          > > * I first tried using glib's hash implementation with a 4-byte wide XOR as
          > > the hash function. This generated very awful results.
          > >
          > > * Then I tried using the MD5 hash function, and got excellent results,
          > > similar to what I encountered when using a balanced binary tree.
          > >
          > > * I coded my own stripped-down hash implementation in my attempt to figure
          > > out what went wrong with using a hash. It turned out to be a bit faster
          > > than glib's hash.
          >
          > this is not surprising: you replaced a general version with a private
          > version tailored for your needs.
          >

          Perhaps, but it is also possible mine is written better.

          > >
          > > Hash Optimizations:
          > > -------------------
          > >
          > > * Storing the hash values along with the keys. That way, the hash
          > > values can be first compared before comparing the entire keys.
          >
          > that's a very nice trick, but why did you need to store the hash value
          > explicitly? isn't the hash value the index of the cell in the array? (i
          > suspect you use 'hash value' here to refer to the very large number,
          > before you did the modulo, correct?)
          >

          That's right. The hash value is the value before the modulo is taken.

          > > * Moving elements that were hit to the start of their chains.
          >
          > why?
          >

          My assumption is that elements that were hit a lot are likely to be
          hit again often. That's why I want them to be as close to the start
          of the list as possible, so that redundant states are rejected
          sooner.

          > > * When re-hashing, using the same malloc'ed elements, only re-linking
          > > them to their new siblings.
          >
          > of course.
          >
          > you might wish to explain in a few words how you *profiled* your code,
          > before optimizing it (i hope you profiled it...)
          >

          Actually, many times I implemented such optimizations out of necessity, or
          because I thought they would generate good results. Then, after I compared
          them with the previous results, I realized they were better in time or in
          memory.

          Regards,

          Shlomi Fish



