
Re: INDIRECT_STACK_STATES versus COMPACT_STATES?

  • Shlomi Fish
    Message 1 of 6 , Apr 23 1:59 AM
      Hi Mr. Parker!

      You have replied to me privately instead of to the list, despite the
      "Reply-To:" header present there. Since I don't see anything in the E-mail
      that suggests it should be kept between the two of us, I'm cc'ing the list on
      this.

      I hope you don't mind and please reply to the list the next time.

      On Friday 23 April 2004 10:15, you wrote:
      > Thanks for getting back to me so quickly. Let me skip over for the
      > moment replying to your reply to me below. I want to report my
      > adventures with your program and various compilers on Windows.
      >
      > I'm compiling your program with MSVC v6, a later MS C++ compiler from
      > their VS 2003 Toolkit, and with Intel C++ v7.1. I have some observations
      > that might (or might not :-) ) interest you:
      >
      > 1) Instead of redefining max you should either use the compiler's
      > version or create a different macro name entirely that will not collide
      > with a name from the C RTL. Otherwise we see things like this (and
      > warning messages are best avoided IMO):
      > cl /GL /G7 /Ox /c /Fosimpsim.obj /DWIN32 simpsim.c
      > Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 13.10.3052 for 80x86
      > Copyright (C) Microsoft Corporation 1984-2002. All rights reserved.
      >
      > simpsim.c
      > C:\Program Files\Microsoft Visual C++ Toolkit 2003\include\stdlib.h(424)
      >
      > : warning C4005: 'max' : macro redefinition
      >
      > j:\biz\games\freecell\fishmine\bldtk2003opt\move.h(105) : see
      > previous definition of 'max'
      >

      Hmmm... well I defined max for a reason. It probably does not exist in the
      UNIX C Standard Library (the so-called "libc"). What I can do to eliminate
      the warning is to use the following:

      #ifndef max
      #define max(a,b) (((a)>(b))?(a):(b))
      #endif

      In any case, I prefer that this macro would be called "max" because this is
      the common name for a macro of this functionality.
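A minimal sketch of how the guarded definition behaves, assuming `<stdlib.h>` is included first (on MSVC it may already supply `max`; on most UNIX libcs it does not). The helper `longer_of` is hypothetical and exists only to exercise the macro. Note that a macro of this form evaluates one of its arguments twice, so arguments with side effects are best avoided.

```c
#include <assert.h>
#include <stdlib.h>   /* on MSVC this header may already define max */

/* Define max only if no earlier header has provided it. */
#ifndef max
#define max(a,b) (((a)>(b))?(a):(b))
#endif

/* Hypothetical helper, for illustration only: the larger of two lengths. */
static int longer_of(int len_a, int len_b)
{
    assert(len_a >= 0 && len_b >= 0);
    return max(len_a, len_b);
}
```

With the guard in place, whichever definition comes first wins and the C4005 redefinition warning should not be triggered.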

      > 2) In lib.c on line 884 you do a call to the user->iter_handler()
      > function. You use its result to do a return call. But both the function
      > you are in there:
      > freecell_solver_user_iter_handler_wrapper
      > and the function
      > iter_handler
      > are of void return type. So why the return? The compilers all warn on
      > that. Makes sense to me that they warn. Granted you are returning a void
      > to a void type. But why return at all if it is void? Do you sometimes
      > make it an int for error checking in debug builds? You'd have to change
      > that in a few places to do that in that case.
      > The iter_handler is of this type from fcs_user.h:
      > typedef void (*freecell_solver_user_iter_handler_t)
      > (
      > void * user_instance,
      > int iter_num,
      > int depth,
      > void * ptr_state,
      > int parent_iter_num,
      > void * context
      > );
      >

      Hmmm...

      I don't recall why I did that. Perhaps it was forward thinking, in case the
      handlers would return something in the future. Or I may have thought that it
      was easier than calling the function and then doing an explicit return.
      Perhaps it just seemed intuitive to me, and gcc did not warn me about it.

      I'll correct it.
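A sketch of what the corrected wrapper could look like. The C standard does not allow a `return` with an expression in a function returning `void` (C++ does permit returning a void expression, which may be why it felt natural), so the handler is simply called as a statement. The handler typedef is the one quoted above; `demo_user_t`, `demo_iter_handler_wrapper` and `demo_counting_handler` are hypothetical stand-ins, not the actual code in lib.c.

```c
#include <assert.h>
#include <stddef.h>

/* Handler type as quoted above from the header. */
typedef void (*freecell_solver_user_iter_handler_t)(
    void * user_instance, int iter_num, int depth,
    void * ptr_state, int parent_iter_num, void * context);

/* Hypothetical stand-in for the user structure; the real one has more fields. */
typedef struct {
    freecell_solver_user_iter_handler_t iter_handler;
    void * iter_handler_context;
} demo_user_t;

static void demo_iter_handler_wrapper(demo_user_t * user, int iter_num,
    int depth, void * ptr_state, int parent_iter_num)
{
    assert(user != NULL);
    if (user->iter_handler == NULL) {
        return;   /* a bare return is fine in a void function */
    }
    /* Call the handler as a statement instead of `return handler(...)`. */
    user->iter_handler(user, iter_num, depth, ptr_state,
                       parent_iter_num, user->iter_handler_context);
}

/* Hypothetical handler that counts how often it was invoked. */
static void demo_counting_handler(void * user_instance, int iter_num, int depth,
    void * ptr_state, int parent_iter_num, void * context)
{
    (void)user_instance; (void)iter_num; (void)depth;
    (void)ptr_state; (void)parent_iter_num;
    ++*(int *)context;
}
```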

      > 3) I initially couldn't compile all the way on Intel's compiler. It
      > (seemingly incorrectly) treats line 11 in lookup2.c as a non-comment
      > line. The reason it does that appears to be a control character (value
      > 0x00) on that line before the period there. Taking out that character
      > lets the Intel compiler handle it.
      >

      Hmmm... well, this .c module is derived from original code from a different
      source. I recall it had a few weird characters like that, which I removed.
      I guess I missed this one. I'll fix it as well.
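For anyone who wants to check a source file for stray characters of this kind, here is a small sketch that reports the line of the first 0x00 byte in a buffer. The function name `first_nul_line` is hypothetical, and reading the file into the buffer is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the 1-based line number of the first NUL byte in buf,
 * or 0 if the buffer contains none. */
static size_t first_nul_line(const char * buf, size_t len)
{
    size_t line = 1;
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] == '\0') {
            return line;
        }
        if (buf[i] == '\n') {
            line++;
        }
    }
    return 0;
}
```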

      > 4) You can download the MS Visual C++ Toolkit 2003 for free here:
      > http://msdn.microsoft.com/visualc/vctoolkit2003/
      > It is just the command line utilities. Again it is FREE. You need at
      > least Win2000 to run it.
      >

      Well, it's free as in "free beer", but not free as in "free speech". ;-)
      I suppose I can try running it on my WinXP Home laptop, albeit I'm doing my
      best to avoid working there as much as I can. (and more generally, doing my
      best to avoid working on Windows as much as I can). Namely, I cannot run it
      on Linux, which is my platform of choice, whether for development or for any
      other thing.

      Still, I might take a look.

      > 5) Your solver takes about 70% longer to run with the MS Visual C++
      > Toolkit 2003 as compared to MSVC v6. I've tried all sorts of compiler
      > flags and haven't figured out why yet. That web page claims it is
      > optimizing. I've tried
      > CC=cl /GL /G7 /Ox
      > and it was even slightly (a couple of percent) slower than not having
      > all that there. I do not know why this is the case. The later compiler
      > from MS is supposed to generate better code. Maybe
      >

      Maybe what? In any case, I don't see a reason why it should be so much worse.
      It's quite interesting.

      > 6) Intel v7.1 provides a small speed-up as compared to MSVC v6. For
      > instance, using a 2.2 GHz P4 and Windows 2000 the -mi 700000 argument
      > which (among other things) prevents game 9 from being solved causes the
      > time to go from game 1 thru game 9 to happen in 17 seconds on Intel v7.1
      > versus 18 seconds on MSVC v7. The Intel version takes 226 seconds to do
      > the first 1000 versus 249 with MSVC v6.
      >

      OK.

      > 7) Intel v7.1 has an internal compiler error that causes it to fail
      > linking of your solver if /GL global optimization is used. Possibly the
      > newer Intel v8.0 fixes that bug.
      >

      Perhaps. It seems like a compiler bug.

      > I haven't spent all that much time studying this program yet. But my
      > guess is that your heap operations may be causing a performance hit. In
      > trying to figure out why the later MS compiler is slower I tried
      > explicitly specifying to use the non-multithreaded version of the
      > run-time lib (the threaded version has to use mutexes around all heap
      > operations and that can get very expensive) and yet that didn't help.
      > But I still am guessing that your heap operations are costly because I
      > see the size of the program go up and down so much.
      >

      Well, it naturally goes down after it finishes trying to solve a board, as it
      frees all the resources that were used to solve this board. However, the
      memory consumption of Freecell Solver is heavily optimized to be as small as
      possible. Still, solving Freecell may require a lot of memory regardless of
      how good the ratio of your memory consumption is, as some boards can require
      checking quite a lot of derived boards.

      Another possible explanation for why things go awry on Windows is that
      Freecell Solver uses the ANSI C Standard Library routines. These routines are
      handled relatively natively on UNIX, while on Win32 they are emulated. The
      emulation may cause some problems.

      Regards,

      Shlomi Fish


      ---------------------------------------------------------------------
      Shlomi Fish shlomif@...
      Homepage: http://shlomif.il.eu.org/

      Quidquid latine dictum sit, altum viditur.
      [Whatever is said in Latin sounds profound.]
    • Randall Parker
      Message 2 of 6 , Apr 23 8:44 AM
        Shlomi Fish wrote:
        >> 5) Your solver takes about 70% longer to run with the MS Visual C++
        >> Toolkit 2003 as compared to MSVC v6. I've tried all sorts of compiler
        >> flags and haven't figured out why yet. That web page claims it is
        >> optimizing. I've tried
        >> CC=cl /GL /G7 /Ox
        >> and it was even slightly (a couple of percent) slower than not having
        >> all that there. I do not know why this is the case. The later compiler
        >> from MS is supposed to generate better code. Maybe
        >
        > Maybe what? In any case, I don't see a reason why it should be so much
        > worse. It's quite interesting.

        The truncated "Maybe":  I don't know why that happened.

        Here's my speculation: I realize that the solver has to use a lot of memory to check out lots of moves. Well, the problem is not the sheer amount of memory used. Rather for some reason perhaps the newer compiler is slower at free/malloc. It might cache less and turn more back over to the OS and thereby cause more context switches into the OS kernel. Those are costly.

        You might consider implementing a way to do fewer larger malloc calls that then hand out smaller chunks of memory. I don't know the structure of your code and how difficult that would be to do.
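One way to test this speculation would be a tiny benchmark of malloc/free pairs, built once with each compiler and run-time library. This is only a sketch: `demo_time_malloc_free` is a hypothetical name, `clock()` measures CPU time at coarse resolution, and a loop like this will not reproduce the solver's real allocation pattern.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Time `rounds` malloc/free pairs of `size` bytes.
 * Returns CPU seconds, or -1.0 if an allocation fails. */
static double demo_time_malloc_free(long rounds, size_t size)
{
    clock_t start, end;
    long i;

    assert(rounds >= 0 && size > 0);
    start = clock();
    for (i = 0; i < rounds; i++) {
        /* the volatile pointer discourages the compiler from
         * optimizing the malloc/free pair away */
        char * volatile p = malloc(size);
        if (p == NULL) {
            return -1.0;
        }
        p[0] = (char)i;
        free(p);
    }
    end = clock();
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Comparing the result for the same `rounds` and `size` across the two builds would show whether the heap routines themselves differ in cost.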

        If you want to suggest any build or run flags for me to try that might provide insight into the MS Visual C++ Toolkit 2003 performance problem I'd be happy to try them.
      • Shlomi Fish
        Message 3 of 6 , Apr 23 11:37 AM
          On Friday 23 April 2004 18:44, Randall Parker wrote:
          > Shlomi Fish wrote:
          > >>5) Your solver takes about 70% longer to run with the MS Visual C++
          > >>Toolkit 2003 as compared to MSVC v6. I've tried all sorts of compiler
          > >>flags and haven't figured out why yet. That web page claims it is
          > >>optimizing. I've tried
          > >> CC=cl /GL /G7 /Ox
          > >> and it was even slightly (a couple of percent) slower than not having
          > >>all that there. I do not know why this is the case. The later compiler
          > >>from MS is supposed to generate better code. Maybe
          > >
          > >Maybe what? In any case, I don't see a reason why it should be so much
          > > worse. It's quite interesting.
          >
          > The truncated "Maybe": I don't know why that happened.
          >
          > Here's my speculation: I realize that the solver has to use a lot of
          > memory to check out lots of moves. Well, the problem is not the sheer
          > amount of memory used. Rather for some reason perhaps the newer compiler
          > is slower at free/malloc. It might cache less and turn more back over to
          > the OS and thereby cause more context switches into the OS kernel. Those
          > are costly.

          Well, free and malloc are implemented in the Microsoft Standard C Run-Time
          Library, which is a DLL that is common to the new and old compiler. Lower
          level functions reside in lower-level DLLs (like KERNEL32.DLL), which are
          also common. Freecell Solver is dynamically linked to the ANSI C library and
          the rest of the libraries, so there should not be a difference between it
          and the FCS compiled with the other compiler in this regard. But with
          Microsoft's products everything is possible... ;-)

          >
          > You might consider implementing a way to do fewer larger malloc calls
          > that then hand out smaller chunks of memory. I don't know the structure
          > of your code and how difficult that would be to do.
          >

          This is actually being done for every possible resource. You can tweak it
          using the following defines:

          ALLOCED_SIZE in alloc.c - currently set to somewhat below 8K. The reason it
          is not exactly 8K is that, due to the behaviour of malloc/free/realloc on
          UNIX (and possibly on Windows as well), a request whose size is a power of 2
          can physically allocate a block twice as large, because a small memory
          overhead, adjacent to the allocated memory, is used to keep meta-information
          about the block.
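The general idea of doing fewer, larger malloc calls and handing out small pieces can be sketched as follows. This is not the code in alloc.c: the `demo_` names are hypothetical, and the 64-byte margin below 8K is an assumed figure chosen only to leave room for the allocator's bookkeeping.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed value: "somewhat below 8K", leaving room for malloc's bookkeeping. */
#define DEMO_ALLOCED_SIZE (8 * 1024 - 64)

typedef struct {
    char * block;   /* one large malloc'ed block */
    size_t used;    /* bytes already handed out */
} demo_pool_t;

/* Returns 1 on success, 0 if the large block could not be allocated. */
static int demo_pool_init(demo_pool_t * pool)
{
    pool->block = malloc(DEMO_ALLOCED_SIZE);
    pool->used = 0;
    return pool->block != NULL;
}

/* Hand out `size` bytes from the block, or NULL when it does not fit. */
static void * demo_pool_alloc(demo_pool_t * pool, size_t size)
{
    void * ret;

    assert(pool->block != NULL);
    if (size > DEMO_ALLOCED_SIZE) {
        return NULL;
    }
    /* round up to a multiple of the pointer size */
    size = (size + sizeof(void *) - 1) & ~(sizeof(void *) - 1);
    if (size > DEMO_ALLOCED_SIZE - pool->used) {
        return NULL;
    }
    ret = pool->block + pool->used;
    pool->used += size;
    return ret;
}

/* Release everything handed out from this pool in a single free(). */
static void demo_pool_free(demo_pool_t * pool)
{
    free(pool->block);
    pool->block = NULL;
    pool->used = 0;
}
```

The pieces are never freed individually; the whole block is released at once, which matches the pattern of freeing all resources after a board is finished.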

          The second place is:

          hard_thread->state_pack_len (a variable not a macro) in fcs_isa.c:

          Change the line:

          hard_thread->state_pack_len = 0x010000 / sizeof(fcs_state_with_locations_t);

          to assign some other value, and you can change the number of states that
          fit within this memory segment. Having written it, I still don't know why I
          did not say:

          hard_thread->state_pack_len =
          0x010000 / sizeof(fcs_state_with_locations_t) - 1;

          there. (to be certain that there's a place for the overhead).
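To make the arithmetic concrete, here is a sketch using a hypothetical 128-byte state. The real fcs_state_with_locations_t has a different size, so the actual numbers will differ. When the state size divides 0x010000 evenly, the packed states fill the whole 64K segment and leave no slack; subtracting one state leaves a full state's worth of room.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SEGMENT_SIZE 0x010000  /* 64K, the constant from the line above */

/* Hypothetical stand-in: the real state struct has a different layout and size. */
typedef struct { unsigned char data[128]; } demo_state_t;

/* Number of states per segment, optionally reserving room for one state. */
static size_t demo_pack_len(int reserve_one)
{
    size_t len = DEMO_SEGMENT_SIZE / sizeof(demo_state_t);

    assert(len > 1);
    return reserve_one ? len - 1 : len;
}

/* Bytes of the segment left unused by `pack_len` states. */
static size_t demo_slack_bytes(size_t pack_len)
{
    assert(pack_len * sizeof(demo_state_t) <= DEMO_SEGMENT_SIZE);
    return DEMO_SEGMENT_SIZE - pack_len * sizeof(demo_state_t);
}
```

With this assumed size, the original expression gives 512 states and 0 bytes of slack, while the "- 1" variant gives 511 states and 128 bytes of slack.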

          You can tweak them and see if it improves things.

          > If you want to suggest any build or run flags for me to try that might
          > provide insight into the MS Visual C++ Toolkit 2003 performance problem
          > I'd be happy to try them.

          Can't think of any except what I said, sorry.

          Regards,

          Shlomi Fish

        • Shlomi Fish
          Message 4 of 6 , Apr 23 11:39 AM
            Oh and I forgot.

            Can you try to compile other ANSI C and Win32 API programs with it? Maybe the
            problem is not isolated to Freecell Solver.

            Regards,

            Shlomi Fish

            On Friday 23 April 2004 21:37, Shlomi Fish wrote:
            >
            > Well, free and malloc are implemented in the Microsoft Standard C Run-Time
            > Library which is a DLL that is common to the new and old compiler. Lower
            > level functions reside in lower-level DLLs (like KERNEL32.DLL) which are also
            > common. Freecell Solver is dynamically linked to the ANSI C library and the
            > rest of the libraries, so there should not be a difference between it and
            > the FCS compiled with the other compiler in this regard. But with
            > Microsoft's products everything is possible... ;-)
            >
