Re: [XP] Physical System Analogy

  • nevin@smalltalk.org
    Message 1 of 25 , Jul 2, 2001
      --- In extremeprogramming@y..., Ron Jeffries <ronjeffries@a...> wrote:
      > Responding to Arrizza, John (10:17 AM 7/2/2001 -0400):
      > > > -----Original Message-----
      > > > From: Ron Jeffries [mailto:ronjeffries@a...]
      >
      > Tools can definitely be useful. The Refactoring Browser, for example, has a "lint" for Smalltalk.
      >
      > I'm not sure why it would be a run-time tool to detect unclear code, however. Can you clear that up for me?
      >
      >
      > Ronald E Jeffries
      > http://www.XProgramming.com
      > http://www.objectmentor.com

      So, Ron, is that a vote FOR or AGAINST SmallLint? Or, are you saying
      that you don't think SmallLint is a run-time tool for detecting
      unclear code?

      I'm confused.

      Nevin
    • Arrizza, John
      Message 2 of 25 , Jul 2, 2001
        > -----Original Message-----
        > From: Ron Jeffries [mailto:ronjeffries@...]
        > I'm not sure why it would be a run-time tool to detect unclear code,
        > however. Can you clear that up for me?
        :)

        Run-time because the context in which complex parts of the code show
        themselves best occurs at run-time. I may be mistaken but I believe that
        static analysis tools (such as Lint) depend on there being a strong
        correlation between what the source "looks like" and what actually occurs at
        run-time. In most cases there *is* a strong correlation between the two.
        Some tools go so far as to simulate running parts of the code to ensure the
        correlation is there.

        But in many cases, the correlation just isn't there. The example that comes
        to mind is multi-threaded code. It can look trivial statically but can have
        a much more complicated run-time "footprint". (that apparent simplicity, I
        think, leads people to not spend time looking at it, and that explains the
        prevalence of race conditions.)
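        The race-condition point can be made concrete. The sketch below is a deterministic simulation, not real threads, and all names are invented: it shows why `counter += 1` is trivial to read statically yet hides a read-modify-write sequence at run time that an unlucky schedule can interleave.

```python
# `counter += 1` looks atomic in source, but at run time it is a
# read-modify-write sequence. Hand-interleaving two such sequences
# (simulating an unlucky thread schedule) loses an update.

def lost_update():
    counter = 0
    t1_read = counter       # thread 1 reads 0
    t2_read = counter       # thread 2 reads 0, before thread 1 writes
    counter = t1_read + 1   # thread 1 writes 1
    counter = t2_read + 1   # thread 2 overwrites with 1, not 2
    return counter

print(lost_update())  # -> 1, even though two increments "ran"
```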

        In the absence of run-time analysis tools, what can be used to find the
        hotspots in an application? To determine that statically there would need to
        be a theory that correlates source code to bug location and consequently bug
        frequency. The best correlation that I know of for bug frequency is LOC. If
        that were true, refactoring a legacy app should be a simple matter of
        finding the methods/functions with the largest number of lines and
        refactoring those mercilessly.
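        That heuristic is easy to mechanize. A minimal sketch (the sample source and function names are invented) that ranks the functions in a piece of Python source by line count, largest first:

```python
import ast

def functions_by_loc(source):
    """Rank function definitions by their line count, largest first."""
    sizes = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            sizes.append((node.name, node.end_lineno - node.lineno + 1))
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)

sample = (
    "def big(a, b):\n"
    "    x = a + b\n"
    "    y = x * 2\n"
    "    return y\n"
    "\n"
    "def small():\n"
    "    return 1\n"
)
print(functions_by_loc(sample))  # big (4 lines) before small (2 lines)
```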

        I've tried that. My experience is that refactoring in a shotgun pattern
        works and it doesn't work: the app gets "better" but I didn't always get to
        the parts of the code that had the core problems.

        The shotgun approach also works only if the app is small. Otherwise there's
        just too much "space" between the refactored pieces of code to have a
        significant effect on the app overall.

        Just my $.02,
        John
      • Michael Schneider
        Message 3 of 25 , Jul 2, 2001
          John,

          We use several axes of data to determine where refactoring dollars
          are spent. Note: you use a different function per axis depending
          on what your business goals are for the current release.

          (FYI, we use data-gathering tools because of the size of our code
          base: >10Mil LOC. :-{ )

          Here are the axes that we are working with:

          1) Design Goals of Architecture - what do you want your architecture to look like


          2) Actual Design of Architecture - What is the current structure
          (static model)

          3) Binary Dependency Structure - who calls who, how many times, (hot spot
          from a link dependency)

          4) Compile Dependency Structure - To compile file X, how much of the world
          do I have to compile?

          5) Run Time Path Data - We run system tests against the code to find out which
          areas are exercised when user scenarios are executed
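          As a toy illustration of axis 3) (the edge data here is invented for illustration), a call-target histogram surfaces link-level hot spots:

```python
from collections import Counter

call_edges = [                      # (caller, callee) pairs, invented
    ("ui.render", "xyz.solve"),
    ("batch.run", "xyz.solve"),
    ("report.make", "xyz.solve"),
    ("ui.render", "xyz.internal_step"),   # a caller bypassing the API
]

def call_histogram(edges):
    """How many call edges land on each function ('who calls who')."""
    return Counter(callee for _caller, callee in edges)

hist = call_histogram(call_edges)
print(hist.most_common(1))  # -> [('xyz.solve', 3)]
```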


          Example Analysis from 5-Axis Data Model

          You are chartered to refactor subsystem XYZ and develop the module so that
          you can swap out algorithms. This subsystem is ~750,000 LOC; how do you
          start? Answer: gather the data for each axis defined above. Here are the results:

          Data from axis 1) is where you would like to end up (nice component system)

          Data from axis 2) shows that you have no interface classes in subsystem XYZ.

          Data from axis 3) indicates that the API to subsystem XYZ is being ignored,
          and that people are calling directly into implementation functions.

          Data from axis 4) reinforces the no-interface-class finding from axis 2): the
          whole world must be recompiled when a concrete implementation is
          changed.


          Where do you start ?

          Step 1: Set Priorities:
          - You don't want to bring down the system while you are working
          - You have a rich set of test cases for the existing algorithms, so
          this guides you to refactoring the form first, then function
          - Since you are doing form first, find 3-5 algorithms that will be swapped
          in and out of the new pluggable XYZ subsystem
          - Spike the framework for these Algs to get a handle on what the API will
          need to be.
          - Use the data from axis 3) to find out who the customers are of subsystem
          XYZ, and determine what functions are used, and create a histogram of the
          number of times each function is called
          - Use the data from axis 5) to determine which functions are called by user scenarios, and create a histogram to find the hot
          functions
          - Revisit your API with this data to make sure that the "Hot" functions are
          cheap to call
          - Define your API, Implement interface classes, realize interface functionality
          with existing functionality
          - modify calling routines to use your api
          - run test suite
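          The interface-extraction steps above can be sketched as follows (all names here are hypothetical, with Python standing in for whatever language the subsystem uses):

```python
from abc import ABC, abstractmethod

class Solver(ABC):                # the new subsystem XYZ API
    @abstractmethod
    def solve(self, data): ...

class LegacySolver(Solver):       # realize the interface with the
    def solve(self, data):        # existing functionality
        return sum(data)          # stand-in for the old algorithm

class AlternateSolver(Solver):    # one of the 3-5 swappable algorithms
    def solve(self, data):
        return max(data)

def run_scenario(solver, data):   # callers depend only on the API
    return solver.solve(data)

print(run_scenario(LegacySolver(), [3, 1, 2]))     # -> 6
print(run_scenario(AlternateSolver(), [3, 1, 2]))  # -> 3
```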


          Congratulations!!!!!!!!!!!!! You now have a subsystem that is independent of
          the rest of the system except through your interface API layer. You will need
          to enter new rules for axis 1) to define allowed dependencies for your system.
          Anyone that bypasses the API will now be flagged as a design violation
          in axis 2).

          Now refactor away and have fun

          ----------------------------------------------------
          How to pick areas to focus on?

          Selecting what to refactor takes two approaches for us:

          - How much are we spending today in each subsystem (defect tracking
          against code)
          - What areas of the architecture will be impacted the most by the projects
          for the next release.

          That data allows a business decision to be made on which systems get
          refactoring effort.




          Have fun, and hope this helps,
          Mike



          "Arrizza, John" wrote:

          > > -----Original Message-----
          > > From: Ron Jeffries [mailto:ronjeffries@...]
          > > I'm not sure why it would be a run-time tool to detect unclear code,
          > > however. Can you clear that up for me?
          > :)
          >
          > Run-time because the context in which complex parts of the code show
          > themselves best occurs at run-time. I may be mistaken but I believe that
          > static analysis tools (such as Lint) depend on there being a strong
          > correlation between what the source "looks like" and what actually occurs at
          > run-time. In most cases there *is* a strong correlation between the two.
          > Some tools go so far as to simulate running parts of the code to ensure the
          > correlation is there.
          >
          > But in many cases, the correlation just isn't there. The example that comes
          > to mind is multi-threaded code. It can look trivial statically but can have
          > a much more complicated run-time "footprint". (that apparent simplicity, I
          > think, leads people to not spend time looking at it, and that explains the
          > prevalence of race conditions.)
          >
          > In the absence of run-time analysis tools, what can be used to find the
          > hotspots in an application? To determine that statically there would need to
          > be a theory that correlates source code to bug location and consequently bug
          > frequency. The best correlation that I know of for bug frequency is LOC. If
          > that were true, refactoring a legacy app should be a simple matter of
          > finding the methods/functions with the largest number of lines and
          > refactoring those mercilessly.
          >
          > I've tried that. My experience is that refactoring in a shotgun pattern
          > works and it doesn't work: the app gets "better" but I didn't always get to
          > the parts of the code that had the core problems.
          >
          > The shotgun approach also works only if the app is small. Otherwise there's
          > just too much "space" between the refactored pieces of code to have a
          > significant effect on the app overall.
          >
          > Just my $.02,
          > John
          >
          > To Post a message, send it to: extremeprogramming@...
          >
          > To Unsubscribe, send a blank message to: extremeprogramming-unsubscribe@...
          >
          > Don't miss XP UNIVERSE, the first US conference on XP and Agile Methods. see www.xpuniverse.com for details and registration.
          >
          > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
        • Ron Jeffries
          Message 4 of 25 , Jul 2, 2001
            Responding to nevin@... (02:50 PM 7/2/2001 +0000):
            >So, Ron, is that a vote FOR or AGAINST SmallLint? Or, are you saying
            >that you don't think SmallLint is a run-time tool for detecting
            >unclear code?
            >
            >I'm confused.

            I like SmallLint. And it is not a run-time tool because it lints your
            program while your program is not running. A profiler would be a run-time
            tool, as it analyzes the performance of my program as it runs. Lint (the
            original) runs on the source. SmallLint runs by reflection, but my program
            is not executing when SmallLint analyzes it.

            Does that make sense?
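            The distinction can be shown in miniature with a toy lint rule (not SmallLint itself): the check inspects the program's syntax tree and never executes the code under analysis.

```python
import ast

SOURCE = "def f():\n    l = 1\n    return l\n"

def lint_short_names(source):
    """Static check: flag one-character names. The code never runs."""
    tree = ast.parse(source)
    return sorted({node.id for node in ast.walk(tree)
                   if isinstance(node, ast.Name) and len(node.id) == 1})

print(lint_short_names(SOURCE))  # -> ['l'], found without calling f()
```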



            Ronald E Jeffries
            http://www.XProgramming.com
            http://www.objectmentor.com
          • Ron Jeffries
            Message 5 of 25 , Jul 2, 2001
              Responding to Michael Schneider (02:22 PM 7/2/2001 -0400):

              >We use several axis of data to determine where refactoring dollars
              >are spent. Note: you use different function/per axis depending
              >on what your business goals are for the current release.

              It sure seems in your report that you must be doing refactoring in very big
              chunks: you actually _schedule_ it.

              What might happen if you refactored all the time instead?

              Ronald E Jeffries
              http://www.XProgramming.com
              http://www.objectmentor.com
            • nevin@smalltalk.org
              Message 6 of 25 , Jul 2, 2001
                --- In extremeprogramming@y..., Ron Jeffries <ronjeffries@a...> wrote:
                > Responding to nevin@s... (02:50 PM 7/2/2001 +0000):
                > >So, Ron, is that a vote FOR or AGAINST SmallLint? Or, are you saying
                > >that you don't think SmallLint is a run-time tool for detecting
                > >unclear code?
                > >
                > >I'm confused.
                >
                > I like SmallLint. And it is not a run-time tool because it lints your
                > program while your program is not running. A profiler would be a run-time
                > tool, as it analyzes the performance of my program as it runs. Lint (the
                > original) runs on the source. SmallLint runs by reflection, but my program
                > is not executing when SmallLint analyzes it.
                >
                > Does that make sense?

                Ah, yes. You are right of course. The image approach used by
                Smalltalk often makes me forget this distinction. But you are right--
                the new code introduced into the image is not exercised by SmallLint.

                Nevin
              • Michael Schneider
                Message 7 of 25 , Jul 2, 2001
                  Ron Jeffries wrote:

                  > Responding to Michael Schneider (02:22 PM 7/2/2001 -0400):
                  >
                  > >We use several axis of data to determine where refactoring dollars
                  > >are spent. Note: you use different function/per axis depending
                  > >on what your business goals are for the current release.
                  >
                  > It sure seems in your report that you must be doing refactoring in very big
                  > chunks: you actually _schedule_ it.
                  >
                  > What might happen if you refactored all the time instead?

                  Ron,

                  Very good question, I will do my best to answer it.

                  Our code base has evolved over the last 20 years; it all tended to be
                  developed with the "best practice" of the day. Some areas are very
                  cheap to maintain, so they don't appear on the cost radar for architecture.
                  These portions of the code are mature feature-wise, so you probably won't
                  go into them for new projects.

                  Other areas have a maintenance cost, but no new functionality is
                  going there, so those are candidates for a scheduled refactor.
                  The maintenance resource model is high enough to put together a hit squad
                  for a couple of weeks to do some refactoring. Again, the reason for the
                  scheduling is that the functionality is mature; it is the maintenance cost
                  that drives it, not new functionality.

                  S.W.A.G Warning!!!!!!!!!!!!!!!!!!

                  If the data indicates that the team would spend more than 40-60% of their
                  time refactoring the old system, rather than adding new functionality,
                  then it may be a scheduled refactor (really swagging here :-{ )

                  Other refactoring goes with new projects. This is not scheduled as a separate task;
                  developers are in that code, so they can clean up things that get in their way.

                  In project mode, refactoring is always a possibility at any time. The 5 axes
                  give people a set of data points to help do estimation at the beginning of a project.

                  The axis data is also useful to get metrics to justify 10 people refactoring
                  for a major portion of the project.

                  Again, a VERY rough metric is that we try to keep legacy-system refactoring
                  below 40-60% for a scheduled project; it just whacks the velocity too much to be able to predict outcome for the new features.

                  Not much of a science on that call, just a tradeoff between a fixed release date and
                  fixed resources.

                  One very important point: it is much harder to justify a refactoring project than
                  a new-functionality project, so it is very important to refactor as part of your
                  project. I would not hope to be able to schedule a refactor for the project
                  that I am working on this release; it is imperative to do it right the first time.


                  I wish that I could give you more than S.W.A.G.s, but that is about all that I have
                  right now,
                  Mike


                  >
                  >
                  > Ronald E Jeffries
                  > http://www.XProgramming.com
                  > http://www.objectmentor.com
                • Arrizza, John
                  Message 8 of 25 , Jul 2, 2001
                    > -----Original Message-----
                    > From: Michael Schneider [mailto:michael.schneider@...]
                    > Here are the Axis that we are working with:
                    >
                    > 1) Design Goals of Architecture - what do you want your
                    > architecture to look like
                    >
                    > 2) Actual Design of Architecture - What is the current structure (static
                    model)
                    >
                    > 3) Binary Dependency Structure - who calls who, how many
                    > times, (hot spot from a link dependency)
                    >
                    > 4) Compile Dependency Structure - To compile file X, how much
                    > of the world to I have to compile
                    >
                    > 5) Run Time Path Data - We run system tests against the code
                    > to find out which areas are exercised when user scenarios are executed
                    Using a coverage tool I assume.

                    How did you come up with the axes?
                  • Arrizza, John
                    Message 9 of 25 , Jul 2, 2001
                      > -----Original Message-----
                      > From: Hugo Garcia [mailto:xpjava@...]
                      > Still a casual description but by measuring the
                      > probrabilities of events you avoid the mechanistic
                      > view of things.

                      Is there a correlation between probabilities of events and the locality of
                      defects or of complex code?
                    • Michael Schneider
                      Message 10 of 25 , Jul 2, 2001
                        "Arrizza, John" wrote:

                        > > -----Original Message-----
                        > > From: Michael Schneider [mailto:michael.schneider@...]
                        > > Here are the Axis that we are working with:
                        > >
                        > > 1) Design Goals of Architecture - what do you want your
                        > > architecture to look like
                        > >
                        > > 2) Actual Design of Architecture - What is the current structure (static
                        > model)
                        > >
                        > > 3) Binary Dependency Structure - who calls who, how many
                        > > times, (hot spot from a link dependency)
                        > >
                        > > 4) Compile Dependency Structure - To compile file X, how much
                        > > of the world to I have to compile
                        > >
                        > > 5) Run Time Path Data - We run system tests against the code
                        > > to find out which areas are exercised when user scenarios are executed
                        > Using a coverage tool I assume.
                        >
                        > How did you come up with the axes?

                        Axis 1) is the architecture design. It is basically a UML diagram with tools to
                        enforce package dependency. Robert Martin from Object Mentor helped us
                        with the techniques for Axis 1).

                        Axis 2) This axis came from the impact of our legacy system on the new design.
                        We would come up with great designs for Axis 1, but when you came to realize
                        them in the context of the system, you were challenged. Axis 2 was a shortcut
                        to having everybody that knew how the old system "really worked"
                        in one room. By automating this and visualizing it, the architects could
                        review the "as-is" and say, Whoah there!!! That is not quite right. It is the
                        big-picture tool of the architecture.

                        Axis 3) This was gleaned from the exe's and the library archives; it was the
                        call tree for each exe.

                        Axis 4) came from #include info. Our tools group has been gathering this info for
                        ~8 years; we just had to mine what they already had. Source Navigator is
                        a nice free tool to get this kind of information.

                        Axis 5) This came from our system test data; this is data contributed by customers,
                        and defects over time. This also has beta test ...... This is to try to get as close to a
                        "customer-oriented" view as possible.
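                        A toy take on axis 4) (the include graph is invented for illustration): given #include edges, the set of files that must rebuild when a header changes falls out of the reverse-include graph.

```python
from collections import defaultdict

includes = {            # file -> headers it includes (invented)
    "a.cpp": {"x.h"},
    "b.cpp": {"y.h"},
    "x.h":   {"y.h"},
}

def rebuilds_when(header):
    """All files that must recompile when `header` changes."""
    rev = defaultdict(set)          # header -> files that include it
    for f, headers in includes.items():
        for h in headers:
            rev[h].add(f)
    dirty, stack = set(), [header]
    while stack:                    # walk the reverse-include graph
        for f in rev[stack.pop()]:
            if f not in dirty:
                dirty.add(f)
                stack.append(f)
    return dirty

print(sorted(rebuilds_when("y.h")))  # -> ['a.cpp', 'b.cpp', 'x.h']
```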

                        The data from the 5 axes are relatively easy to get; the hard part was visualization
                        and axis relationships. Graphviz and Java 3D can help with this.

                        This approach seems to hold up; it is the relationships between the axes that
                        we are working on most now.

                        It is just 5 different meta-models of the architecture. No one is perfect, but
                        together they can give you good info on the state of your system.

                        Don't rely on it too heavily, though; it just gives you the approximate state
                        of the system. Numbers can lull you into a false sense of security,
                        or a false panic.

                        Hope this helps,
                        Mike


                      • Arrizza, John
                        Message 11 of 25 , Jul 2, 2001
                          > -----Original Message-----
                          > From: Michael Schneider [mailto:michael.schneider@...]
                          > > How did you come up with the axes?
                          >
                          > Axis 1) Is the architecture design, It is basically a UML
                          <snip>

                          Actually, I meant why those axes and not some other set?
                          John
                        • Ron Jeffries
                          Message 12 of 25 , Jul 2, 2001
                            Responding to Michael Schneider (03:37 PM 7/2/2001 -0400):

                            >Very good question, I will do my best to answer it.
                            >
                            >Our code base has evolved over the last 20 years ...

                            The word "legacy" would have answered my question simply but without deep
                            understanding of what you do. Your detailed remarks are very helpful. Thanks!



                            Ronald E Jeffries
                            http://www.XProgramming.com
                            http://www.objectmentor.com
                          • Michael Schneider
                            Message 13 of 25 , Jul 2, 2001
                              John,

                              Those were developed over time to meet business/technical needs.
                              We looked at the problems that we were facing, looked at the work
                              others were doing, then tried several things over time.

                              We picked axes that had some overlap but gave a look at the architecture
                              from a different view. The overlap allowed relationships to be established;
                              the differences allowed views that were not possible without the
                              axes.

                              You may be able to get similar info from different axes; this set seems
                              to work well for us so far. Next year it may be 7 axes, or we may
                              figure out how to merge 2 into 1 and then it would be 4 axes.

                              With the relationships that we have now, it takes 5.

                              Hope this helps,
                              Mike



                              "Arrizza, John" wrote:

                              > > -----Original Message-----
                              > > From: Michael Schneider [mailto:michael.schneider@...]
                              > > > How did you come up with the axes?
                              > >
                              > > Axis 1) Is the architecture design, It is basically a UML
                              > <snip>
                              >
                              > Actually, I meant why those axes and not some other set?
                              > John
                            • Hugo Garcia
                              Message 14 of 25 , Jul 2, 2001
                                Hmmmmm......

                                Good question. I will have to ponder for a while.

                                -H



                                --- "Arrizza, John" <john.arrizza@...> wrote:
                                > > -----Original Message-----
                                > > From: Hugo Garcia [mailto:xpjava@...]
                                > > Still a casual description but by measuring the
                                > > probrabilities of events you avoid the mechanistic
                                > > view of things.
                                >
                                > Is there a correlation between probabilities of
                                > events and the locality of
                                > defects or of complex code?


                              • Arrizza, John
                                Message 15 of 25 , Jul 3, 2001
                                  > -----Original Message-----
                                  > From: Michael Schneider [mailto:michael.schneider@...]
                                  > Those were developed over time to meet business/technical needs.

                                  I couldn't help but notice the similarity of the axes you chose to Lakos's
                                  recommendations in Large Scale C++ Development.
                                • Arrizza, John
                                  Message 16 of 25 , Jul 6, 2001
                                    > -----Original Message-----
                                    > From: Hugo Garcia [mailto:xpjava@...]
                                    > Good question. I will have to ponder for a while.
                                    > --- "Arrizza, John" <john.arrizza@...> wrote:
                                    > > > -----Original Message-----
                                    > > > From: Hugo Garcia [mailto:xpjava@...]
                                    > > > Still a casual description but by measuring the
                                    > > > probrabilities of events you avoid the mechanistic
                                    > > > view of things.
                                    > >
                                    > > Is there a correlation between probabilities of
                                    > > events and the locality of defects or of complex code?

                                    Just a thought: if the probability of an event is high, then the likelihood
                                    of a defect is low. In other words, if an event occurs a lot, then defects,
                                    if any, will manifest themselves more often and therefore tend to get
                                    fixed. (This also seems to jibe with my experience.)

                                    In a sense, this is one of the reasons behind Unit Testing. It causes events
                                    to occur with (almost) even probability.

                                    back to legacy systems:
                                    At first blush, this implies that the first Unit Tests should be written for
                                    the least-used code. But perhaps there is low-use code and then there is
                                    low-use code. An example of the latter is dead code. No point in unit
                                    testing that. Ditto, but less so, for code that implements
                                    low-priority/low-use features.

                                    So what's left? Code that implements high-use features but is invoked "once
                                    in a while".

                                    And that implies this procedure:
                                    1) Identify and remove all dead code by running a line coverage tool while
                                    running the system over all features.
                                    2) Identify code implementing high-use features by running a line coverage
                                    tool while those features are run.
                                    3) Write Unit Tests for the remaining code in all of the classes identified
                                    in step 2.
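                                    Those steps can be sketched with line coverage modeled as sets of covered line numbers (the numbers are invented):

```python
all_lines    = set(range(1, 11))          # every line in the class
all_features = {1, 2, 3, 4, 5, 6, 7, 8}   # covered running all features
high_use     = {1, 2, 3, 4}               # covered by high-use features

dead_code  = all_lines - all_features     # step 1: never executed -> delete
rarely_run = all_features - high_use      # the high-use classes' "once in
                                          # a while" code -> unit test first
print(sorted(dead_code))   # -> [9, 10]
print(sorted(rarely_run))  # -> [5, 6, 7, 8]
```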

                                    let me know what you think...
                                    John