Re: [XP] is XP adapted for Business Intelligence applications

  • Adam Sroka
    Message 1 of 18 , Jan 2 1:25 PM
      On Sat, Jan 2, 2010 at 4:29 AM, loicmidy <loicmidy@...> wrote:
      >
      > For the DSS applications I don't see how to do XP.
      > For example, we are developing an application with a lot of statistical calculations and a lot of data aggregation in batch mode. The data is in an Oracle database with a star schema.
      > We are going to write the batch using only SQL scripts. Those scripts will be wrapped together with Java or PL/SQL, so the programming is NOT OOP.
      > In this context I don't see how to do XP.
      > We also have other applications with data in OLAP.
      >

      Do you really mean: "How do we do TDD/Refactoring for this kind of
      app?" That is only one small (but important) piece of XP.

      That said, it is certainly possible to do TDD for procedural code
      including PL/SQL. I have done it. It's not my favorite thing that I
      have done, but it is quite doable.

      Here is a list of unit test frameworks for PL/SQL:
      http://en.wikipedia.org/wiki/List_of_unit_testing_frameworks#PL.2FSQL

      When I did it I did not use any of those frameworks. Frameworks aren't
      actually a requirement for TDD. You can use an assert keyword if the
      language has that, or you can simply use comparison operators and
      exceptions or exit codes. (I recently did some code katas in Google's
      Go, which currently has neither an xUnit framework nor an assert
      keyword. It worked just fine with full-blown TDD.)
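
To illustrate the framework-free style described above, here is a minimal sketch in Python (chosen only for illustration; the thread itself concerns PL/SQL). It uses nothing but comparison operators, an exception, and an integer result that a script could hand to `sys.exit`. The names `weighted_mean`, `check`, and `run_checks` are hypothetical and do not come from the thread.

```python
# Framework-free checks: plain comparisons plus exceptions, no xUnit library.
# All names here (weighted_mean, check, run_checks) are hypothetical.


def weighted_mean(values, weights):
    """Code under test: a small statistical calculation."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


def check(description, condition):
    """Raise an exception when a comparison fails."""
    if not condition:
        raise AssertionError(description)


def run_checks():
    """Run every check; return 0 on success and 1 on failure (an exit code)."""
    try:
        check("equal weights give the plain mean",
              weighted_mean([2, 4], [1, 1]) == 3)
        check("a heavier weight pulls the mean toward its value",
              weighted_mean([0, 10], [1, 3]) == 7.5)
        try:
            weighted_mean([1], [0])
        except ValueError:
            pass  # expected: a zero total weight is rejected
        else:
            check("zero total weight must be rejected", False)
    except AssertionError as failure:
        print("FAILED:", failure)
        return 1
    print("all checks passed")
    return 0
```

A build script could call `sys.exit(run_checks())` so that a failing check breaks the build without any test framework being involved.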

      For Refactoring I suggest you look at Scott Ambler's /Refactoring
      Databases/ book http://my.safaribooksonline.com/0321293533

      > Have you applied XP in DSS projects with ETL and OLAP?

      Not per se, but I've done some custom business intelligence stuff
      creating a data warehouse and mining it for specific reporting data.

      > Have you applied XP in DSS projects with heavy batch calculations (with a lot of data aggregation) and with a star schema?
      >

      I'm not sure why the star schema would matter, but yes, I've worked on
      a project where a very large data warehouse was mined with complex
      algorithms in a batch process. I've also done a similar project that
      used asynchronous messaging rather than batch.

      The other aspect that might be complex is doing continuous builds. My
      suggestion would be to not unit test against the real data set, but
      use a much smaller known subset of data with an identical schema and
      procedures. You can even do acceptance testing against the smaller
      data set, though if you want to do performance/load testing you should
      do it against a copy of production.
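
As a rough sketch of the "small known subset with an identical schema" idea, the Python example below builds a tiny star schema in an in-memory SQLite database and checks an aggregation query against hand-computed totals. SQLite stands in for Oracle purely so the example is runnable, and the table names, column names, and figures are all invented for illustration.

```python
import sqlite3

# Hypothetical star schema: one dimension table and one fact table.
SCHEMA = """
CREATE TABLE dim_region (region_id INTEGER PRIMARY KEY, region_name TEXT);
CREATE TABLE fact_population (region_id INTEGER, year INTEGER, persons INTEGER);
"""

# The aggregation under test, kept as plain SQL as it would be in a batch script.
AGGREGATE_SQL = """
SELECT r.region_name, SUM(f.persons)
FROM fact_population f
JOIN dim_region r ON r.region_id = f.region_id
WHERE f.year = ?
GROUP BY r.region_name
ORDER BY r.region_name
"""


def make_fixture_db():
    """Create the schema and load a handful of rows whose totals are known."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO dim_region VALUES (?, ?)",
                     [(1, "North"), (2, "South")])
    conn.executemany("INSERT INTO fact_population VALUES (?, ?, ?)",
                     [(1, 2009, 100), (1, 2009, 50), (2, 2009, 70), (2, 2008, 999)])
    return conn


def population_by_region(conn, year):
    """Run the aggregation for one year and return (region, total) rows."""
    return conn.execute(AGGREGATE_SQL, (year,)).fetchall()


def test_population_by_region():
    conn = make_fixture_db()
    # Expected totals were computed by hand from the four fixture rows.
    assert population_by_region(conn, 2009) == [("North", 150), ("South", 70)]
```

Because the fixture is only a few rows, the expected totals can be verified by eye, and the test runs fast enough for a continuous build.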

      Unfortunately, testing PL/SQL does not lend itself to the
      mock/isolation style that some of us prefer for unit testing. That
      shouldn't be too big an obstacle, though. Just keep your procedures
      small and composed to a single level of detail, calling out to other
      procedures for progressively finer levels of detail - just like you
      would do with methods in OO.

      Hope that helps.
    • loicmidy
      Message 2 of 18 , Jan 3 4:46 AM
        Why is your organization moving to Scrum/XP? What does it hope to gain?
        What problems does it want to solve?

        response:
        Our current methodology is waterfall. Here are our main problems with it:
        1 : We have a long study phase at the beginning of each project: between 1 year and 1.5 years. Consequences: the project takes a long time, and managers like me have little visibility during this phase. I hope to cut this phase to 6 months at most.
        2 : Our customers are our statisticians, and they often ask for features that they turn out not to need (we only see that at the end). By giving frequent feedback to our statisticians and by prioritizing features, I hope we will stop developing useless features.
        3 : Nearly all of our tests are manual. We have a lot of stress just before going into production, and people have to work extra hours on manual testing. I hope to lower the stress and get a good ROI by writing unit tests and by automating most of the functional tests.
        4 : We don't measure the internal quality of our code. After a few years in production, some applications become very hard to maintain. With a continuous integration platform I will be able to track some basic metrics: number of lines of duplicated code, percentage of code covered by unit tests, percentage of unit tests passing.


        We haven't decided yet to move to Scrum/XP. We are doing a pilot in 2010.


        Up until now, how do you know whether or not the batch statistical
        calculations on large datasets are implemented correctly? If you do not already have a way to make sure your software is correctly implemented, then that raises some interesting questions. How long has the organization been successfully functioning without any assurance that the software is actually working correctly? Does the organization really need the software to be correct to be successful? If whatever your software does is assumed by your customers to be correct (or is correct by definition), then does your organization really want to know when the software is not actually correct?

        response:
        We do a lot of testing and I'm sure the calculations are correct, but this comes at a very high cost that we will not be able to pay in the future (we will have fewer civil servants).
        Example: last year we wrote a batch that calculates the legal populations, so this batch was very sensitive. The batch was developed in PL/SQL. The statisticians developed the same batch in SAS and did a full comparison of the results of the two batches!
        We don't always do this. In other situations we do a lot of manual testing, and as the application grows the cost of regression testing increases. For example, on an old application (years in production) we had a change to make: the development cost 20 days and the regression testing cost 50 days!
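
The full comparison between two independently written batches described above is the kind of check that can be automated. The Python sketch below assumes each implementation's output has already been exported into a dictionary keyed by some identifier; the function name `diff_results` and the sample keys are hypothetical.

```python
def diff_results(reference, candidate):
    """Compare two {key: value} result sets and list every discrepancy.

    'reference' and 'candidate' stand for the outputs of two independent
    implementations of the same calculation (hypothetical data shape).
    """
    problems = []
    for key in sorted(set(reference) | set(candidate)):
        if key not in candidate:
            problems.append((key, "missing in candidate"))
        elif key not in reference:
            problems.append((key, "missing in reference"))
        elif reference[key] != candidate[key]:
            problems.append((key, f"{reference[key]} != {candidate[key]}"))
    return problems


def test_identical_outputs_have_no_differences():
    assert diff_results({"A": 1, "B": 2}, {"A": 1, "B": 2}) == []
```

An empty list means the two implementations agree on every key; anything else pinpoints the rows to investigate, which is cheaper to rerun than a manual comparison.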


        If not (and I have run into organizations that really do not want to know),
        then going down the Scrum/XP path is dangerous. Scrum/XP will be blamed for
        the pre-existing problems doing it will uncover.

        If you already have a way to make sure these calculations are
        implemented correctly, then you have to:
        - automate those tests (if they are not already) so that it is possible to
        quickly discover when any change or extension breaks those calculations,
        - add new automated tests for each new calculation/feature implemented going
        forward.

        The more real-time the tests are, the better, but starting with slow running
        tests is better than starting from scratch. Agile does not require a
        perfect starting point. The most important thing is for the team to have
        the motivation, authority and responsibility to keep trying to find ways to
        improve each and every iteration.

        response:
        The first two steps we are going to take are:
        1 : Write unit tests. For the moment I don't see how to do this for a Decision Support System (DSS), but I have received tips in the other replies. We started studying this in 2009.
        2 : Put a continuous integration platform in place in February 2010.
      • loicmidy
        Message 3 of 18 , Jan 3 5:05 AM
          Do you really mean: "How do we do TDD/Refactoring for this kind of
          app?" That is only one small (but important) piece of XP.
          Response : YES, that was my question. I should have been more specific
          in my initial post.

          That said, it is certainly possible to do TDD for procedural code
          including PL/SQL. I have done it. It's not my favorite thing that I
          have done, but it is quite doable.

          Response : this feedback is very interesting because the books on TDD
          are essentially about object-oriented languages.

          For Refactoring I suggest you look at Scott Ambler's /Refactoring
          Databases/ book http://my.safaribooksonline.com/0321293533

          Response : thanks for the reference. There is another problem: in Java
          we have refactoring tools in Eclipse that help. I don't know of such tools
          for PL/SQL or SQL.

          Unfortunately, testing PL/SQL does not lend itself to the
          mock/isolation style that some of us prefer for unit testing. That
          shouldn't be too big an obstacle, though. Just keep your procedures
          small and composed to a single level of detail, calling out to other
          procedures for progressively finer levels of detail - just like you
          would do with methods in OO.



          Response : OK, good advice, but it seems to me that this is not truly
          unit testing because we can't test in isolation.

          Ex: procedure P1 does some work and uses procedures P2 and P3. In Java I
          would stub the P2 and P3 data needed in P1 and so test P1 in isolation
          from P2 and P3. In PL/SQL this is not possible, so if there is a bug in P2,
          the « unit test » on P2 will fail but so will the « unit test »
          on P1.



        • Steven Gordon
          Message 4 of 18 , Jan 3 8:41 AM
            On Sun, Jan 3, 2010 at 6:05 AM, loicmidy <loicmidy@...> wrote:
            >
            ...
            > Unfortunately, testing PL/SQL does not lend itself to the
            > mock/isolation style that some of us prefer for unit testing. That
            > shouldn't be too big an obstacle, though. Just keep your procedures
            > small and composed to a single level of detail, calling out to other
            > procedures for progressively finer levels of detail - just like you
            > would do with methods in OO.
            >
            > Response : OK, good advice, but it seems to me that this is not truly
            > unit testing because we can't test in isolation.
            >
            > Ex: procedure P1 does some work and uses procedures P2 and P3. In Java I
            > would stub the P2 and P3 data needed in P1 and so test P1 in isolation
            > from P2 and P3. In PL/SQL this is not possible, so if there is a bug in P2,
            > the « unit test » on P2 will fail but so will the « unit test »
            > on P1.
            >

            Why does the code have to be done in PL SQL?

            Is it because it has always been done this way, or because it is the
            skill your team has, or because any other language would be too slow
            for the large data sets?

            Would it be feasible to harness a flexible language like Ruby to do
            the following:

            - Iteratively write a code generator that translates a limited subset
            of Ruby with embedded SQL into PL/SQL (do this XP style, testing that
            code generator extensively every time the limited subset gets extended
            as required by a new user story),

            - Write unit-tested code in the above limited subset of Ruby with
            embedded SQL (again XP style),

            - Use the code generator frequently for deployment to a test
            production environment and run acceptance tests on the results.

            For a single small project, such an approach might be too expensive.
            To support a series of large projects, the investment could pay for
            itself many times over.

            SteveG
          • strazhce
            Message 5 of 18 , Jan 4 4:59 AM
              Hi, Steve.

              --- In extremeprogramming@yahoogroups.com, Steven Gordon <sgordonphd@...> wrote:
              >
              > Why does the code have to be done in PL SQL?
              >
              > Is it because it has always been done this way, or because it is the
              > skill your team has, or because any other language would be too slow
              > for the large data sets?
              >
              > Would it be feasible to harness a flexible language like Ruby
              This is an interesting approach. I imagine there would be considerable resistance to developing a conversion tool (I'm an opponent of stored procedure programming, but I would think twice before doing this).

              1. What are benefits of this approach?
              2. How does this promote better testability/maintainability?
              - how do you test such code?
              - how do you refactor such code?

              Thanks.
              Oleg
            • Steven Gordon
              Message 6 of 18 , Jan 4 6:55 AM
                On Mon, Jan 4, 2010 at 5:59 AM, strazhce <infobox.oleg@...> wrote:

                >
                >
                > Hi, Steve.
                >
                >
                > --- In extremeprogramming@yahoogroups.com<extremeprogramming%40yahoogroups.com>,
                > Steven Gordon <sgordonphd@...> wrote:
                > >
                > > Why does the code have to be done in PL SQL?
                > >
                > > Is it because it has always been done this way, or because it is the
                > > skill your team has, or because any other language would be too slow
                > > for the large data sets?
                > >
                > > Would it be feasible to harness a flexible language like Ruby
                > This is an interesting approach. I imagine there would be considerable
                > resistance to developing a conversion tool (I'm an opponent of stored
                > procedure programming, but I would think twice before doing this).
                >
                > 1. What are benefits of this approach?
                > 2. How does this promote better testability/maintainability?
                > - how do you test such code?
                > - how do you refactor such code?
                >
                > Thanks.
                > Oleg
                >
                >
                >


              • Steven Gordon
                Message 7 of 18 , Jan 4 7:08 AM
                  On Mon, Jan 4, 2010 at 5:59 AM, strazhce <infobox.oleg@...> wrote:

                  >
                  >
                  > Hi, Steve.
                  >
                  >
                  > --- In extremeprogramming@yahoogroups.com<extremeprogramming%40yahoogroups.com>,
                  > Steven Gordon <sgordonphd@...> wrote:
                  > >
                  > > Why does the code have to be done in PL SQL?
                  > >
                  > > Is it because it has always been done this way, or because it is the
                  > > skill your team has, or because any other language would be too slow
                  > > for the large data sets?
                  > >
                  > > Would it be feasible to harness a flexible language like Ruby
                  > This is an interesting approach. I imagine there would be considerable
                  > resistance to developing a conversion tool (I'm an opponent of stored
                  > procedure programming, but I would think twice before doing this).
                  >
                  > 1. What are benefits of this approach?
                  >

                  Testability and maintainability.


                  > 2. How does this promote better testability/maintainability?
                  > - how do you test such code?
                  >

                  Unit test? You can unit test the code in Ruby by executing it. I believe
                  the embedded SQL can be directly executed (might be slow for large data
                  sets, but fine for small data sets used for unit testing). Mocking for code
                  isolation can be accomplished by mocking in Ruby.

                  Acceptance testing would be best done by generating target code and
                  deploying it.


                  > - how do you refactor such code?
                  >

                  It might not be automated, but the unit testing facilitates refactoring,
                  even if it is manual.

                  The issue is whether the cost and maintenance of the evolving target-code
                  generator is worth the ability to better support isolated unit-testing and
                  refactoring of the code. I am not totally convinced on this trade-off, so
                  it is just "a thought experiment".


                  >
                  > Thanks.
                  > Oleg
                  >
                  >
                  >
                  >


                • William Pietri
                  Message 8 of 18 , Jan 4 10:19 PM
                    On 01/04/2010 07:08 AM, Steven Gordon wrote:
                    > It might not be automated, but the unit testing facilitates refactoring,
                    > even if it is manual.
                    >

                    I think that's a point worth highlighting. Automated refactoring tools
                    are undeniably awesome. But even if you have to refactor manually, it's
                    still much better than the alternatives.

                    William
                  • JeffGrigg
                    Message 9 of 18 , Jan 5 5:02 AM
                      [After all these years, it does seem that my posts are now moderated. How interesting. I do think that my contributions are constructive.]

                      > That said, it is certainly possible to do TDD for procedural
                      > code including PL/SQL.

                      Yes. Anything is possible. Some things are easier than others.

                      For xUnit testing of PL/SQL in PL/SQL, consider the following:
                      http://www.c2.com/cgi/wiki?PlUnit
                      and
                      http://www.c2.com/cgi/wiki?PlSqlUnit

                      But given the tight binding of PL/SQL to the database and the lack of support for injecting mock implementations, here is what I would likely try for testing complex PL/SQL-based applications: do all testing in a development database where you have complete runtime control over the data and the PL/SQL procedures, and be willing to have the tests repopulate the database and replace stored procedure code.

                      > Response : OK, good advice, but it seems to me that this is not truly
                      > unit testing because we can't test in isolation.
                      >
                      > Ex: procedure P1 does some work and uses procedures P2 and P3.
                      > In Java I would stub the P2 and P3 data needed in P1 and so
                      > test P1 in isolation from P2 and P3. In PL/SQL this is not
                      > possible, so if there is a bug in P2, the « unit test » on P2
                      > will fail but so will the « unit test » on P1.

                      Unit testing is over-emphasized. Do integration testing.

                      I.e.: given that the database is in a certain state and you call P1, expect the database to now be in a new state and certain expected result sets to be returned.
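
A minimal sketch of that given/call/expect shape, written in Python against an in-memory SQLite database so that it can be run as is. The procedure `p1_close_month` and both tables are hypothetical stand-ins for a real stored procedure and its schema.

```python
import sqlite3


def p1_close_month(conn, month):
    """Hypothetical stand-in for P1: roll daily rows up into a monthly total."""
    conn.execute(
        "INSERT INTO monthly_total (month, amount) "
        "SELECT month, SUM(amount) FROM daily_amount "
        "WHERE month = ? GROUP BY month",
        (month,))
    conn.execute("DELETE FROM daily_amount WHERE month = ?", (month,))
    return conn.execute(
        "SELECT amount FROM monthly_total WHERE month = ?", (month,)).fetchall()


def given_database():
    """Given: a database in a known starting state."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE daily_amount (month TEXT, amount INTEGER);
        CREATE TABLE monthly_total (month TEXT, amount INTEGER);
    """)
    conn.executemany("INSERT INTO daily_amount VALUES (?, ?)",
                     [("2010-01", 10), ("2010-01", 5), ("2010-02", 7)])
    return conn


def test_p1_moves_the_month_into_the_totals():
    conn = given_database()
    # When: P1 is called.
    returned = p1_close_month(conn, "2010-01")
    # Then: the expected result set comes back and the database is in the new state.
    assert returned == [(15,)]
    assert conn.execute(
        "SELECT month, amount FROM monthly_total").fetchall() == [("2010-01", 15)]
    assert conn.execute(
        "SELECT month, amount FROM daily_amount").fetchall() == [("2010-02", 7)]
```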

                      Also, if testing in a development database you own, you can replace procedures P2 and P3 with mock versions, call P1, and then restore the old versions of P2 and P3.
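
The replace-then-restore idea can be sketched as follows. This is an analogy rather than PL/SQL: it uses Python's `sqlite3.Connection.create_function`, which re-registers a SQL function under an existing name, to stand in for recompiling a stored procedure in a development database. The functions `real_p2`, `real_p3`, `p1`, and `p1_with_stubs` are hypothetical.

```python
import sqlite3


def real_p2(x):
    """Hypothetical production version of P2."""
    return x * 2


def real_p3(x):
    """Hypothetical production version of P3."""
    return x + 1


def install(conn, p2, p3):
    """(Re)register P2 and P3; registering an existing name replaces it."""
    conn.create_function("p2", 1, p2)
    conn.create_function("p3", 1, p3)


def make_connection():
    conn = sqlite3.connect(":memory:")
    install(conn, real_p2, real_p3)
    return conn


def p1(conn, x):
    """P1 does its own work (an addition) on top of P2 and P3."""
    return conn.execute("SELECT p2(?) + p3(?)", (x, x)).fetchall()[0][0]


def p1_with_stubs(conn, x, stub_p2, stub_p3):
    """Swap in stubs, call P1, then restore the real versions."""
    install(conn, stub_p2, stub_p3)
    try:
        return p1(conn, x)
    finally:
        install(conn, real_p2, real_p3)
```

With stubs that return fixed values, a failure in this test points at P1's own logic, so a bug in P2 no longer makes the P1 test fail as well. In a real Oracle development schema the swap would mean compiling alternative procedure bodies, which this sketch does not attempt to show.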


                      > Response : thanks for the reference. There is another problem:
                      > In Java we have refactoring tools in Eclipse that help. I
                      > don't know of such tools for PL/SQL or SQL.

                      You have the most powerful refactoring tool known to the industry: The Human Mind.

                      Yes, you can refactor SQL and PL/SQL. The lack of helpful tooling may be an annoyance, but that shouldn't stop you.

                      What would you do if you needed to do a refactoring in Java, but it didn't appear on the IDE's menu of supported refactorings?
                    • John Roth
                      Message 10 of 18 , Jan 5 5:44 AM
                        JeffGrigg wrote:
                        >
                        >
                        > [After all these years, it does seem that my posts are now moderated.
                        > How interesting. I do think that my contributions are constructive.]
                        >
                        I checked, and you've got the same status as all other members.
                        I'm certainly not seeing any moderation messages for you, or
                        any other member outside of the usual first post moderation.

                        We seem to have a problem with timely delivery of messages. I've
                        noticed it on this and on other Yahoo groups I'm a member of.
                        I've seen some messages take as long as a couple of days to
                        clear whatever process Yahoo uses, and it's been going on for
                        several months.

                        John Roth
                        Moderator.
                      • Ron Jeffries
                        Message 11 of 18 , Jan 5 6:19 AM
                          Hello, JeffGrigg. On Tuesday, January 5, 2010, at 8:02:21 AM, you
                          wrote:

                          > [After all these years, it does seem that my posts are now
                          > moderated. How interesting. I do think that my contributions are constructive.]

                          Me too, and I'm not aware of any moderation (on my part). Did you
                          say F#$% or something? :)

                          Ron Jeffries
                          www.XProgramming.com
                          www.xprogramming.com/blog
                          If it is more than you need, it is waste. -- Andy Seidl
                        • JeffGrigg
                          Message 12 of 18 , Jan 5 11:00 AM
                            --- John Roth <JohnRoth1@...> wrote:
                            > I've seen some messages take as long as a couple of days to
                            > clear whatever process Yahoo uses, and it's been going on for
                            > several months.

                            --- Ron Jeffries <ronjeffries@...> wrote:
                            > Me too, and I'm not aware of any moderation (on my part).
                            > Did you say F#$% or something? :)

                            No. But I might, if it takes several days to post a message! ;->

                            Perhaps my previous posts just got lost in some bit-bucket. Bother!
                            :-[
                          • D.André Dhondt
                            Message 13 of 18 , Jan 6 9:09 PM
                              >
                              > > That said, it is certainly possible to do TDD for procedural
                              > > code including PL/SQL.
                              >

                              In my experience with DSS and T-SQL, TDD doesn't make a lot of sense for a
                              4GL like SQL, which doesn't really specify what we're going to do in a unit
                              way. We added tests, but they don't have the same kind of coverage as UTs.
                              Scaled-down dev/test databases don't make much sense either, since we
                              didn't really know what records the query was going to fetch, nor what the
                              performance implications would be, until we ran it in the production
                              environment. Even the execution paths turned out different on tables with
                              different statistics.

                              One of the aims of TDD is to make sure our units are right--that they are
                              what we intended. What we found is that by using smaller chunks, e.g.,
                              'extracting method' in our server-side code as much as possible, we
                              clarified our intent, we could add some integration tests/functional
                              tests/FIT tests, and basically we got more confidence in our database-side
                              code. We also found that by clarifying intent, we could simplify the
                              existing code, and that made significant gains in performance. Regression
                              testing was almost entirely manual, though.

                              Still, the biggest concern with db-side changes had to do with performance,
                              and we just couldn't find a way to provide sufficient test coverage for
                              that. So our strategy (there is a book written on this subject, db
                              refactoring or something, but I never read it) ended up being a strangler
                              pattern--isolate the functionality that needed to be on the database, add
                              some functional tests, and then just don't touch it unless it's important
                              enough to test on the production servers....





                              --
                              D. André Dhondt
                              mobile: 001 33 671 034 984
                              http://dhondtsayitsagile.blogspot.com/

                              Support low-cost conferences -- http://agiletour.org/
                              If you're in the area, join Agile Philly http://www.AgilePhilly.com
                              Mentor/be mentored: the Agile Skills Project
                              https://sites.google.com/site/agileskillsprojectwiki/


                            • Phlip
                              Message 14 of 18 , Jan 9 10:14 PM
                                loicmidy wrote:

                                > Our current methodology is waterfall. Here are our main problems with it:
                                > 1 : We have a long study phase at the beginning of each project: between 1 year and 1.5 years. Consequences: the project takes a long time, and managers like me have little visibility during this phase. I hope to cut this phase to 6 months at most.
                                > 2 : Our customers are our statisticians, and they often ask for features that they turn out not to need (we only see that at the end). By giving frequent feedback to our statisticians and by prioritizing features, I hope we will stop developing useless features.

                                This is why one Ward Cunningham just tweeted "Estimating is the non-problem that
                                know-nothings spent decades trying to solve."

                                I would have been more gentle to your waterfallers...
                              • Adam Sroka
                                Message 15 of 18 , Jan 9 10:20 PM
                                  On Sat, Jan 9, 2010 at 10:14 PM, Phlip <phlip2005@...> wrote:
                                  >
                                  >
                                  >
                                  > loicmidy wrote:
                                  >
                                  > > Our current methodology is waterfall. Here are our main problems with it:
                                  > > 1 : We have a long study phase at the beginning of each project: between 1 year and 1.5 years. Consequences: the project takes a long time, and managers like me have little visibility during this phase. I hope to cut this phase to 6 months at most.
                                  > > 2 : Our customers are our statisticians, and they often ask for features that they turn out not to need (we only see that at the end). By giving frequent feedback to our statisticians and by prioritizing features, I hope we will stop developing useless features.
                                  >
                                  > This is why one Ward Cunningham just tweeted "Estimating is the non-problem that
                                  > know-nothings spent decades trying to solve."
                                  >

                                  Yeah. I had to retweet that one myself.

                                  > I would have been more gentle to your waterfallers...
                                  >

                                  I sat in on a meeting yesterday (Fri) where one of the teams my
                                  colleague is coaching was trying to understand how to estimate. It was
                                  very painful. I made the point that they needed to do their best
                                  according to what they already seemed to understand, that their
                                  estimates would still be wrong, but that it would all work out over
                                  time. They looked at me like I was speaking some alien language (Which
                                  I suppose isn't far from the truth.)

                                  >