
Database Testing & Simple Code

  • Simon Jones
    Message 1 of 3, Jul 26, 2006
      Hi,

      I'm looking for a little advice or some pearls of wisdom around
      managing testing data for XP projects which are focussed primarily
      around databases (Oracle in particular, if that's relevant).

      Double-pronged message, as I'd also be interested in any thoughts
      on how to translate XP's idea of simple, clean code into DB
      development, i.e. PL/SQL.

      I'll start with the latter.

      For some reason (just lucky I guess) most of the XP work I've been
      involved with seems to have revolved around an Oracle database.
      Ideas like refactoring, test-first, continuous integration etc.
      have been readily applicable, but one area in which I'm often
      confronted with a challenge is how to make SQL and PL/SQL
      'simple'. I am far from being an expert with Oracle, but it seems
      to me that conventional wisdom says it is better to have one large
      SQL statement than many small ones when there is a direct choice.
      This appears to be primarily driven by performance.

      XP suggests that optimisation ought to be left until last, i.e.
      only do it when you know you need it. However this doesn't seem to
      sit well with databases in general as, almost by definition, they
      are likely to be the slowest aspect of an application. What I mean
      here is that whilst you may get away with a non-optimal solution
      in something like Java, the penalties in Oracle are more severe
      and more visible. Databases that respond directly to human
      interfaces (i.e. web-based ones) are particularly good at
      'advertising' non-optimal code.

      The problem for me, however, is that naturally I want maintainable
      code: code that can be worked on by the whole team, not just the
      Oracle guru. There often tend to be only one or two Oracle experts
      and they quickly get spread too thinly.

      A single SQL statement can quickly become difficult to read, if
      not entirely unfathomable, even though from a SQL perspective it
      is entirely correct and proper. (Currently we have a data model
      based around the GoF composite pattern requiring tree traversal
      through CONNECT BY; we are not in a position to make drastic
      structural changes to the database.)
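
      To give a flavour, the basic shape of the traversal is roughly
      this (a minimal, untested sketch with made-up table and column
      names; the real queries join several more tables on top of it):

          -- composite-style table: each row points at its parent node
          SELECT LEVEL, id, name
            FROM nodes
           START WITH parent_id IS NULL      -- begin at the root rows
          CONNECT BY PRIOR id = parent_id;   -- then walk down the tree

      Once you add the joins and filters the business actually needs,
      the statement stops being something a non-Oracle pair can
      comfortably pick up.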

      Furthermore a single SQL statement is capable of returning a wide
      array of possible results involving complex joins across many
      tables. This presents quite a few challenges in terms of unit
      testing (maintaining large data sets, for one thing). There can be
      many permutations of a single result set, and not checking them
      thoroughly quickly leads to pain later.

      This leads me nicely into my second question, how to maintain and
      organise the (typically) quite large sets of data required to
      adequately unit test a database.

      Obviously you need to set some well-known targets to verify your
      tests against. However, if a given pair is to avoid coming up with
      such large data sets manually, then naturally they want to write a
      query to extract the data. The problem here is that you almost end
      up writing the same query to extract the expected data as the
      query that will end up in your production code. This isn't
      therefore much of a test.
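
      To make that concrete, the 'honest' version is a tiny hand-built
      fixture with a hard-coded expectation, something like this rough,
      untested sketch (reusing the made-up nodes table from above):

          -- hand-built fixture: two rows we know everything about
          INSERT INTO nodes (id, parent_id, name) VALUES (1, NULL, 'root');
          INSERT INTO nodes (id, parent_id, name) VALUES (2, 1, 'child');

          -- assert against a hard-coded expectation, not another query
          DECLARE
            v_count NUMBER;
          BEGIN
            SELECT COUNT(*) INTO v_count
              FROM nodes
             START WITH parent_id IS NULL
            CONNECT BY PRIOR id = parent_id;

            IF v_count <> 2 THEN
              RAISE_APPLICATION_ERROR(-20001,
                'expected 2 nodes in the tree, got ' || v_count);
            END IF;
          END;
          /

      That is fine for two rows, but it doesn't scale to the volumes of
      data these queries actually need to be exercised against.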

      The problem is further compounded because very often business data
      changes. The ideal situation is to be working with test data that
      is as close to production data as possible, but once you are armed
      with a barrage of unit tests, continually changing that data can
      quickly become cumbersome. Often the business also fails to be
      sufficiently rigorous about how it manages its standing data and
      production data. Unpleasant surprises abound.

      The same is true of acceptance tests. We want to be in a position
      where we can run acceptance tests against data that is relatively
      close to 'live'. Tricky to do if your data is constantly evolving.

      Some things that we have tried are:

      MULTISET functionality (I think this is Oracle specific). This
      turns out to be quite a good way to break queries down into
      something more maintainable, i.e. combining sets of results with
      INTERSECT and UNION. But it is only applicable sometimes.
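
      By way of illustration, the plain set-operator flavour of the idea
      is something like this (simplified, untested, invented names; the
      collection-based MULTISET operators work on nested table types,
      but the principle is the same):

          -- each leg is a small, separately testable query in its own right
          SELECT id FROM orders WHERE status = 'OPEN'
          INTERSECT
          SELECT order_id FROM order_lines WHERE product_id = 42
          UNION
          SELECT id FROM orders WHERE priority = 'HIGH';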

      Using more de-normalisation than your typical DB designer might be
      comfortable with. It does, however, require more 'management' code
      to ensure the de-normalised data is kept in sync. This begins to
      violate the XP principles of once-and-only-once and simple, clean
      design.
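
      The 'management' code typically ends up as triggers along these
      lines (a rough, untested sketch with invented tables; it assumes
      an order line never moves to a different order):

          CREATE OR REPLACE TRIGGER trg_sync_order_total
          AFTER INSERT OR UPDATE OR DELETE ON order_lines
          FOR EACH ROW
          BEGIN
            -- keep the de-normalised total on ORDERS in step with its lines
            IF INSERTING OR UPDATING THEN
              UPDATE orders
                 SET total_value = total_value
                                   + (:NEW.qty * :NEW.unit_price)
                                   - NVL(:OLD.qty * :OLD.unit_price, 0)
               WHERE id = :NEW.order_id;
            ELSE
              UPDATE orders
                 SET total_value = total_value - (:OLD.qty * :OLD.unit_price)
               WHERE id = :OLD.order_id;
            END IF;
          END;
          /

      It is exactly this sort of extra moving part that feels at odds
      with simple design.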

      Writing automated scripts to move current data to newer production
      data. It can be done, but once datasets grow large the amount of
      development effort required to update tests and 'morph' data can
      quickly become onerous.

      All of these help, but it still seems far harder to apply XP to
      SQL code than to its counterparts in other languages.

      In many ways applying XP to databases actually promises the
      highest gains, as traditionally databases can quickly become
      'legacy' systems through poor maintenance. But because the
      database has a more fundamental connection to the data, the price
      of applying XP seems to be higher.

      But as I say, I am not a database expert. Hopefully there are some
      interesting tips and tricks out there.

      Cheers
      Simon
    • Anthony Williams
      Message 2 of 3, Aug 1, 2006
        "Simon Jones" <simon@...> writes:

        > Writing automated scripts to move current data to newer production
        > data. It can be done, but once datasets grow large the amount of
        > development effort required to update tests and 'morph' data can
        > quickly become onerous.

        I did this once with a very large Oracle database. What we did was to snapshot
        the database (which took all weekend, but the weekend was downtime anyway),
        and add triggers that kept track of all changes made to the live data. We
        could then migrate the snapshot at our leisure, and ensure that the data was
        updated with the changes. Once the full data had been transferred to the new
        schema, we swapped the app backend to the version that used the new schema,
        and swapped the databases. This was not something that we would want to do
        very often, due to how long it took (about 2 weeks to do the full data
        migration), but it's certainly doable. If you can't manage the downtime for a
        snapshot, you can also just copy the data whilst the database is live, using
        the triggers to ensure that already-copied data is updated.
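
        The change-tracking triggers were nothing clever; something
        along these lines (a rough sketch with made-up names, not the
        actual triggers we used):

            CREATE OR REPLACE TRIGGER trg_track_customer_changes
            AFTER INSERT OR UPDATE OR DELETE ON customers
            FOR EACH ROW
            DECLARE
              v_op VARCHAR2(1);
            BEGIN
              IF INSERTING THEN
                v_op := 'I';
              ELSIF UPDATING THEN
                v_op := 'U';
              ELSE
                v_op := 'D';
              END IF;

              -- record which row changed and how, so the copy can catch up later
              INSERT INTO customer_changes (changed_at, operation, customer_id)
              VALUES (SYSDATE, v_op, NVL(:NEW.id, :OLD.id));
            END;
            /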

        What I've also done is create views that look like the old tables, with
        insert/update/delete triggers, so the database schema can change underneath,
        but the app still sees the old schema until the new version of the app can be
        released.
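
        In Oracle that means INSTEAD OF triggers on the view; roughly
        like this (again a simplified sketch with made-up names, insert
        only shown):

            -- the old one-table shape, preserved as a view over the new schema
            CREATE OR REPLACE VIEW customers AS
              SELECT c.id, c.name, a.postcode
                FROM customer_core c
                JOIN customer_addresses a ON a.customer_id = c.id;

            -- the app's old-style INSERTs get routed to the new tables
            CREATE OR REPLACE TRIGGER trg_customers_ins
            INSTEAD OF INSERT ON customers
            FOR EACH ROW
            BEGIN
              INSERT INTO customer_core (id, name)
              VALUES (:NEW.id, :NEW.name);

              INSERT INTO customer_addresses (customer_id, postcode)
              VALUES (:NEW.id, :NEW.postcode);
            END;
            /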

        Anthony
        --
        Anthony Williams
        Software Developer
        Just Software Solutions Ltd
        http://www.justsoftwaresolutions.co.uk
      • Ron Jeffries
        Message 3 of 3, Aug 1, 2006
          Hello Anthony,

          Thanks for your email. On Tuesday, August 1, 2006, at 6:50:11 AM,
          you wrote:

          > "Simon Jones" <simon@...> writes:

          >> Writing automated scripts to move current data to newer production
          >> data. It can be done, but once datasets grow large the amount of
          >> development effort required to update tests and 'morph' data can
          >> quickly become onerous.

          > I did this once with a very large Oracle database. What we did was to snapshot
          > the database (which took all weekend, but the weekend was downtime anyway),
          > and add triggers that kept track of all changes made to the live data. We
          > could then migrate the snapshot at our leisure, and ensure that the data was
          > updated with the changes. Once the full data had been transferred to the new
          > schema, we swapped the app backend to the version that used the new schema,
          > and swapped the databases. This was not something that we would want to do
          > very often, due to how long it took (about 2 weeks to do the full data
          > migration), but it's certainly doable. If you can't manage the downtime for a
          > snapshot, you can also just copy the data whilst the database is live, using
          > the triggers to ensure that already-copied data is updated.

          > What I've also done is create views that look like the old tables, with
          > insert/update/delete triggers, so the database schema can change underneath,
          > but the app still sees the old schema until the new version of the app can be
          > released.

          Ambler and Sadalage, /Refactoring Databases/ offers a number of
          ideas for morphing the database as we go. It seems to me to have
          some good ideas in it. For my review, see

          http://www.xprogramming.com/xpmag/bookRefactoringDatabases.htm

          Ron Jeffries
          www.XProgramming.com
          I have tried in my way to be free. -- Leonard Cohen