Loading ...
Sorry, an error occurred while loading the content.

Re: [agileDatabases] Re: Schema Versioning

Expand Messages
  • Milen Kovachev
    I also forgot to mention that programmables are kept in the so called repeatable scripts folder, where we keep one script per programmable and all the
    Message 1 of 41 , Aug 26, 2010
    • 0 Attachment
      I also forgot to mention that programmables are kept in the so called
      "repeatable" scripts folder, where we keep one script per programmable and
      all the changes to that programmable are versioned there, so we do not have
      to use numbered (incremental) scripts for every change to a
      programmable. DbMaintain
      does a very good job of efficiently finding when a programmable script has
      changed and deploys it only when it is different than the currently deployed
      one. It does that by keeping a check sum of the script content in the
      deployment history table so it can quickly find out which repeatable script
      has changed since the last deployment.

      Cheers,
      Milen

      On Thu, Aug 26, 2010 at 11:17 AM, Milen Kovachev <milenvk@...> wrote:

      > We have successfully used the numbered script approach with multiple
      > developers working simultaneously on the same database. We used DbMaintain
      > for deployment. The method is fairly simple, and does not require one person
      > to sync scripts, hence no one has had to quit ;) In fact developers quite
      > enjoy the way it works.
      >
      > Generally it works like this:
      >
      > - All database scripts are kept in source control in subfolders of a
      > database scripts folder. These folders have sequential numbers.
      > - When a developer starts a new effort she gets latest version from
      > source control and on her working copy creates a new numbered folder in the
      > database scripts folder. That folder has a name of whatever is the next
      > available sequential number.
      > - During her effort the developer puts database scripts in that folder
      > and deploys all scripts to her isolated database environment using
      > DbMaintain.
      > - When the developer is ready to commit to source control she updates
      > her working copy to the latest version.
      > - If in the meantime someone else used that folder name and already
      > committed to source control, then the developer changes her numbered folder
      > name on her working copy to have the next available number and moves her
      > scripts there.
      > - Then the developer executes the deployment, including the new
      > scripts, on her isolated local database and makes sure her scripts work well
      > with the latest database scripts.
      > - Then finally she commits (if in the meantime someone else committed
      > new database scripts, she increments her folder once more and redeploys the
      > scripts).
      > - At every developer's commit a continuous integration build process
      > picks up the latest scripts (and code), builds the latest version, deploys
      > the database changes to a staging database and sends email to everyone
      > involved with links to the new application.
      > - If a continuous integration build failed due to a database script
      > failure, the developer responsible for the failure fixes her scripts and
      > recommits (that almost never happens since every developer deploys the
      > scripts on their isolated environment before commit).
      >
      > Notice how the work of synchronizing the scripts is spread over all members
      > of the team. That also has the added benefit of members learning what others
      > are doing. This process is fail-proof as it relies on repeatability. This is
      > leveraged by the same deployments being executed over and over again on
      > different environments (dev machines, build machine, and QA), minimizing the
      > risk of something going wrong at production release.
      >
      > I hope that helps.
      >
      > Milen Kovachev
      >
      > On Thu, Aug 26, 2010 at 10:08 AM, emanuelbulic <emanuelbulic@...>wrote:
      >
      >>
      >>
      >> This approach doesn't work for me. Numbered scripts? I have 10000 of them.
      >> What happens when you have 20 developers, and they all need scripts? You
      >> must dedicate a person to handle these scripts, and that person will quit
      >> after 2 weeks of doing that.
      >>
      >> Numbered scripts don't handle programmables either. I've already
      >> discounted all approaches that are script-centric. You also need a way to
      >> drop/recreate all views, functions, sp's, triggers, indexes, etc.
      >>
      >> Also, oracle doesn't support rollback of DDL statements. MS SQL Server
      >> does, but not oracle.
      >>
      >>
      >> --- In agileDatabases@yahoogroups.com <agileDatabases%40yahoogroups.com>,
      >> "timander37" <timander@...> wrote:
      >> >
      >> > We handle this a little bit differently. If any SQL change script
      >> > fails, the database is rolled back and the upgrade is postponed. Each
      >> > SQL change script assumes that the script before it executes without
      >> > failure. Scripts are numbered and must execute in order. If any
      >> > script fails, we rollback all changes and cancel the upgrade (you're
      >> > taking a backup of your database before you start, right?). This
      >> > makes it very simple to determine which scripts were applied, and
      >> > therefore you can refer to the "version" of your schema as the last
      >> > script that was applied.
      >> >
      >> > Keep in mind, your friendly DBA can test your SQL scripts on a copy of
      >> > the production database to test for problems like the foreign
      >> > key/invalid data described below.
      >> >
      >> > Tim Andersen
      >> >
      >> >
      >> > --- In agileDatabases@yahoogroups.com<agileDatabases%40yahoogroups.com>,
      >> "emanuelbulic" <emanuelbulic@> wrote:
      >> > >
      >> > >
      >> > >
      >> > > That's not what I meant; you're certainly right.
      >> > >
      >> > >
      >> > > I meant that should table A fail to upgrade from version 1.1 to 1.2
      >> (UPDATE statement fails due to a FK constraint violation) we can still
      >> upgrade table B from version 1.5 to version 1.6 (add column unrelated to
      >> table A).
      >> > >
      >> > > You are correct to assume that table A should not be further upgraded
      >> to version 1.3, since 1.1->1.2 failed.
      >> > >
      >> > > A more complex script would involve altering multiple tables. In such
      >> a scenario, a single script would increment the version number of each table
      >> affected.
      >> > >
      >> > > FKs are also an interesting artifact, since they require the source
      >> and target table as well as the PK of the target table (oracle requires a PK
      >> constraint for all FK references).
      >> > >
      >> > > I should be able to create all PKs in any order, and then create FKs
      >> also in any order.
      >> > >
      >> > > --- In agileDatabases@yahoogroups.com<agileDatabases%40yahoogroups.com>,
      >> Milen Kovachev <milenvk@> wrote:
      >> > > >
      >> > > > Having multiple versions of each object and running them in
      >> different order
      >> > > > depending on which one failed, or not, sounds like deviating from
      >> the
      >> > > > repeatability rule... Essentially, in the FK example you gave, the
      >> failed
      >> > > > script would be run last, once the problem with the FK gets fixed,
      >> which may
      >> > > > result in different state, compared to if the script was run when it
      >> was
      >> > > > supposed to be run.
      >> > > >
      >> > > > Milen
      >> > > >
      >> > > > On Wed, Aug 25, 2010 at 3:08 PM, emanuelbulic <emanuelbulic@>wrote:
      >> > > >
      >> > > > >
      >> > > > >
      >> > > > > I've used a similar solution in the past. It works but it got very
      >> messy.
      >> > > > > We had to dedicate one person to handle all scripts from all
      >> developer. No
      >> > > > > one wanted to do it, because it was such tedious brain-dead work.
      >> > > > > Additionally, it was all written in one gigantic t-sql file, which
      >> made
      >> > > > > things difficult.
      >> > > > >
      >> > > > > Anyway, most solutions I ran into are script centric. They track
      >> the
      >> > > > > successful execution of each script in order to avoid executing
      >> again at a
      >> > > > > later date. That, however, doesn't handle programmables very well.
      >> > > > >
      >> > > > > Regarding branching... I meant separate production release
      >> branches to
      >> > > > > support the various software branches in a project.
      >> > > > >
      >> > > > > So, anyway, here's what I put together... Java client using a
      >> > > > > database-backed repository (oracle, ms sql server, etc), tracks
      >> artifacts
      >> > > > > similarly to most source code control systems. So, each table,
      >> view, or
      >> > > > > function gets version, and each modification is tracked. Tables
      >> get
      >> > > > > alteration scripts, which are executed in sequence. Programmables
      >> get
      >> > > > > revised, yet only the last version is relevant ( they can be
      >> > > > > dropped/recreated). Also manages rename and drop of artifacts.
      >> > > > >
      >> > > > > An important distinction from other approaches is that alteration
      >> scripts
      >> > > > > do not get a specific sequencing; rather, the developer making the
      >> change,
      >> > > > > must supply a list of required artifact versions, which the
      >> software must
      >> > > > > assert prior to executing the script. This means that the upgrade
      >> process
      >> > > > > can skip an erroneous script and continue upgrading database
      >> objects which
      >> > > > > are unaffected by the problem. Once the problem has been
      >> addressed, the
      >> > > > > upgrade process can be re-executed.
      >> > > > >
      >> > > > > The drawback to this is that I can't assign a single version
      >> number to a
      >> > > > > database schema (subversion-like), since an upgrade can be
      >> partially applied
      >> > > > > while a problem is being addressed by developers/dba's. (perhaps a
      >> FK
      >> > > > > violation caused by bad data).
      >> > > > >
      >> > > > > The software does branching of artifacts, and allows me merge any
      >> script
      >> > > > > from branch to branch.. it effectively makes a copy, and links
      >> them together
      >> > > > > to avoid executing the same script in a future branch.
      >> > > > >
      >> > > > > The upgrade process persists the version of each artifact as it
      >> gets
      >> > > > > altered, and thus, the process is repeatable.
      >> > > > >
      >> > > > >
      >> > > > > --- In agileDatabases@yahoogroups.com<agileDatabases%40yahoogroups.com><agileDatabases%
      >> 40yahoogroups.com>,
      >> > > > > "timander37" <timander@> wrote:
      >> > > > > >
      >> > > > > > We have a solution for database schema versioning, though I'm
      >> not sure
      >> > > > > how well it supports branching. Is the branching for separate
      >> production
      >> > > > > databases or for different versions of the database in different
      >> > > > > environments along the development lifecycle (Dev, QA, Prod)?
      >> > > > > >
      >> > > > > > I refer to this as "poor man's migrations" because it is like
      >> how Rails
      >> > > > > does it, though we were using this solution before Rails
      >> Migrations. The
      >> > > > > idea of versioning the schema was inspired by both Martin Fowler's
      >> article
      >> > > > > http://martinfowler.com/articles/evodb.html and Scott Ambler's
      >> book "Agile
      >> > > > > Database Techniques."
      >> > > > > >
      >> > > > > > 1. Our build always executes a "nuke & pave" of the database.
      >> > > > > > 2. We have a tool to extract all of the data from a database to
      >> SQL
      >> > > > > insert scripts.
      >> > > > > > 3. We have a table called "schema_version" that keeps a list of
      >> all of
      >> > > > > the scripts that have been applied.
      >> > > > > > 4. Each script has an "insert into schema_version values
      >> > > > > ('name_of_script')"
      >> > > > > > 5. We have a JUnit test that verifies all of the SQL scripts in
      >> our
      >> > > > > "changes" directory have been accounted for in our schema_version
      >> table.
      >> > > > > >
      >> > > > > > The DBA's then have a tool-independent way to identify which
      >> changes have
      >> > > > > been applied to each database.
      >> > > > > >
      >> > > > > > We have also experimented with a database metadata test that
      >> creates a
      >> > > > > Hypersonic in-memory database from the SQL scripts and compares
      >> the metadata
      >> > > > > of the in-memory database with the real database. Keeping the SQL
      >> > > > > Hypersonic-compatible was a lot of overhead and we ended up taking
      >> this out
      >> > > > > because our schema version tracking worked well.
      >> > > > > >
      >> > > > > > I'd like to hear how other people are keeping track of their
      >> database
      >> > > > > version.
      >> > > > > > Tim Andersen
      >> > > > > >
      >> > > > > >
      >> > > > > >
      >> > > > > >
      >> > > > > > --- In agileDatabases@yahoogroups.com<agileDatabases%40yahoogroups.com><agileDatabases%
      >> 40yahoogroups.com>,
      >> > > > > "emanuelbulic" <emanuelbulic@> wrote:
      >> > > > > > >
      >> > > > > > > Hello all,
      >> > > > > > >
      >> > > > > > > I have been looking for a solid approach to database schema
      >> versioning
      >> > > > > for some time, and, ended up writing my own versioning tool. I
      >> wanted
      >> > > > > something similar to a source code repository. It must be version
      >> agnostic,
      >> > > > > and must have support for branching (multiple lineages). It should
      >> also
      >> > > > > treat table alterations and programmables appropriately.
      >> > > > > > >
      >> > > > > > > Any advice?
      >> > > > > > >
      >> > > > > >
      >> > > > >
      >> > > > >
      >> > > > >
      >> > > >
      >> > > >
      >> > > > [Non-text portions of this message have been removed]
      >> > > >
      >> > >
      >> >
      >>
      >>
      >>
      >
      >


      [Non-text portions of this message have been removed]
    • timander37
      I created a simple project that demonstrates how we track database changes. The source code can be found here:
      Message 41 of 41 , Sep 5 10:56 AM
      • 0 Attachment
        I created a simple project that demonstrates how we track database changes.

        The source code can be found here:
        http://github.com/timander/schema-versioning-example

        Please try it out and share your opinions/suggestions.

        Tim Andersen



        --- In agileDatabases@yahoogroups.com, "timander37" <timander@...> wrote:
        >
        > We have a solution for database schema versioning, though I'm not sure how well it supports branching. Is the branching for separate production databases or for different versions of the database in different environments along the development lifecycle (Dev, QA, Prod)?
        >
        > I refer to this as "poor man's migrations" because it is like how Rails does it, though we were using this solution before Rails Migrations. The idea of versioning the schema was inspired by both Martin Fowler's article http://martinfowler.com/articles/evodb.html and Scott Ambler's book "Agile Database Techniques."
        >
        > 1. Our build always executes a "nuke & pave" of the database.
        > 2. We have a tool to extract all of the data from a database to SQL insert scripts.
        > 3. We have a table called "schema_version" that keeps a list of all of the scripts that have been applied.
        > 4. Each script has an "insert into schema_version values ('name_of_script')"
        > 5. We have a JUnit test that verifies all of the SQL scripts in our "changes" directory have been accounted for in our schema_version table.
        >
        > The DBA's then have a tool-independent way to identify which changes have been applied to each database.
        >
        > We have also experimented with a database metadata test that creates a Hypersonic in-memory database from the SQL scripts and compares the metadata of the in-memory database with the real database. Keeping the SQL Hypersonic-compatible was a lot of overhead and we ended up taking this out because our schema version tracking worked well.
        >
        > I'd like to hear how other people are keeping track of their database version.
        > Tim Andersen
        >
        >
        >
        >
        > --- In agileDatabases@yahoogroups.com, "emanuelbulic" <emanuelbulic@> wrote:
        > >
        > > Hello all,
        > >
        > > I have been looking for a solid approach to database schema versioning for some time, and, ended up writing my own versioning tool. I wanted something similar to a source code repository. It must be version agnostic, and must have support for branching (multiple lineages). It should also treat table alterations and programmables appropriately.
        > >
        > > Any advice?
        > >
        >
      Your message has been successfully submitted and would be delivered to recipients shortly.