Re: Script creation: Manual vs. Automated

  • bretweinraub
    Message 1 of 34, Jun 2, 2009
      Slightly off topic... can anyone tell me when and where sequenced, script-based DB migrations originated? Who started it, and when?

      Just curious.

      --- In agileDatabases@yahoogroups.com, "milenvk" <milenvk@...> wrote:
      >
      > Hello,
      >
      > My experience with evolutionary database development based on incremental scripts is that it is better to actually run the scripts on production exactly as they evolved during development. Repeatability is one of the biggest benefits of having a database evolve using incremental scripts.
      >
      > Besides, what are you really going to lose by adding a field and then deleting it on production? The obvious answer is time, but you may actually lose more time generating the deltas from production. Other factors may be present, like fragmentation of the physical structures, etc., but these would most likely be easily solved using the tools provided by your database vendor.
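      >
      > For illustration, a minimal sketch of what that means (script names and the column are hypothetical):
      >
      >     -- 0004_add_fax_number.sql: an early increment
      >     ALTER TABLE customer ADD fax_number VARCHAR(20);
      >
      >     -- 0007_drop_fax_number.sql: the idea was abandoned later
      >     ALTER TABLE customer DROP COLUMN fax_number;
      >
      > Both scripts run on production, in order, exactly as they ran in development.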
      >
      > In a pure evolutionary database development team a DBA shouldn't be allowed to modify the database structure in an ad-hoc manner. Regarding database structure, involving a DBA in anything more than just approving the incremental scripts is risky, as it goes against the idea of repeatability.
      >
      > Anyway, if you are interested in tools that generate deltas, there are a number of offerings out there, including SQL Compare, DB Ghost (for SQL Server), Microsoft's DB Dude, etc.
      >
      > However, I believe the way of synchronizing the development and production databases that these tools promote is exactly the ad-hoc way of deploying a database that leads to all the troubles evolutionary DB development tries to avoid.
      >
      > Regards,
      > Milen Kovachev
      >
      > --- In agileDatabases@yahoogroups.com, "timander37" <timander@> wrote:
      > >
      > > Hello and welcome.
      > >
      > > Why are you making this assumption? If you add three columns and then later drop one, it doesn't matter. There is no need for DBA intervention to create an alternative migration; it is more work and more risk to maintain more than one migration path. You should always apply database changes with the same migration scripts. That way you don't need to worry about multiple environments, because every database gets exactly the same migrations.
      > >
      > > What we do is a sort of "poor-man's migration": each script inserts a row into a table called "SchemaVersion" so we can see which incremental version the database is currently at. We have a JUnit test to make sure we don't forget to add that insert to each script.
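      > >
      > > The tail of every script looks something like this (a sketch; the column names and version number are hypothetical):
      > >
      > >     -- ... the schema changes for this increment ...
      > >
      > >     -- last statement: record that this increment was applied
      > >     INSERT INTO SchemaVersion (version, applied_on)
      > >     VALUES (42, CURRENT_TIMESTAMP);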
      > >
      > > Every so often, we "re-base" our baseline database to match what is currently in production. Our DBA extracts the DDL, and we replace all of the incremental scripts that have already been applied to production with a new baseline. It basically squishes all of the changes that have made it to production into a single baseline script. (This eventually cleans up all of the "three column/two column" business.)
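      > >
      > > Sketched with hypothetical names and version numbers, a re-base turns increments 1 through 42 into a single baseline:
      > >
      > >     -- baseline_v42.sql: full DDL extracted from production,
      > >     -- replacing incremental scripts 1..42
      > >     CREATE TABLE customer (
      > >         id    INTEGER      NOT NULL PRIMARY KEY,
      > >         name  VARCHAR(100) NOT NULL,
      > >         phone VARCHAR(20)
      > >     );
      > >
      > >     -- a freshly built database starts at the re-based version
      > >     INSERT INTO SchemaVersion (version, applied_on)
      > >     VALUES (42, CURRENT_TIMESTAMP);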
      > >
      > > Hope this helps.
      > > Tim Andersen
      > >
      > >
      > >
      > > --- In agileDatabases@yahoogroups.com, "tnabil76" <tarekmnabil@> wrote:
      > > >
      > > > Hi,
      > > >
      > > > I came to know about this group from Martin Fowler's article about "Evolutionary Database Design".
      > > >
      > > > I have been looking into this subject lately, as I have found that it's become a pain point for us, especially since we've been developing multiple features in parallel on the same system.
      > > >
      > > > I found the article above to be quite useful, although I'd have loved to hear more about the concept of "lineages" and how to manage them across multiple environments, e.g. SIT, UAT, etc.
      > > >
      > > > I have a question, though, about the database refactoring part and the idea of accumulating incremental scripts. Before reading the article, I had come to the conclusion that this approach is far from perfect. The reason is that if one incremental script is partially rolled back by a subsequent one, e.g. someone adds three columns to a table and it is later decided to remove one of them, it becomes the responsibility of the DBA to manually come up with the final incremental script that should actually be executed on the testing environment.
      > > >
      > > > I assume that when moving to production it's not a good idea to actually add the column and then remove it; rather, it should be dropped from the original script altogether.
      > > >
      > > > I had thought that using a tool would be better: use a tool which can store a version of the schema/database, and then, after the iteration/release is complete, ask the tool to generate the delta script.
      > > >
      > > > What do you think about this approach? Have you faced the issue above and how did you solve it?
      > > >
      > > > Your feedback is appreciated.
      > > >
      > > > Best regards.
      > > >
      > >
      >
    • jenswm84
      Message 34 of 34, Jun 30, 2009
        Hi,

        first of all, this is an interesting thread. I'm writing my bachelor's thesis on exactly this kind of scenario.

        I'm actually trying to convince the core dev/DBA team that the incremental versioning approach has more benefits than drawbacks.

        The main problem in our discussions is, as you said in your first post, the execution time of the migration when there are many unoptimized changes. If you add a column to the same table in three different steps and then remove one of them in another step, the "diff" approach is much faster, because the resulting change script adds the two surviving columns in a single step instead of running three or four separate statements.

        So I think that, at the pure schema level, the "diff" approach seems to be better for our BIG tables.
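
        To make that concrete (table and column names are hypothetical, and the multi-column ALTER syntax varies by vendor):

            -- changelog way: four statements, replayed as they happened
            ALTER TABLE t ADD a INT;
            ALTER TABLE t ADD b INT;
            ALTER TABLE t ADD c INT;
            ALTER TABLE t DROP COLUMN b;

            -- diff way: one generated statement with the net result
            ALTER TABLE t ADD a INT, ADD c INT;

        On a big table, each of the four replayed statements may touch the table again, and that is where the execution time goes.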

        But a migration is not only about changing schema information. You also want to include complex conversion functions, like converting the content of one column in the OLD schema into two columns in the NEW one. In the "diff" world you have to write a custom conversion for each patch: the migration from 1.1 to 1.3 differs from the one from 1.2 to 1.3, so there is no 100% repeatability.
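
        For example, a sketch of such a conversion (names are hypothetical, string functions vary by vendor, and it assumes every full_name contains exactly one space):

            ALTER TABLE person ADD first_name VARCHAR(50);
            ALTER TABLE person ADD last_name  VARCHAR(50);

            UPDATE person
            SET first_name = SUBSTRING(full_name, 1, POSITION(' ' IN full_name) - 1),
                last_name  = SUBSTRING(full_name, POSITION(' ' IN full_name) + 1);

            ALTER TABLE person DROP COLUMN full_name;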

        At the moment I'm thinking about hybrid approaches, but I don't think there is one. Are there ways to optimize one of these two approaches with ideas from the other?

        I have an idea for that. At first I thought it would be too time-consuming to analyse the "changelog" path for optimizations, but now I think it could be feasible. Maybe I could order the changes into a graph-like structure, where I can look at the changed objects from top to bottom and find changes that are potentially optimizable. It isn't essential to eliminate every unoptimized change type; it would already be a great benefit if it took care of the column problem and a few others.

        What do you think?

        Your feedback is appreciated.

        Thank you all for this great thread, with many interesting standpoints on these approaches to migration.

        Best regards.

        Jens

        --- In agileDatabases@yahoogroups.com, "tnabil76" <tarekmnabil@...> wrote:
        >
        > Hi,
        >
        > Thank you all for your responses; I didn't really expect that much feedback.
        >
        > I must say I'm overwhelmed by the consensus on the incremental approach, which I had already completely dismissed. I should clarify that we are currently using the incremental approach, and that my research into alternative approaches is driven by what I felt were its shortcomings.
        >
        > The thing is, I believe there are cases which cannot be dealt with in this fashion. For example, if there was a decision to drop a column and that decision was later canceled, there would be one script to drop the column and another to restore it. If that is done incrementally, how will one be able to restore the data in the restored column?
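        >
        > The only workaround I can see is for the drop script itself to park the data in a side table (a sketch with hypothetical names; CREATE TABLE ... AS is not supported by every vendor), and that only works if you anticipate the reversal:
        >
        >     -- increment N: park the data, then drop the column
        >     CREATE TABLE discount_backup AS SELECT id, discount FROM orders;
        >     ALTER TABLE orders DROP COLUMN discount;
        >
        >     -- increment N+2: the decision is reversed; restore column and data
        >     ALTER TABLE orders ADD discount DECIMAL(5,2);
        >     UPDATE orders
        >     SET discount = (SELECT b.discount FROM discount_backup b
        >                     WHERE b.id = orders.id);
        >     DROP TABLE discount_backup;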
        >
        > I believe if I dig deep, I can come up with numerous other similar examples.
        >
        > Also, even if we assume that the incremental approach guarantees the correctness of the "roll forward" scripts, it doesn't do the same for the "roll back" scripts. If your rollback scripts are not perfect, how can you revert in case one of the changes you made causes problems?
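        >
        > For instance, take a forward script and its hand-written rollback (a hypothetical pair); nothing guarantees the two really mirror each other:
        >
        >     -- 0012_up.sql: roll forward
        >     ALTER TABLE account ADD credit_limit DECIMAL(10,2);
        >
        >     -- 0012_down.sql: roll back
        >     ALTER TABLE account DROP COLUMN credit_limit;
        >
        > And even when the pair is correct, the down script silently discards any data written to credit_limit in the meantime.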
        >
        > Another issue is how to version control the incremental scripts. What we've been doing is to create a sub-folder for every version, containing the incremental scripts for that particular version, e.g. from 0.10.1.1 to 0.10.1.2. I see some issues with this approach:
        > 1. It seems to be in direct contrast with most version control practices where versions should be expressed as tags, not subfolders of the branch.
        > 2. Another problem is that at deployment time you have to manually come up with the list of those scripts, which is error-prone.
        >
        > I feel that this is going to be a long discussion thread, although hopefully quite a useful one.
        >
        > Thank you again for your help.
        >
        >