Loading ...
Sorry, an error occurred while loading the content.

Re: [extremeperl] refactoring databases

Expand Messages
  • Johan Lindstrom
    ... This is just my experience at $work. But you should know that we re not very mature on process at all, and currently there is no way to guarantee that
    Message 1 of 11 , Feb 18, 2005
    • 0 Attachment
      At 18:37 2005-02-18, Rob Nagler wrote:
      > > In my experience, doing this automatically is kind of dangerous.
      >
      >Please elaborate why this type of automation is dangerous? We've had

      This is just my experience at $work. But you should know that we're not
      very mature on process at all, and currently there is no way to guarantee
      that there have been no fiddling. That's the current goal, to ensure we're
      at a known state before each upgrade.

      It's not quite best practice, but I guess a lot of shops are in a somewhat
      similar situation, especially if they have a tradition of strong operations
      departments with strong dba:s.


      > > Sometimes one can prepare the migration and just do the most critical part
      > > during the downtime.
      >
      >We try to do this, too.

      Do you automate these steps also?


      > > Of course this depends on the size of the shop and how well the sysadmins
      > > communicate with the developers and how well versed the developers are
      > with
      > > db-related concepts.
      >
      >This is why it is extremely important for developers to be dbas and
      >sysadmins, too. They have to know that problems can occur at all

      Agree 100%. It's a simple fact that all developers need to know their
      tools. And a db is a tool being used similar to a compiler. It's not a
      black box where there's... uh, data or something.


      >* Universally unique sequence ids. This is very useful for debugging,
      > and scalability.

      Does that mean you don't share identities between tables at all? Just one
      global sequence (or whatever)?


      /J

      -------- ------ ---- --- -- -- -- - - - - -
      Johan Lindström Sourcerer @ Boss Casinos johanl AT DarSerMan.com

      Latest bookmark: "TCP Connection Passing"
      http://tcpcp.sourceforge.net/
      dmoz: /Computers/Programming/Languages/JavaScript/ 12
    • Matthew Albright
      ... Yeah, we re on postgresql. ... We are actually shipping a software product, and we make the new releases available to our customers, but they don t have to
      Message 2 of 11 , Feb 18, 2005
      • 0 Attachment
        --- Rob Nagler <nagler@...> wrote:
        > This doesn't work with Oracle as mentioned by Johan below.

        Yeah, we're on postgresql.

        > This is simply not an option we offer our customers. We could do
        > this, I guess, but I don't believe in multiple versions in the field
        > running wild. At most two: test and prod.

        We are actually shipping a software product, and we make the new releases
        available to our customers, but they don't have to upgrade every time we
        release. We're an XP shop, so we release often, which means we have a wide
        variety of versions in the field.

        > This is why it is extremely important for developers to be dbas and
        > sysadmins, too.

        I couldn't agree more with this. When whoever it was started talking about
        DBA's changing the database structure, my eyes glazed over, because I've never
        been in an organization that had a DBA. We have people who are stronger and
        weaker in certain areas, but really, anyone can modify anything... that's what
        pair programming is for.

        > > If anyone is interested, I can share some additional thoughts on
        > > data migration landmines to watch out for...
        >
        > Please do.

        Like most relational database-backed applications, we've coded up a
        object-relational mapping layer that follows the "Active Record" pattern
        (http://www.martinfowler.com/eaaCatalog/activeRecord.html). The problems come
        in when you call too much of that mapping code from your update_db.pl script.
        At the time the update_db.pl code is run, it's running against the newest
        production code. If your migration code calls methods in the "real" code, that
        code may change, and your migration code might cease to work in the future.

        To avoid this rather subtle problem (which I suspect I've not explained very
        lucidly), we do 2 things:

        - Avoid the use of "real" code in the migration code, even if it means
        duplication
        - Test to make sure it is possible to upgrade from the oldest customer
        version to the new version

        Sometimes, the use of real code is unavoidable, and we've had to reorder things
        in the update_db.pl file... that's something to be avoided.

        matt
      • Adrian Howard
        On 19 Feb 2005, at 00:50, Matthew Albright wrote: [snip] ... [snip] ... Amen. ... Just to clarify - I was talking about a /client/ DBA who tweaked the schema
        Message 3 of 11 , Feb 19, 2005
        • 0 Attachment
          On 19 Feb 2005, at 00:50, Matthew Albright wrote:
          [snip]
          > --- Rob Nagler <nagler@...> wrote:
          [snip]
          >> This is why it is extremely important for developers to be dbas and
          >> sysadmins, too.
          >
          > I couldn't agree more with this.

          Amen.

          > When whoever it was started talking about
          > DBA's changing the database structure, my eyes glazed over, because
          > I've never
          > been in an organization that had a DBA.

          Just to clarify - I was talking about a /client/ DBA who tweaked the
          schema of software that we shipped to them - not the DBA of the
          development team.

          Not that we had anybody with that title.

          (although we did have an Oracle goddess who got paired with a heck of a
          lot when it came to the DB code :-)

          Yes this was (from one point of view) the clients "fault" - but if
          you're going to ship them automated scripts to update there database
          having an automated double-check that you're really updating the right
          schema is a nice thing.

          Adrian
        • Rob Nagler
          ... Yes, we try our best to automate everything. If we think there is going to be multiple upgrades, one of which is critical, we ll organize these in
          Message 4 of 11 , Feb 19, 2005
          • 0 Attachment
            Johan Lindstrom writes:
            > > > Sometimes one can prepare the migration and just do the most critical part
            > > > during the downtime.
            > >
            > >We try to do this, too.
            >
            > Do you automate these steps also?

            Yes, we try our best to automate everything. If we think there is
            going to be multiple upgrades, one of which is critical, we'll
            organize these in separate releases. If this isn't possible, we'll
            make it a two-part upgrade. That script I included in a previous
            message doesn't update the db_upgrade_t if you just call one of the
            methods.

            > Does that mean you don't share identities between tables at all? Just one
            > global sequence (or whatever)?

            Not exactly. The following describes it in detail:

            http://petshop.bivio.biz/src?s=Bivio::Type::PrimaryId

            This the base class of all serial keys. A PrimaryId is a NUMERIC(18)
            (but could be larger) which looks like:

            18 5 3 2 0
            +------------------+-------------+----------+-------+
            | Monotonically | Site Id or | Mistake | Type |
            | Increasing Seq | Other info | Bit | |
            +------------------+-------------+----------+-------+

            The numbers above are digit index. Type is a number that uniquely
            identifies the key type. This divides the name space. The Site Id
            allows you to divide the name space across independent databases. The
            Mistake Bit allows you to make mistakes. :-) We used it in one case
            to transform one key into another without changing the type It saved
            a lookup, and was related to finding the start of a tree for Oracle's
            CONNECT BY. Now they we are dumping CONNECT BY, we have no need for
            it. But that's another discussion.

            Here's how we initialize the sequences:

            CREATE sequence user_s
            MINVALUE 100001
            CACHE 1 INCREMENT BY 100000
            /

            CREATE SEQUENCE club_s
            MINVALUE 100002
            CACHE 1 INCREMENT BY 100000
            /

            CREATE SEQUENCE ec_payment_s
            MINVALUE 100015
            CACHE 1 INCREMENT BY 100000
            /

            CREATE sequence bulletin_s
            MINVALUE 100016
            CACHE 1 INCREMENT BY 100000
            /

            As you can see the spaces are unique, but overlapping. Some people
            might complain it junks up the indexes, but we haven't seen that to be
            a problem. Once the tables get big enough, it really doesn't matter.

            Rob
          • Rob Nagler
            ... Perhaps you should pay them to upgrade. :-) Yes, you ve got a logistical problem, and it sounds like you have a good solution for it. ... Interesting you
            Message 5 of 11 , Feb 19, 2005
            • 0 Attachment
              Matthew Albright writes:
              > We are actually shipping a software product, and we make the new releases
              > available to our customers, but they don't have to upgrade every time we
              > release. We're an XP shop, so we release often, which means we have a wide
              > variety of versions in the field.

              Perhaps you should pay them to upgrade. :-) Yes, you've got a
              logistical problem, and it sounds like you have a good solution for
              it.

              > been in an organization that had a DBA. We have people who are stronger and
              > weaker in certain areas, but really, anyone can modify anything... that's what
              > pair programming is for.

              Interesting you should mention that. I've been a bit turned off to
              "general pair programming" lately. We pair for the same reason. One
              of the things about pair programming that I don't like is that when
              people are alone, they learn from their mistakes at their own speed.
              If they are with another programmer, they not only get the problem
              solved more quickly, but they also don't necessarily assimilate the
              thought process and tools the other programmer brins to the table.
              Experience caused by thousands of wrong paths is a great guide.
              Finding the path right away isn't always the best way. It's certainly
              my experience with kids that they learn much better when they burn
              their hands than when you tell them the iron is hot.

              > Like most relational database-backed applications, we've coded up a
              > object-relational mapping layer that follows the "Active Record" pattern
              > (http://www.martinfowler.com/eaaCatalog/activeRecord.html). The problems come
              > in when you call too much of that mapping code from your update_db.pl script.
              > At the time the update_db.pl code is run, it's running against the newest
              > production code. If your migration code calls methods in the "real" code, that
              > code may change, and your migration code might cease to work in the future.

              Excellent point. We see this sometimes when trying to do a current
              upgrade. I would hate to solve this problem for N versions in the
              past.

              > To avoid this rather subtle problem (which I suspect I've not explained very
              > lucidly), we do 2 things:
              >
              > - Avoid the use of "real" code in the migration code, even if it means
              > duplication
              > - Test to make sure it is possible to upgrade from the oldest customer
              > version to the new version
              >
              > Sometimes, the use of real code is unavoidable, and we've had to reorder things
              > in the update_db.pl file... that's something to be avoided.

              Great solution. Seems like you have to do a lot of feature-based
              tests. For example, you might add a feature, fix it, then remove it
              (because it wasn't the right solution in the first place). Those
              three upgrades are a no-op after the "remove it" upgrade.

              One issue I don't see how you get around is the collection of new
              types of data. You can never assume that a new data type of data
              exists in the database. This means you probably can't retire
              "deprecated" code that works around the non-existence of the new type
              of data. This may create lots and lots of paths to test. Ugh...

              Rob
            • Rob Nagler
              ... Yikes. Remember there s a clear separation of concerns between the customer and the programmer. Be sure to yell at your customer. ;-) ... Yup. I wonder
              Message 6 of 11 , Feb 19, 2005
              • 0 Attachment
                Adrian Howard writes:
                > Just to clarify - I was talking about a /client/ DBA who tweaked the
                > schema of software that we shipped to them - not the DBA of the
                > development team.

                Yikes. Remember there's a clear separation of concerns between the
                customer and the programmer. Be sure to yell at your customer. ;-)

                > Yes this was (from one point of view) the clients "fault" - but if
                > you're going to ship them automated scripts to update there database
                > having an automated double-check that you're really updating the right
                > schema is a nice thing.

                Yup. I wonder though if you don't also have to protect the data?
                Usually apps trust the data that comes from the database. In your
                case, that wouldn't be the case. Do you have extra validation to make
                sure the data hasn't been tampered with?

                Rob
              Your message has been successfully submitted and would be delivered to recipients shortly.