Loading ...
Sorry, an error occurred while loading the content.
 

Re: Flexible DB Designs

Expand Messages
  • Dawn Wolthuis <dwolt@tincat-group.com>
    There is a database environment that has been around since the 60 s and is still going strong, but is rarely spoken of -- it flies below the radar of the
    Message 1 of 13 , Dec 28, 2002
      There is a database environment that has been around since the 60's
      and is still going strong, but is rarely spoken of -- it flies below
      the radar of the computer industry press, for the most part. [In
      fact, I might be breaking the rules by mentioning it in writing ;-) ]

      The MultiValue database environment, aka "Pick" looks very much like
      XML documents, except that the delimiters are funky characters
      rather than metadata. This type of database, often referred to
      as "legacy" is amazingly agile and has stayed around throughout the
      relational database revolution (often referring to itself as
      relational in order to be with the "in" crowd) because developers
      love it for its flexibility.

      A "Pick" database has all of its elements as variable length, so
      there is no required change to the database if there is a need to
      change the length of a field. I tried asking a relational database
      evangelist why RDBMS's require specifying the size of a field --
      what makes size such an important constraint for stored data? He
      tried to defend it, but really -- it isn't!! It was easy for the
      RDBMS's to be told what length a field should be and constrain the
      developer in that way, suggesting this was also leading to better
      stored data, but it was done for the sake of the database, not the
      application developer.

      Also, and here is where the RDBMS folks often get really huffy,
      there is no fixed constraint on the cardinality of an element in a
      MultiValue database either. A person used to have at most one phone
      number or one car and now they can have many? No need to alter the
      database in a big way, just add a redefinition of the element, or
      change the existing definition to say that there is now a list in
      the element. Yes, that means the data is not in first normal form --
      so what? 1NF was developed simply to simplify the mathematics of
      the resulting relations and it turns out that making the relations
      more complex -- permitting elements that are themselves relations,
      for example (or other objects for that matter), makes the database
      more agile. This also results in implied constraints that are
      important, but typically left out of RDBMS's -- constraints that
      identify a child table in a hierarchical relationship with a parent
      table (hierarchies are permitted now with SQL-99 but this is irks
      many RDMBS folks, it seems).

      There are many other reasons to argue that a MultiValue database,
      such as U2 -- Universe or UniData from IBM, D3 from Raining Data,
      Revelation from Revelation, jBASE from jBASE, and several others are
      much better for use with agile development approaches than RDBMS's
      that insist on coding a bunch of constraints on the data that often
      later need to be reworked at a significant cost.

      There just might be a reason that the Pick environment is a legacy
      and "legacy" might not be a bad word in many situations -- at least
      that is my two cents. --dawn

      Dawn M. Wolthuis
      www.tincat-group.com

      --- In agileDatabases@yahoogroups.com, "mlovell98133" <m98133@h...>
      wrote:
      > I am interested in thoughts on building a flexible database schema
      so
      > that the database does not need to be restructured as the
      > applications which use the database grow and develop.
      >
      > For instance, using property tables (with rows for a key, property
      > name, and property value), rather than using a separate column for
      > each property. This way as applications require more attributes
      for
      > entities, they can just add them to the database without
      > restructuring. It would also make the database backward compatible.
      >
      > What other techniques can be used to make a database flexible?
      >
      > Is it worth setting things up flexibly, or is it better to just
      keep
      > rebuilding the database with new schemas? How do performance
      > requirements effect the decision? What else effects the decision?
      >
      > Thanks
    • jhrothjr <yahoogroups@jhrothjr.com>
      ... [delete paen to Pick - not that I ve got anything against it...] The trouble with modern data base technology is that it was developed before people
      Message 2 of 13 , Dec 29, 2002
        --- In agileDatabases@yahoogroups.com, "Dawn Wolthuis <dwolt@t...>" <dwolt@t...> wrote:

        [delete paen to Pick - not that I've got anything against it...]

        The trouble with modern data base technology is that it was developed before people discovered that coupling was a sin. The philosophy of the corporate data base is a prescription for a system that is very fragile and hard to modify. The more applications that share data, the higher the degree of coupling, and the harder the resulting system is to change.

        This isn't just a problem with data bases, either. One shop I worked in had a universal flat file record format that went through everything they did. When they finally had to change it, they had to change essentially every program in the shop. It took them five years, and I left before they were in a position to make the changes to that record that really did need to be made.

        Unfortunately, we're stuck with the suckers, because they are now "the way it is." Relational data base technology is being taught in the business schools as good. Yuck!

        John Roth


        > --- In agileDatabases@yahoogroups.com, "mlovell98133" <m98133@h...>
        > wrote:
        > > I am interested in thoughts on building a flexible database schema
        > so
        > > that the database does not need to be restructured as the
        > > applications which use the database grow and develop.
        > >
        > > For instance, using property tables (with rows for a key, property
        > > name, and property value), rather than using a separate column for
        > > each property. This way as applications require more attributes
        > for
        > > entities, they can just add them to the database without
        > > restructuring. It would also make the database backward compatible.
        > >
        > > What other techniques can be used to make a database flexible?
        > >
        > > Is it worth setting things up flexibly, or is it better to just
        > keep
        > > rebuilding the database with new schemas? How do performance
        > > requirements effect the decision? What else effects the decision?
        > >
        > > Thanks
      • Scott W. Ambler
        ... From: To: Sent: Sunday, December 29, 2002 8:49 AM Subject: [agileDatabases] Re: Flexible DB
        Message 3 of 13 , Dec 29, 2002
          ----- Original Message -----
          From: <yahoogroups@...>
          To: <agileDatabases@yahoogroups.com>
          Sent: Sunday, December 29, 2002 8:49 AM
          Subject: [agileDatabases] Re: Flexible DB Designs


          > --- In agileDatabases@yahoogroups.com, "Dawn Wolthuis <dwolt@t...>"
          <dwolt@t...> wrote:
          >
          > [delete paen to Pick - not that I've got anything against it...]
          >
          > The trouble with modern data base technology is that it was developed
          before people discovered that coupling was a sin. The philosophy of the
          corporate data base is a prescription for a system that is very fragile and
          hard to modify. The more applications that share data, the higher the degree
          of coupling, and the harder the resulting system is to change.

          Exactly! At www.agiledata.org/essays/relationalDatabases.html#Coupling I
          argue that coupling is your greatest enemy, one that has been relatively
          ignored within the data community (not to mention the concept of
          encapsulation, but don't get me going).

          In that article I also go on to list many of the common ways that a database
          can be involved in coupling.


          >
          > This isn't just a problem with data bases, either. One shop I worked in
          had a universal flat file record format that went through everything they
          did. When they finally had to change it, they had to change essentially
          every program in the shop. It took them five years, and I left before they
          were in a position to make the changes to that record that really did need
          to be made.

          Yes. Coupling is coupling is coupling. The greater the coupling, the
          harder it is to change something.

          >
          > Unfortunately, we're stuck with the suckers, because they are now "the way
          it is." Relational data base technology is being taught in the business
          schools as good. Yuck!

          Yes. RDBs are the primary technology out there and will likely remain so
          for quite some time. Sigh.

          > > --- In agileDatabases@yahoogroups.com, "mlovell98133" <m98133@h...>
          > > wrote:
          > > > I am interested in thoughts on building a flexible database schema
          > > so
          > > > that the database does not need to be restructured as the
          > > > applications which use the database grow and develop.

          You're going to need to change your database schema over time for any
          reasonable sized application. The problem with "generic schemas" is that
          they perform poorly once they grow. However, if you have a small data set
          then they work incredibly well.

          If you're looking for great performance, check out www.prevayler.org.

          If you find that you need to change your schema, check out
          www.agiledata.org/essays/databaseRefactoring.html



          > > >
          > > > For instance, using property tables (with rows for a key, property
          > > > name, and property value), rather than using a separate column for
          > > > each property. This way as applications require more attributes
          > > for
          > > > entities, they can just add them to the database without
          > > > restructuring. It would also make the database backward compatible.
          > > >
          > > > What other techniques can be used to make a database flexible?
          > > >
          > > > Is it worth setting things up flexibly, or is it better to just
          > > keep
          > > > rebuilding the database with new schemas? How do performance
          > > > requirements effect the decision? What else effects the decision?

          Performance is a major issue.

          Your ability to change the schema easily is also an issue.

          Software development always involves trade offs, which is one of the things
          that makes it fun IMHO.

          - Scott
          =====================================================
          Scott W. Ambler scott.ambler@...
          Senior Consultant, Ronin International, Inc.
          www.ronin-intl.com/company/scottAmbler.html

          The Elements of UML Style is out! www.ambysoft.com/elementsUMLStyle.html

          www.agiledata.org * www.agilemodeling.com * www.ambysoft.com *
          www.enterpriseunifiedprocess.info * www.modelingstyle.info *
          www.ronin-intl.com
        • Dawn M. Wolthuis
          Yes, that is a good way to state it -- that the tight coupling inherent in RDBMS applications based on what is deemed to be excellent relational database
          Message 4 of 13 , Dec 29, 2002
            Yes, that is a good way to state it -- that the tight coupling inherent in
            RDBMS applications based on what is deemed to be excellent relational
            database theory contributes to rigid software.

            Database constraints, while seemingly a good idea for ensuring quality data
            structures, are implemented in a way that they are so tightly coupled with
            the data storage that change is very difficult. Some of the chief culprits
            of such constraints are found in even the most primitive of data filing
            systems -- the flat file, where size and cardinality of values are
            constrained just as in a relational database.

            So as not to sound like a Pick/MultiValue evangelist, I'll use XML documents
            as an example of a means of storing data where the cost of changing sizes of
            elements or cardinality is not so high (I suspect). I don't have a lot of
            hands-on experience with XML document-based data storage but perhaps when
            there is a need to make changes to the type of data stored in an element or
            attribute, the schema document could be modified, or a new one added,
            without a need for a lot of other changes to the existing code (?)

            Size of field is just one pain-in-the-neck constraint that bites us when
            changes are needed, but I think it is one worth zeroing in on. With current
            user interfaces typically having fixed length fields for data entry, there
            are still changes required there (so I think it makes more sense not to
            limit the size of data that can be entered into a "screen"). If the UI and
            the database can both change handily under size or cardinality conditions,
            that would go a long way to getting RDBMS applications as agile as
            Pick/MultiValue.

            Having worked with agile software development processes and with both
            RDBMS's and MultiValue databases, there is really a huge difference (HUGE!)
            in the ability to start a project with less of the design completed and be
            able to modify the application handily over time with MultiValue (there was
            no big Y2K concern, for example). There must be other models that also
            address this need -- anyone else have such experiences? --dawn

            Dawn M. Wolthuis
            Tincat Group, Inc.
            www.tincat-group.com

            Take and give some delight today.


            -----Original Message-----
            From: jhrothjr <yahoogroups@...> [mailto:yahoogroups@...]
            Sent: Sunday, December 29, 2002 8:49 AM
            To: agileDatabases@yahoogroups.com
            Subject: [agileDatabases] Re: Flexible DB Designs

            --- In agileDatabases@yahoogroups.com, "Dawn Wolthuis <dwolt@t...>"
            <dwolt@t...> wrote:

            [delete paen to Pick - not that I've got anything against it...]

            The trouble with modern data base technology is that it was developed before
            people discovered that coupling was a sin. The philosophy of the corporate
            data base is a prescription for a system that is very fragile and hard to
            modify. The more applications that share data, the higher the degree of
            coupling, and the harder the resulting system is to change.

            This isn't just a problem with data bases, either. One shop I worked in had
            a universal flat file record format that went through everything they did.
            When they finally had to change it, they had to change essentially every
            program in the shop. It took them five years, and I left before they were in
            a position to make the changes to that record that really did need to be
            made.

            Unfortunately, we're stuck with the suckers, because they are now "the way
            it is." Relational data base technology is being taught in the business
            schools as good. Yuck!

            John Roth
          • patrickdlogan@attbi.com
            ... Flexible structures is part of the coupling issue. Another part is the language used to access and update the data. XML in general is a structuring
            Message 5 of 13 , Dec 30, 2002
              > So as not to sound like a Pick/MultiValue evangelist, I'll use XML
              > documents as an example of a means of storing data where the cost of
              > changing sizes of elements or cardinality is not so high (I
              > suspect). I don't have a lot of hands-on experience with XML
              > document-based data storage but perhaps when there is a need to make
              > changes to the type of data stored in an element or attribute, the
              > schema document could be modified, or a new one added, without a
              > need for a lot of other changes to the existing code (?)

              Flexible structures is part of the coupling issue. Another part is the
              language used to access and update the data. XML in general is a
              structuring language that does not in itself say anything about
              efficient storage mechanisms for access or update, nor does XML in and
              of itself say anything about expressive search or update behavior.

              A language like Lorel...

              http://citeseer.nj.nec.com/abiteboul97lorel.html

              ...is a good example of the ability to "say less" about data in order
              to get to just what is desired. But the other end of the spectrum is
              the results become more ambiguous, i.e. including undesirable data or
              excluding desirable data.

              O/R mapping, object databases, deductive databases, and other
              techniques are also attempts to be more expressive about the behavior
              associated with persistent data without coupling too closely to the
              representation of the data.

              Relational tables are actually more general than hierarchical or
              network databases. The problem really is the languages that are used
              to access and update them. SQL is just too limited. I blame SQL more
              than the table concept per se.

              -Patrick
            • jhrothjr <yahoogroups@jhrothjr.com>
              ... Not exactly. Sharing data among different application is the essence of the coupling issue. If an application s data is completely private then there is no
              Message 6 of 13 , Dec 31, 2002
                --- In agileDatabases@yahoogroups.com, patrickdlogan@a... wrote:
                > > So as not to sound like a Pick/MultiValue evangelist, I'll use XML
                > > documents as an example of a means of storing data where the cost of
                > > changing sizes of elements or cardinality is not so high (I
                > > suspect). I don't have a lot of hands-on experience with XML
                > > document-based data storage but perhaps when there is a need to make
                > > changes to the type of data stored in an element or attribute, the
                > > schema document could be modified, or a new one added, without a
                > > need for a lot of other changes to the existing code (?)
                >
                > Flexible structures is part of the coupling issue. Another part is the
                > language used to access and update the data. XML in general is a
                > structuring language that does not in itself say anything about
                > efficient storage mechanisms for access or update, nor does XML in and
                > of itself say anything about expressive search or update behavior.

                Not exactly. Sharing data among different application is the essence of the coupling issue. If an application's data is completely private then there is no inter-application coupling. That is, if other applications need to make a request to the application, not to the data base server, then you've eliminated the coupling.

                Language mismatch issues are examples of "impedance" mismatches. (I don't like the word, but I don't know of a better one.) The least mismatch is caused by a strategy that simply keeps all the data in objects in the application's process. That's simply not possible in the general case, but where it is, it works very well. (See Prevalyar for an example of how to do interesting things with it.)

                Anything else causes a problem. The original hierarchical and network data bases and the not quite as old relational data bases are outgrowths of systems from the '60s and '70s, and have all of those assumptions built in.

                John Roth
                >
                > A language like Lorel...
                >
                > http://citeseer.nj.nec.com/abiteboul97lorel.html
                >
                > ...is a good example of the ability to "say less" about data in order
                > to get to just what is desired. But the other end of the spectrum is
                > the results become more ambiguous, i.e. including undesirable data or
                > excluding desirable data.
                >
                > O/R mapping, object databases, deductive databases, and other
                > techniques are also attempts to be more expressive about the behavior
                > associated with persistent data without coupling too closely to the
                > representation of the data.
                >
                > Relational tables are actually more general than hierarchical or
                > network databases. The problem really is the languages that are used
                > to access and update them. SQL is just too limited. I blame SQL more
                > than the table concept per se.
                >
                > -Patrick
              • drewivan <drewivan@yahoo.com>
                I have used a variation of they key-value idea successfully in the past. We had a keywords facility that end users could use to mark a set of records. (Then
                Message 7 of 13 , Jan 6, 2003
                  I have used a variation of they key-value idea successfully in the past.

                  We had a "keywords" facility that end users could use to mark a set of
                  records. (Then they'd do a "keywords search" to bring back all the
                  records marked with a given keyword.) We extended the facility to have
                  a special set of "system keywords", which were interpreted specially
                  by the program to accomplish flexible data tasks.

                  Our keywords weren't as sophistocated as key-value pairs though.
                  Basically, they denoted inclusion in a set. So you could, for example,
                  tag a bunch of people in the database as being on your "call today"
                  keyword. The system would just store a pair of integers -- the keyword
                  ID and the tagged record ID.

                  We were sometimes able to use this instead of adding boolean columns
                  to certain tables, which allowed us to extend the schema while
                  maintaining backward comatability. On one hand, it did make it a
                  little tougher to do queries, because you had to know about the
                  existence of this keyword in order to write your query. On the other
                  hand, since so many things were handled in the same way, it actually
                  made some queries a little easier.

                  This also made it easier for end users to write their own queries,
                  because they, too, had access to these system keywords. That's
                  something that may have been a little harder to flexibly implement if
                  all the data were stored in columns.
                • johnc650
                  Don t forget to that today all the data modification is through the APP but tommorrow your app is sold to someone who has their own XYZ module and wants to
                  Message 8 of 13 , Nov 4, 2005
                    Don't forget to that today all the data modification is through
                    the "APP" but tommorrow your app is sold to someone who has their
                    own XYZ module and wants to integrate directly into the database.

                    Hey you say how did the gender field get a "Y" in these records and
                    it's all downhill from there!

                    -jfc-


                    --- In agileDatabases@yahoogroups.com, "Bayley, Alistair"
                    <alistair_bayley@l...> wrote:
                    >
                    > I think that you're going down a slippery slope if you replace your
                    > relational design (with integrity constraints) with a one based on
                    key-value
                    > pairs. Unless you are prepared to write a lot of code to enfore
                    the business
                    > rules, and thus probably end-up writing a rule/constraint engine
                    (and
                    > testing and maintaining it), you will lose the ability to enforce
                    data
                    > integrity. And if you allow all sorts of crap into your database,
                    then you
                    > *get* all sorts of crap in your database.
                    >
                    > I think you're better off retaining a good relational design (with
                    all of
                    > the protection that should give your data), and writing code to
                    migrate your
                    > data when the structures change. This *will* take less of your
                    time then
                    > what you waste trying to fix problems caused by incorrect data in
                    the
                    > database.
                    >
                    > You may have a specific business requirement for key-value pairs
                    (for
                    > example, the users must be able to create their own
                    descriptions/fields to
                    > attach to objects), but I think you should restrict that to just
                    the places
                    > where it is required.
                    >
                    > Key-value pairs may make the database backwards compatible and
                    flexible, but
                    > if your application has no idea how to handle some of the data
                    then you will
                    > have problems. Again, you will also have to build data integrity
                    rules into
                    > your application (and test them) rather than have them in the
                    database (they
                    > should really be in both but if you don't have them in the
                    database them
                    > you're asking for trouble).
                    >
                    >
                    > -----Original Message-----
                    > From: mlovell98133 [mailto:m98133@h...]
                    > Sent: 21 November 2002 01:07
                    > To: agileDatabases@yahoogroups.com
                    > Subject: [agileDatabases] Flexible DB Designs
                    >
                    >
                    > I am interested in thoughts on building a flexible database schema
                    so
                    > that the database does not need to be restructured as the
                    > applications which use the database grow and develop.
                    >
                    > For instance, using property tables (with rows for a key, property
                    > name, and property value), rather than using a separate column for
                    > each property. This way as applications require more attributes
                    for
                    > entities, they can just add them to the database without
                    > restructuring. It would also make the database backward compatible.
                    >
                    > What other techniques can be used to make a database flexible?
                    >
                    > Is it worth setting things up flexibly, or is it better to just
                    keep
                    > rebuilding the database with new schemas? How do performance
                    > requirements effect the decision? What else effects the decision?
                    >
                    > Thanks
                    >
                    >
                    > *****************************************************************
                    > The information in this email and in any attachments is
                    > confidential and intended solely for the attention and use
                    > of the named addressee(s). This information may be
                    > subject to legal professional or other privilege or may
                    > otherwise be protected by work product immunity or other
                    > legal rules. It must not be disclosed to any person without
                    > our authority.
                    >
                    > If you are not the intended recipient, or a person
                    > responsible for delivering it to the intended recipient, you
                    > are not authorised to and must not disclose, copy,
                    > distribute, or retain this message or any part of it.
                    > *****************************************************************
                    >
                  • Scott W. Ambler
                    ... Yes, which might be why you want a test suite for your database. Got one of those? Or, do you just have a bunch of models which the developers of XYZ will
                    Message 9 of 13 , Nov 4, 2005
                      At 01:14 PM 11/4/2005, you wrote:
                      >Don't forget to that today all the data modification is through
                      >the "APP" but tommorrow your app is sold to someone who has their
                      >own XYZ module and wants to integrate directly into the database.
                      >
                      >Hey you say how did the gender field get a "Y" in these records and
                      >it's all downhill from there!


                      Yes, which might be why you want a test suite for your database. Got
                      one of those?

                      Or, do you just have a bunch of models which the developers of XYZ
                      will likely never look at because they find the data group too slow
                      to respond to their requests for help?

                      - Scott

                      ====================================================
                      Scott W. Ambler
                      Senior Consultant, Ambysoft Inc.
                      www.ambysoft.com/scottAmbler.html

                      www.agiledata.org -:- www.agilemodeling.com -:- www.ambysoft.com
                      -:- www.databaserefactoring.com -:- www.enterpriseunifiedprocess.com
                    Your message has been successfully submitted and would be delivered to recipients shortly.