Loading ...
Sorry, an error occurred while loading the content.
 

Re: OT: OO data design and search performance

Expand Messages
  • Clinton Gormley
    ... Thanks Nick That s pretty much what I ve got - a table of objects, a table specifying which attributes each object type has, and a series of tables for
    Message 1 of 12 , Apr 11, 2003
      On Fri, 2003-04-11 at 13:22, Nick Tonkin wrote:
      As another poster has remarked, why not have your attributes in a separate
      table? So long as the ID linking them to the main object is not the primary
      key you should be fine.
      
      Objects:
      ID			Name
      1			foo
      2			bar
      
      Attributes:
      Object_ID	Attribute
      1			blue
      1			squishy
      1			expensive
      2			red
      2			cheap
      
      
      etc. Then to add a new attribute you just add a value to the table rather than
      a new column named after the attribute and yes/no values.
      
      Thanks Nick

      That's pretty much what I've got - a table of objects, a table specifying which attributes each object type has, and a series of tables for attributes of different types (eg lists, text, more complicated structures like address, images etc)

      The only reason I was thinking about denormalising was for performance reasons, but Aaron's reassurances have placated me.

      Clint
    • Martin Moss
      Here s my forpence:- Why Have database Tables Defining which Attributes belong to a Class. Why not Have the attribuites existing in a config section of your
      Message 2 of 12 , Apr 11, 2003
        
        Here's my forpence:-
         
        Why Have database Tables Defining which Attributes belong to a Class.
        Why not Have the attribuites existing in a config section of your class.
         
         
        e.g here's part of a baseclass.
         
        package My::BaseClass;
         
        sub _table_name
        {
          # this accessor is calling-package-relative
          my $obclass = shift;
          my $class   = ref($obclass) || $obclass;
          my $varname = $class . "::_table_name";
          no strict "refs";   # to access package data symbolically
          return $$varname;
        }
         
        sub _table_id
        {
          # this accessor is calling-package-relative
          my $obclass = shift;
          my $class   = ref($obclass) || $obclass;
          my $varname = $class . "::_table_id";
          no strict "refs";   # to access package data symbolically
          return $$varname;
        }
         
        sub _delete
        {
          my $self = shift;
         
          my $table    = $self->_table_name;
          my $table_id = $self->_table_id;
         
          my $stm = qq{DELETE FROM $table WHERE $table_id = ?};
         
          if ($self->{dbh}->do($stm, undef, $self->id))
          {
            return 1;
          }
          else
          {
            return undef;
          }
        }
         
         
        Then create instance classes which use My::BaseClass as a base but define their own configs and
        e.g.:-
        package My::Restaurant
         
        use base (My::BaseClass);
         
        @fields =qw(
                        RestaurantID
                        Name
                        Description
                        vegetarian
                      );
         
        %Required = (
                 Name                 => 'Name',
                 Description          => 'Description',
                 Vegetarian             => 'Is Vegetarian',
                );
         
        $_table_name    = 'Restaurants';
        $_table_id = 'RestaurantID';
         
         
         
        ___END___
         
        This would mean that you have seperate backend tables for each instance class. This would save on database Lookups, and give you the ability to override in the instance classes certain functions in the Base class. Indeed it would allow you to create specific SQL commands in an instance class which could do complex things like JOINS.  Ultimately JOINING tables in SQL is going to be more efficient than Joining data in perl.
         
         
         
        ----- Original Message -----
        Sent: Friday, April 11, 2003 12:55 PM
        Subject: Re: OT: OO data design and search performance

        On Fri, 2003-04-11 at 13:22, Nick Tonkin wrote:
        As another poster has remarked, why not have your attributes in a separate
        table? So long as the ID linking them to the main object is not the primary
        key you should be fine.
        
        Objects:
        ID			Name
        1			foo
        2			bar
        
        Attributes:
        Object_ID	Attribute
        1			blue
        1			squishy
        1			expensive
        2			red
        2			cheap
        
        
        etc. Then to add a new attribute you just add a value to the table rather than
        a new column named after the attribute and yes/no values.
        
        Thanks Nick

        That's pretty much what I've got - a table of objects, a table specifying which attributes each object type has, and a series of tables for attributes of different types (eg lists, text, more complicated structures like address, images etc)

        The only reason I was thinking about denormalising was for performance reasons, but Aaron's reassurances have placated me.

        Clint
      • Perrin Harkins
        ... Flattening essentially caches the work of finding all descendants, so naturally it will be faster. Given that you are already building a search index, I
        Message 3 of 12 , Apr 11, 2003
          On Fri, 2003-04-11 at 06:20, Clinton Gormley wrote:
          > What about this issue : Locations form the first 6 levels of my object
          > tree, linked with parent-child relationships, so
          > World->Continent->Country->Region->City->District. Then (eg) places to
          > stay are linked to Districts.
          >
          > You may want to find all Places to Stay in Syria - would it make sense
          > here to have a flattened hierarchy (purely for searching rather than as
          > the primary storage), or to still dynamically find all descendants of
          > Syria? (Especially given that this filter would be applied with every
          > search)

          Flattening essentially caches the work of finding all descendants, so
          naturally it will be faster. Given that you are already building a
          search index, I would go ahead and add this as well.

          - Perrin
        • Perrin Harkins
          ... The other suggestions offered here sound fine to me, but for a radically different take on indexing while keeping flexibility, see XML::Comma
          Message 4 of 12 , Apr 11, 2003
            On Fri, 2003-04-11 at 05:42, Clinton Gormley wrote:
            > It is clearly possible to build a search which ploughs through all of
            > the tables, but it would be much faster if I had one flat table for (eg)
            > places to stay with all of the attributes in different columns. My
            > original plan was to do just this - use a cron job to rebuild the flat
            > tables when any objects are changed. But this negates the flexibility of
            > the design ie instead of being able to add and remove attributes by
            > altering a single list, I now also need to change the structure of the
            > flat tables and the SQL which accesses them

            The other suggestions offered here sound fine to me, but for a radically
            different take on indexing while keeping flexibility, see XML::Comma
            (http://xml-comma.org/).

            - Perrin
          • Clinton Gormley
            ... This was my original plan, but with very few exceptions, the objects are exactly the same (bar the list of attributes associated with them). I wanted to do
            Message 5 of 12 , Apr 11, 2003
              On Fri, 2003-04-11 at 14:28, Martin Moss wrote:
              
              Here's my forpence:-
               
              Why Have database Tables Defining which Attributes belong to a Class.
              Why not Have the attribuites existing in a config section of your class.

               
               

              This was my original plan, but with very few exceptions, the objects are exactly the same (bar the list of attributes associated with them).

              I wanted to do :
              http://website.com/browse?object_id=123. I suppose I could have loaded the object type from the database, and then blessed that object into the relevant sub-class, but it's been easier to maintain everything through a single table.

              Having said that, there has been the odd "if type==6" lines...

              <OO tail firmly between legs>

              Clint

            • Martin Moss
              Don t sweat it, in my humble opinion the point of perl is to make life easy for oneself. I guess though that in my experience it s allways better to do it
              Message 6 of 12 , Apr 11, 2003
                
                Don't sweat it, in my humble opinion the point of perl is to make life easy for oneself. I guess though that in my experience it's allways better to do it right first time, coz in 4 months time something usually crops up which results in a 'oh $&$%&&&**'.  Like if somebody says right thats it, we're no longer going to support mysql, we're gonna use oracle instead:-) (always a favourite!)
                 
                One point of note, if all objects are so simillar, then surely, you can put your perl hat on to write a script which 'creates' all the code modules you need, you just feed it a Dummy Module template and a conf file containing the class specific stuff.
                 
                Anyway The OO approach would slow down the speed of things, and I guess it all depends on a balance between speed and efficiency against futureproofing and manageability.
                 
                Break a leg mate!
                 
                Marty
                 
                ----- Original Message -----
                Sent: Friday, April 11, 2003 3:11 PM
                Subject: Re: OT: OO data design and search performance

                On Fri, 2003-04-11 at 14:28, Martin Moss wrote:
                
                Here's my forpence:-
                 
                Why Have database Tables Defining which Attributes belong to a Class.
                Why not Have the attribuites existing in a config section of your class.

                 
                 

                This was my original plan, but with very few exceptions, the objects are exactly the same (bar the list of attributes associated with them).

                I wanted to do :
                http://website.com/browse?object_id=123. I suppose I could have loaded the object type from the database, and then blessed that object into the relevant sub-class, but it's been easier to maintain everything through a single table.

                Having said that, there has been the odd "if type==6" lines...

                <OO tail firmly between legs>

                Clint

              • Carl Holm
                I would second this approach. Nothing that you have described about your data modeling task would preclude the use of a fully normalized relational DB. Once
                Message 7 of 12 , Apr 11, 2003
                  I would second this approach. Nothing that you have described about
                  your data modeling task would preclude the
                  use of a fully normalized relational DB. Once you have this built,
                  performance can be improved through the creation of
                  indexes (where applicable) and by optimizing the SQL.

                  Regards,

                  Carl Holm

                  Aaron Trevena wrote:

                  > Clinton Gormley wrote:
                  >
                  >> [ snip]
                  >> So with an advanced search, you may want to specify a place to stay
                  >> which is budget, has air-conditioning and a vegetarian restaurant.
                  >>
                  >> It is clearly possible to build a search which ploughs through all of
                  >> the tables, but it would be much faster if I had one flat table for
                  >> (eg) places to stay with all of the attributes in different columns.
                  >> My original plan was to do just this - use a cron job to rebuild the
                  >> flat tables when any objects are changed. But this negates the
                  >> flexibility of the design ie instead of being able to add and remove
                  >> attributes by altering a single list, I now also need to change the
                  >> structure of the flat tables and the SQL which accesses them
                  >
                  >
                  >
                  > You do not need to denormalise - Your objects should have a many to
                  > many relationship with attributes, like being vegetarian. This many to
                  > many relationship maps to a lookup table containing unique records of
                  > attribute id and object id.
                  >
                  > This has many advantages - extra attributes are easy to add, you have
                  > a normalised data structure and you have a reverse index that requires
                  > only a very very simple join (join attribute table to lookup table and
                  > object table by attribute and lookup id respectively). This join will
                  > make good use of indexes for these tables and be pretty responsive.
                  > the data (or rather the relationship) is only stored in one place and
                  > therefore is always up to date and you don't have to worry about data
                  > being incorrect or out of date.
                  >
                  > You very rarely need to flatten tables - good normalisation means you
                  > can have a decent design and still have queries that perform at a
                  > decent speed.
                  >
                  > regards,
                  >
                  > A.
                  >
                  >
                  >
                Your message has been successfully submitted and would be delivered to recipients shortly.