
Re: [xml-dbms] Did I miss something?

  • Phil Friedman
    Message 1 of 5 , May 3, 2000
      Volunteers? Yes, I'm looking into it. My first pass may be just a flat
      map with no join information.

      Regards, Philip Friedman -- Terralink Software Systems -- 207-772-6500 x101

      On Wed, 03 May 2000 13:35:45 CEST, "Ronald Bourret" <rpbourret@...> may have written:

      |There's no reason this can't be done. In fact, it was planned for the 1.0
      |release -- I simply ran out of time to implement it.
      [snip]
    • Phil Friedman
      Message 2 of 5 , May 26, 2000
        Ron,

        I'm back to thinking about creating maps from databases. This is a
        learning process rather than a commitment to use xmldbms, but
        can you help me a bit more? Can you point to the code in MapFactory_DTD
        or MapFactory_MapDocument that I could start with?

        Here is my understanding: For step 1, loop or traverse the tree of
        database tables to map with JDBC meta data. Would it be correct to think
        of this loop/traverse as the SAX parsing of a map document?

        At "startColumn", construct a Column and ColumnMap.
        At "endTable", construct a Table and TableMap containing the Columns
        and ColumnMaps.

        Check out http://www.javaworld.com/javaworld/jw-01-2000/jw-01-dbxml_p.html
        for the source of this idea.

        For step two, I'm clueless.

        But, why not just write the map XML straight from the database-SAX
        events?

        Regards, Philip Friedman -- Terralink Software Systems -- 207-772-6500
        x101

        On Wed, 03 May 2000 13:35:45 CEST, "Ronald Bourret" <rpbourret@...> may have written:

        |There's no reason this can't be done. In fact, it was planned for the 1.0
        |release -- I simply ran out of time to implement it.
        |
        |The basic algorithm is that you would give the map factory a root table name
        |(or names, if PseudoRoot elements are to be used in the map) and the map
        |factory would generate a map as follows:
        |
        |Tables are mapped as objects (elements). Table names are used as element
        |names.
        |
        |Columns are mapped as properties (PCDATA-only elements or attributes,
        |according to user choice). Column names are used as element/attribute names.
        |(Note that the factory would need to check for collisions between table and
        |column names if PCDATA-only elements are used to store properties.)
        |
        |Primary key / foreign key relationships are followed to find related tables
        |(objects). (I need to think about it a bit, but these relationships might be
        |followed for the root table only when the primary key is in the root table.)
        |
        |This can all be done with the metadata facilities of JDBC. Note that some
        |drivers don't support the discovery of primary key / foreign key
        |relationships, which would obviously limit the generated map.
        |
        |Writing this code involves two steps:
        |
        |1) Writing the code to create the table-oriented map from the database
        |metadata. (MapFactory_DBMS)
        |
        |2) Writing the code to create the element-oriented map from the
        |table-oriented map. (Modifying TempMap.)
        |
        |To understand what I mean here, you need to know that a map can be viewed
        |from an element perspective (that is, how are elements/attributes mapped to
        |tables/columns) and from a table perspective (that is, how are
        |tables/columns mapped to elements/attributes). The mapping language takes an
        |element perspective.
        |
        |Internally, the various map classes provide both perspectives: the element
        |perspective is used by DOMToDBMS, the table perspective is used by
        |DBMSToDOM. The two existing map factories (MapFactory_MapDocument and
        |MapFactory_DTD) both generate the element perspective classes and then let
        |TempMap "invert" these classes to create the table perspective classes. The
        |reverse "inversion" code (table perspective to element perspective) has not
        |yet been written and is therefore listed above as item (2).
        |
        |I might find time to write this code this summer, but I give no guarantees.
        |In the meantime, any volunteers?
        |
        |-- Ron Bourret
        |
        |>Well, I hacked a map by hand, not too difficult to start but I have the
        |>rest of a very large database to map. Is there some reason we can't
        |>generate a default map and DTD from a database?
        |>
        |>Regards, Philip Friedman -- Terralink Software Systems -- 207-772-6500
        |>x101
        |>
        |>| Is there a way to get a simple (same structure as the database?) mapping
        |>| starting with an existing database? It looks like GenerateMap requires
        |>| a DTD, then generates the database schema.
        |>|
        |>| I have the database: source SQL, or MS SQL Server or Sybase Adaptive
        |>| Server Anywhere databases. For my initial purposes an automatically
        |>| generated map would be fine. It would also be a good starting point for
        |>| a more interesting map.
        |
        |To Post a message, send it to: xml-dbms@...
        |
        |To Unsubscribe, send a blank message to: xml-dbms-unsubscribe@...
      • Ronald Bourret
        Message 3 of 5 , May 26, 2000
          This message is rather long, so please bear with me -- you really got me
          thinking...

          Sounds like you've got the basic idea -- when you hit table metadata, create
          a Table and TableMap; when you hit column metadata, create a Column and
          ColumnMap. It sounds like you are only missing a few things:

          1) As you follow foreign key relationships, add this information to the
          relatedTables, parentKeyIsCandidate, etc. fields of TableMap. That is, these
          five fields store information about how to link to the child tables of a
          given table.

          2) You actually create TempTable, TempTableMap, etc. objects rather than
          Table, TableMap, etc. objects. The Temp objects are essentially the same as
          the non-Temp objects except that they use Vectors instead of arrays. (I used
          Vectors in the Temp objects because they have no length limits; this is
          useful when you don't know, for example, how many child tables a given table
          has. I used arrays in the non-Temp objects because they are much faster to
          access. This was based on the assumption that a given Map object would be
          used multiple times and that the extra creation/compile time was therefore
          worthwhile.)

          3) The map factory should accept:

          a) One table name and an optional root element name, or
          b) Multiple table names and a required root element name.

          You need to create a TempRootTableMap for each table name the user passes.
          If the user passes a root element name, this corresponds to the IgnoreRoot
          element in the mapping language and is required if there are multiple root
          tables (which correspond to multiple pseudo-root elements). The root element
          name (if passed) is stored in TempRootTableMap.
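
          The two input shapes in (3) can be checked before any metadata is read.
          A minimal sketch, assuming a hypothetical RootTableRequest holder class
          (the class and its field names are illustrative, not part of XML-DBMS):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical holder for the map factory's inputs; not part of XML-DBMS.
final class RootTableRequest {
    final List<String> tableNames;
    final String rootElementName; // null when no ignorable root element is requested

    RootTableRequest(List<String> tableNames, String rootElementName) {
        if (tableNames == null || tableNames.isEmpty()) {
            throw new IllegalArgumentException("At least one root table name is required");
        }
        // Case (b): several root tables need a common, ignorable root element.
        if (tableNames.size() > 1 && rootElementName == null) {
            throw new IllegalArgumentException(
                "A root element name is required when there are multiple root tables");
        }
        this.tableNames = Collections.unmodifiableList(new ArrayList<>(tableNames));
        this.rootElementName = rootElementName;
    }
}
```

          A factory built this way would then create one TempRootTableMap per entry
          in tableNames and store rootElementName in each of them.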

          ===========

          The pseudo code for the map factory is roughly as follows. Undoubtedly I've
          missed some things:

          start()
          {
             accept table name(s) (also optionally accept catalog and schema names)
             create TempMap object
             processRootTables(table_names)
             call TempMap.createClassMapsFromTableMaps (not yet written)
             call TempMap.convertTemp (already written)
          }

          processRootTables(String[]/Vector table_names)
          {
             for each table name
             {
                create TempRootTableMap
                add root table map to tempmap
                processTable(table_name)
             }
          }

          processTable(table_name)
          {
             create TempTable, TempTableMap
             add temptablemap to tempmap?
             processColumns(temptable, temptablemap)
             processChildTables(temptable, temptablemap)
          }

          processColumns(temptable, temptablemap)
          {
             call DatabaseMetaData.getColumns(temptable.table_name)
             for each row in result set
             {
                create TempColumn, TempColumnMap
                add tempcolumn and tempcolumnmap to temptable and temptablemap
             }
          }

          processChildTables(temptable, temptablemap)
          {
             ResultSet rs
             rs = DatabaseMetaData.getExportedKeys(temptable.table_name)
             processForeignTables(rs, temptable, temptablemap)
             rs = DatabaseMetaData.getImportedKeys(temptable.table_name)
             processForeignTables(rs, temptable, temptablemap)
          }

          processForeignTables(rs, temptable, temptablemap)
          {
             for each row in the result set
             {
                add the primary key/foreign key information to the temptablemap
                call processTable for the child table
             }
          }
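
          The two metadata calls in the pseudo code map onto JDBC roughly as
          follows. This is a sketch only: MetaDataReader and KeyLink are
          hypothetical names, SQLException is wrapped in an unchecked exception
          for brevity, and a real factory would build Temp* objects instead of
          plain lists. The DatabaseMetaData methods and result set column labels
          are the standard JDBC ones.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; its names are illustrative and not part of XML-DBMS.
final class MetaDataReader {

    // One primary key / foreign key column pair reported by the driver.
    static final class KeyLink {
        final String pkTable;
        final String pkColumn;
        final String fkTable;
        final String fkColumn;

        KeyLink(String pkTable, String pkColumn, String fkTable, String fkColumn) {
            this.pkTable = pkTable;
            this.pkColumn = pkColumn;
            this.fkTable = fkTable;
            this.fkColumn = fkColumn;
        }
    }

    // Column names of one table. JDBC treats the table argument as a pattern,
    // so a name containing '_' or '%' may match more than one table.
    static List<String> columnNames(DatabaseMetaData meta, String catalog,
                                    String schema, String table) {
        List<String> names = new ArrayList<>();
        // "%" as the column pattern selects every column of the table.
        try (ResultSet rs = meta.getColumns(catalog, schema, table, "%")) {
            while (rs.next()) {
                names.add(rs.getString("COLUMN_NAME"));
            }
        } catch (SQLException e) {
            throw new IllegalStateException("Cannot read columns of " + table, e);
        }
        return names;
    }

    // Links in which foreign keys of other tables reference this table's primary key.
    static List<KeyLink> exportedKeys(DatabaseMetaData meta, String catalog,
                                      String schema, String table) {
        List<KeyLink> links = new ArrayList<>();
        try (ResultSet rs = meta.getExportedKeys(catalog, schema, table)) {
            while (rs.next()) {
                links.add(new KeyLink(
                    rs.getString("PKTABLE_NAME"), rs.getString("PKCOLUMN_NAME"),
                    rs.getString("FKTABLE_NAME"), rs.getString("FKCOLUMN_NAME")));
            }
        } catch (SQLException e) {
            throw new IllegalStateException("Cannot read exported keys of " + table, e);
        }
        return links;
    }
}
```

          getImportedKeys takes the same three arguments and returns the same
          column labels, so a matching importedKeys method would differ only in
          the call it makes. Passing null for catalog or schema tells the driver
          not to narrow the search by that name.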

          ========

          A quick example:

          Consider the sales order sample shipped with XML-DBMS. This consists of four
          tables:

                Sales
                /   \
            Lines    Customers
              |
            Parts

          The user passes the Sales table name and the Orders ignorable root element
          type to the map factory. The map factory creates a TempRootTableMap for
          Sales (setting the ignorable root to Orders), then creates a TempTable
          and TempTableMap for Sales and TempColumns and TempColumnMaps for the
          columns in Sales. It then discovers the Lines and Customers tables (also the
          Parts table when processing Lines) and creates the Temp* objects for these.
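
          The discovery order in this example can be sketched with an in-memory
          stand-in for the key metadata. Everything here is an assumption made
          for illustration: TableWalker is a hypothetical class, and the links
          map plays the role of the combined getExportedKeys/getImportedKeys
          results, so each link appears in both directions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical walker over table links; not part of XML-DBMS.
final class TableWalker {

    // Returns the tables reachable from rootTable, in the order they are first seen.
    static List<String> discover(String rootTable, Map<String, List<String>> links) {
        Set<String> seen = new LinkedHashSet<>();
        visit(rootTable, links, seen);
        return new ArrayList<>(seen);
    }

    private static void visit(String table, Map<String, List<String>> links, Set<String> seen) {
        // Set.add returns false for a table that was already processed, which
        // stops the walk from following a link back to a table it came from.
        if (!seen.add(table)) {
            return;
        }
        for (String related : links.getOrDefault(table, Collections.<String>emptyList())) {
            visit(related, links, seen);
        }
    }
}
```

          With Sales linked to Lines and Customers, and Lines linked to Parts,
          discover("Sales", links) visits Sales, then Lines, then Parts, then
          Customers. The link from Lines back to Sales is ignored because Sales
          is already in the seen set.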

          ================

          Things I've probably missed:

          1) IMPORTANT! It occurs to me that you might be able to create
          TempClassMaps, TempPropertyMaps, etc. instead of TempTableMaps,
          TempColumnMaps, etc. You should look into this, because it would save you a
          *lot* of time by not having to write TempMap.createClassMapsFromTableMaps
          (point 5 below). It would also make the code much more parallel to the other
          two map factories, which would simplify things greatly. (See point (1) in
          "Other comments", below, for possible strategies.) The only real
          disadvantage that I see would be losing the data type and nullability
          information, which might be useful in the future, such as for generating an
          XML Schema from the Map object.

          2) You need to make sure that the calls to DatabaseMetaData methods use the
          correct catalog and schema, if any.

          3) When processing child tables, you need to check for circular references.
          For example, if A has a child B, B has a child C, and C has a child A, you
          need to be careful that you stop processing when you hit the second A.
          Otherwise, you will have an infinite loop.

          You also need to make sure that, when you process a child table, you don't
          add its parent as a child table. For example, if A has a child B, then
          when you process B, either getExportedKeys or getImportedKeys in
          DatabaseMetaData will return the keys linking A and B again, depending on
          which table exports the key and which imports it.

          4) processForeignTables requires you to have information before it is
          normally created. In particular, when you are adding the key information,
          you need the TempColumn objects for the child table. However, these haven't
          been created yet because you haven't called processTable for the child table
          yet. You should be able to get around this apparent Catch-22 by creating
          them in processForeignTables and, in processColumns, creating them only if
          they don't already exist. For similar code, see addTempClassMap and
          getTempClassMap in TempMap and how these are called from
          MapFactory_MapDocument.

          5) You will need to write TempMap.createClassMapsFromTableMaps. This
          "inverts" the TempTable/Column view of the mapping and creates
          TempClass/Property objects. The basic flow is fairly straight-forward:
          TempTableMaps are converted to TempClassMaps (child table info is converted
          to TempRelatedClassMap and TempLinkInfo), TempRootTableMaps are converted to
          TempRootClassMaps, and TempColumnMaps are converted to TempPropertyMaps.
          However, the actual details tend to get a bit complex and you will
          undoubtedly encounter the same sorts of problems discussed in (4) -- needing
          objects before they exist. For ideas, look at
          TempMap.createTableMapsFromClassMaps.

          6) You will probably need to modify the existing Temp* classes -- adding
          constructors that suit your needs, etc.

          7) I have probably forgotten some of the information that you need to set in
          TempMap. Look at the other two map factories for ideas.

          8) You can either create elements-as-properties or attributes for columns.
          Personally, I prefer elements-as-properties, but there is a real potential
          for name collisions here. (It doesn't exist if you use attributes.) That is,
          if two different tables have the same column name, both will be mapped to
          the same element name. The best thing to do here is probably to check first
          for a name collision, then prefix the table name to the column name if it
          happens.

          9) Although you shouldn't implement it now, you should think about ways that
          the user could specify how deeply foreign key relationships should be
          followed and design your code accordingly. For example, in the XML-DBMS
          sales order example, a user might only want a map that includes the Lines
          table but not the Parts and Customers tables.
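
          The collision check in (8) could look something like the following.
          This is a sketch under stated assumptions: ElementNamer is a
          hypothetical class, the "Table.Column" keys and the underscore
          separator are illustrative choices, and table names are counted too
          because tables are also mapped to element types.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of XML-DBMS.
final class ElementNamer {

    // Maps "Table.Column" to the element name chosen for that column.
    static Map<String, String> propertyElementNames(Map<String, List<String>> columnsByTable) {
        // First pass: count how often each candidate element name is used.
        Map<String, Integer> uses = new HashMap<>();
        for (Map.Entry<String, List<String>> table : columnsByTable.entrySet()) {
            uses.merge(table.getKey(), 1, Integer::sum);
            for (String column : table.getValue()) {
                uses.merge(column, 1, Integer::sum);
            }
        }
        // Second pass: keep unique column names, prefix the table name otherwise.
        Map<String, String> names = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> table : columnsByTable.entrySet()) {
            for (String column : table.getValue()) {
                boolean collides = uses.get(column) > 1;
                String elementName = collides ? table.getKey() + "_" + column : column;
                names.put(table.getKey() + "." + column, elementName);
            }
        }
        return names;
    }
}
```

          The sketch does not check whether a prefixed name collides with some
          other existing name; a real implementation would need a second check
          or a different naming scheme for that case.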

          ===============

          Other comments:

          1) In answer to your question as to why not just directly create the XML
          mapping document, there is no reason you can't. The only drawback to doing
          this is that, if you then want a Map object, you must read the Map back in
          to MapFactory_MapDocument. While this is not a problem for applications like
          GenerateMap, which are simply tools that output a map as a starting point,
          it slows down applications that want to do on-the-fly generation of XML data
          from the database.

          (If you would prefer to create the map document instead of a Map object as
          described above, this would still be very useful code, as it could be used
          to drive a tool such as GenerateMap that would help people get started when
          building XML-DBMS applications.)

          One other point of interest here is whether to build this as a DOM
          application or a SAX parser. To see the difference, consider the Sales example.
          When you encounter the Sales table, you create a ClassMap element for it and
          PropertyMap elements for its columns. You then go looking for its child
          tables. When you discover the Lines table, you need to create a
          RelatedClassMap to link it to the Sales table. You also need to create a
          ClassMap for the Lines table itself.

          In a DOM application, you can do this immediately, placing the ClassMap Node
          at the correct place in the DOM tree. In a SAX Parser, you would need to
          cache the table name for later processing. However, this wouldn't be so bad
          -- all you would do is re-enter the top-level of the application in the same
          way you did for the Sales table. The only difference is that you would
          create a ToClassTable element in the ClassMap instead of a ToRootTable
          element. The process would continue as long as the current set of tables has
          children. (It may or may not help you here to realize that map documents are
          flat -- that is, all tables get a ClassMap element at the same level,
          regardless of the table hierarchy that exists in the database.)

          2) As another aside, the "XML APIs for Databases" article in Java World is
          much simpler than XML-DBMS, but is very limited in what it can do. In
          particular:

          a) It can only extract data from a single table. For example, in the Sales
          example, it would not be able to return information from both the Sales and
          Lines tables. (You could actually do this by overriding the
          getSelectorSQLStatement method, but the resulting data would be flat, not
          nested as one would expect.)

          XML-DBMS, on the other hand, was designed to model any hierarchical set of
          data in the database and return that data in the expected XML hierarchy.
          (Note that the key word here is "expected". XML-DBMS is definitely limited
          in the hierarchy that it can represent, as was pointed out by Ralf
          Schwarzwald's question about the capabilities of the map document.)

          b) It is limited in the mapping it can do -- always generating elements for
          data (never attributes), requiring you to override methods and recompile to
          use different names than are in the database, not supporting ordering of
          elements, never returning mixed content, etc.
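
          Returning to the SAX-style approach in point (1): caching table names
          and re-entering the top level amounts to a work queue. A minimal
          sketch, assuming a hypothetical FlatMapPlanner class and a childTables
          map that stands in for the key metadata. The output strings are plain
          labels for illustration, not real mapping-language syntax.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical planner; not part of XML-DBMS.
final class FlatMapPlanner {

    // Returns one entry per table, all at the same level, as in a map document.
    static List<String> plan(String rootTable, Map<String, List<String>> childTables) {
        List<String> classMaps = new ArrayList<>();
        Set<String> queued = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(rootTable);
        queued.add(rootTable);
        while (!pending.isEmpty()) {
            String table = pending.remove();
            // Only the root table maps to ToRootTable; every other table uses ToClassTable.
            String target = table.equals(rootTable) ? "ToRootTable" : "ToClassTable";
            classMaps.add("ClassMap " + table + " (" + target + ")");
            for (String child : childTables.getOrDefault(table, Collections.<String>emptyList())) {
                // Cache each newly discovered table once for later processing.
                if (queued.add(child)) {
                    pending.add(child);
                }
            }
        }
        return classMaps;
    }
}
```

          For the Sales example this yields entries for Sales, Lines, Customers,
          and Parts in that order, which matches the flat layout of a map
          document regardless of the table hierarchy.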

          Anyway, if you've read this far, I apologize for going on for so long.
          Hopefully I've said something useful on the way.

          -- Ron

          >From: Phil Friedman <pfriedma@...>
          >Reply-To: xml-dbms@egroups.com
          >To: xml-dbms@egroups.com
          >Subject: Re: [xml-dbms] Did I miss something?
          >Date: Fri, 26 May 2000 08:29:23 -0400
          >
          >[snip]

          [Ron's previous reply snipped]