Size Limitations

  • Jo Fletcher
    Message 1 of 2, Mar 6, 2005
      Hello,

      In the web service project I'm working on at the moment there's the
      potential for some large XML documents to be generated, say, up to 5MB. My
      role is to take these documents and write the contents to a DB2 database on
      an AS/400, for which I've been using XML-DBMS version 1. But, since we're
      still in development, I haven't had one of the monster documents to deal
      with yet.

      To cover my backside, I've also been looking into the DB2 XML extender. But,
      if I'm reading the Redbooks and other documentation correctly, the XML
      extender has some reasonably small size limits, such as a payload of no more
      than 1MB and a DAD file of no more than 100K.

      I've read elsewhere in the group that XML-DBMS can come up against
      out-of-memory errors. Would the sort of documents I'm looking at cause this
      sort of problem?

      Craig Caulfield.
    • Ronald Bourret
      Message 2 of 2, Mar 6, 2005
        You haven't really said what kind of documents you are using, but large
        documents I have heard of seem to fall into three categories:

        1) Repeating data. For example, you have thousands or millions of sales
        orders, or astronomical data, in the same document. In this case, there
        is generally a wrapper element around what is really a set of separate
        documents or "rows" of data.

        2) Related data. In this case, you simply have a huge amount of related
        data. I have heard of financial transactions that require multiple MBs
        of XML because of all the contextual information that must be
        transmitted to process the actual transaction.

        3) Documents. It is possible for documents (such as books) to be as
        large as 5 MB. However, for this to happen, the documents would probably
        need to include graphics encoded as Base64 -- 5 MB is a *lot* of text.

        Because XML-DBMS uses DOM trees to represent XML documents, it has size
        limitations. DOM trees are kept in memory and are larger than the
        original document, so large documents can easily exceed available
        memory. I'm not sure that 5 MB documents would cause problems on a
        modern machine, though. Even if the DOM tree is 10 times larger than the
        original document, this is still only 50 MB.
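
The arithmetic above can be checked against the heap that a given JVM is actually allowed to use. The following is only a rough sketch: the class and method names are made up for illustration, and the expansion factor of 10 is the assumed worst case from the paragraph above, not a measured value.

```java
public class DomSizeEstimate {

    // Assumed worst-case ratio of DOM tree size to document size.
    // This is the figure used in the text above, not a measurement.
    static final int ASSUMED_EXPANSION_FACTOR = 10;

    /** Rough estimate of the memory a DOM tree needs for a document of the given size. */
    static long estimateDomBytes(long documentBytes, int expansionFactor) {
        return documentBytes * expansionFactor;
    }

    /** True if the estimated DOM tree is smaller than the given heap limit. */
    static boolean fitsInHeap(long documentBytes, int expansionFactor, long maxHeapBytes) {
        return estimateDomBytes(documentBytes, expansionFactor) < maxHeapBytes;
    }

    public static void main(String[] args) {
        long documentBytes = 5L * 1024 * 1024; // a 5 MB document
        long estimate = estimateDomBytes(documentBytes, ASSUMED_EXPANSION_FACTOR);
        long maxHeap = Runtime.getRuntime().maxMemory(); // heap limit of this JVM

        System.out.println("Estimated DOM size: " + estimate / (1024 * 1024) + " MB");
        System.out.println("Maximum JVM heap:   " + maxHeap / (1024 * 1024) + " MB");
        System.out.println("Fits in heap:       "
                + fitsInHeap(documentBytes, ASSUMED_EXPANSION_FACTOR, maxHeap));
    }
}
```

If the estimate comes out close to the heap limit, the limit can be raised with the JVM's -Xmx option. The heap also has to hold everything else the application needs, so this is an optimistic check at best.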

        In any case, large documents with repeating data (case 1) can be easily
        processed by "cutting" them into separate documents, each of which is
        processed separately. For more information about this, see:

        http://groups.yahoo.com/group/xml-dbms/message/3428

        or section 7.1 of the IBM Redbook "XML for DB2 Information Integration":

        http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg246994.html?Open
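
To give an idea of what "cutting" might look like, here is a minimal sketch that streams through a document and hands each child of the wrapper element to a callback as a small document of its own. It uses the StAX API (javax.xml.stream) from the JDK rather than anything in XML-DBMS, and it is not necessarily the approach described in the references above. The names DocumentCutter and PieceHandler are invented for the example.

```java
import java.io.Reader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.events.XMLEvent;

public class DocumentCutter {

    /** Hypothetical callback, invoked once for each piece cut out of the large document. */
    interface PieceHandler {
        void handle(String pieceXml) throws Exception;
    }

    /**
     * Streams through the input and passes each child of the wrapper (root)
     * element to the handler as a separate XML string. Only one piece is held
     * in memory at a time. Limitation: namespace declarations made on the
     * wrapper element are not copied into the pieces.
     */
    static void cut(Reader in, PieceHandler handler) throws Exception {
        XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
        XMLOutputFactory outputFactory = XMLOutputFactory.newInstance();
        int depth = 0;               // 1 = wrapper element, 2 = one "row"
        StringWriter buffer = null;  // text of the piece currently being copied
        XMLEventWriter writer = null;

        while (reader.hasNext()) {
            XMLEvent event = reader.nextEvent();
            if (event.isStartElement()) {
                depth++;
                if (depth == 2) {
                    // A child of the wrapper starts: begin a new piece.
                    buffer = new StringWriter();
                    writer = outputFactory.createXMLEventWriter(buffer);
                }
            }
            if (writer != null) {
                writer.add(event);   // copy everything that lies inside a piece
            }
            if (event.isEndElement()) {
                if (depth == 2) {
                    // The piece is complete: hand it over and forget it.
                    writer.flush();
                    writer.close();
                    handler.handle(buffer.toString());
                    writer = null;
                    buffer = null;
                }
                depth--;
            }
        }
        reader.close();
    }
}
```

The handler is where each piece would be passed to whatever code transfers a single document to the database, so memory use depends on the size of the largest piece rather than on the size of the whole document.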

        Whether it is possible to process other types of large documents (cases
        2 or 3) depends on whether it is possible to cut such documents into
        separate pieces, each of which can be processed separately. (For
        example, is it acceptable to insert sections of a book separately?) In
        any case, you would probably need custom code to pre-process the
        documents and break them into manageable pieces.

        With respect to map sizes, the code that reads map documents is
        SAX-based, so it has no limitations on document size. This code does,
        however, construct map objects in memory and is therefore limited by
        available memory. That said, I have real trouble envisioning a map that
        would cause memory problems. Anything that big would be associated with
        much larger instance documents, and the real problem would be those
        instance documents.
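
The general pattern (a SAX parser that streams the input while the handler builds objects in memory) can be sketched as follows. This is not the XML-DBMS map reader, and the "mapping" element with its "element" and "table" attributes is a made-up format used purely for illustration. The point is only that the parser itself holds no tree, while the structure built by the handler grows with the number of mappings.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class MapReaderSketch {

    /**
     * Reads a hypothetical map document and returns element-to-table mappings.
     * The parser streams the input, so only the returned map occupies memory.
     */
    static Map<String, String> read(InputSource source) throws Exception {
        final Map<String, String> elementToTable = new LinkedHashMap<>();

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                // "mapping", "element" and "table" are invented names.
                if ("mapping".equals(qName)) {
                    elementToTable.put(atts.getValue("element"), atts.getValue("table"));
                }
            }
        };

        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return elementToTable;
    }
}
```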

        As to the DB2 XML Extender, I can't remember what size limitations it
        has. I thought it was SAX-based, rather than DOM-based, so I'm not sure
        why it has the restrictions you're finding. (That said, I'm not
        surprised by them -- my general impression of the XML Extender was that
        it was not the best piece of software IBM had ever developed ;)

        As a final comment, if you're still in development, I'd suggest
        upgrading to XML-DBMS version 2.0 alpha 3 (if possible). This has been
        around for a couple of years now, is as stable as version 1.x, and has
        lots more features. Should I ever find the time to do a final release of
        version 2.0, you wouldn't be forced to upgrade, as I would continue to
        support alpha 3.

        -- Ron
