In the web service project I'm working on at the moment there's the
potential for some large XML documents to be generated, say, up to 5MB. My
role is to take these documents and write the contents to a DB2 database on
an AS/400, for which I've been using XMLDBMS version 1. But, since we're
still in development, I haven't had one of the monster documents to deal
with yet.
To cover my backside, I've also been looking into the DB2 XML extender. But,
if I'm reading the Redbooks and other documentation correctly, the XML
extender has some reasonably small size limits, such as a payload of no more
than 1MB and a DAD file of no more than 100K.
I've read elsewhere in the group that XMLDBMS can come up against
out-of-memory errors. Would the sort of documents I'm looking at cause this
sort of problem?
- You haven't really said what kind of documents you are using, but large
documents I have heard of seem to fall into three categories:
1) Repeating data. For example, you have thousands or millions of sales
orders, or astronomical readings, in the same document. In this case,
there is generally a wrapper element around what is really a set of
separate documents or "rows" of data.
2) Related data. In this case, you simply have a huge amount of related
data. I have heard of financial transactions that require multiple MBs
of XML because of all the contextual information that must be
transmitted to process the actual transaction.
3) Documents. It is possible for documents (such as books) to be as
large as 5 MB. However, for this to happen, the documents would probably
need to include graphics encoded as Base64 -- 5 MB is a *lot* of text.
Because XML-DBMS uses DOM trees to represent XML documents, it has size
limitations. DOM trees are kept in memory and are larger than the
original document, so large documents can easily exceed available
memory. I'm not sure that 5 MB documents would cause problems on a
modern machine, though. Even if the DOM tree is 10 times larger than the
original document, this is still only 50 MB.
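To make the memory overhead concrete, here is a minimal sketch (generic JAXP code, not XML-DBMS itself) of parsing into a DOM tree. Every element, attribute, and text node becomes a separate Java object held in memory at once, which is why the tree ends up several times the size of the raw XML:

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class DomSizeDemo {
    public static void main(String[] args) throws Exception {
        // A tiny stand-in for a large instance document.
        String xml = "<orders><order id=\"1\"/><order id=\"2\"/></orders>";

        // DocumentBuilder materializes the entire tree before returning;
        // for a 5 MB document, all of it is in memory at this point.
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")));

        NodeList orders = doc.getElementsByTagName("order");
        System.out.println(orders.getLength()); // prints 2
    }
}
```

Nothing here is specific to XML-DBMS; it just shows the whole-tree-in-memory model that causes the size limitation.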
In any case, large documents with repeating data (case 1) can be easily
processed by "cutting" them into separate documents, each of which is
processed separately. For more information about this, see:
or section 7.1 of the IBM Redbook "XML for DB2 Information Integration":
Whether it is possible to process other types of large documents (cases
2 or 3) depends on whether it is possible to cut such documents into
separate pieces, each of which can be processed separately. (For
example, is it acceptable to insert sections of a book separately?)
Either way, you would probably need custom code to pre-process the
documents and break them into manageable pieces.
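As a sketch of what such pre-processing code might look like (the element names `orders` and `order` are hypothetical, and this is generic SAX code, not part of XML-DBMS), a handler can cut a document of repeating rows into separate single-row documents without ever holding the whole document in memory:

```java
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class RowSplitter extends DefaultHandler {
    // Hypothetical name of the repeating "row" element inside the wrapper.
    private static final String ROW = "order";

    final List<String> fragments = new ArrayList<String>();
    private StringBuilder current;  // non-null while inside a row
    private int depth;              // element nesting depth within the row

    public void startElement(String uri, String local, String qName,
                             Attributes atts) {
        if (current == null && qName.equals(ROW)) {
            current = new StringBuilder();
            depth = 0;
        }
        if (current != null) {
            depth++;
            current.append('<').append(qName);
            for (int i = 0; i < atts.getLength(); i++) {
                current.append(' ').append(atts.getQName(i))
                       .append("=\"").append(atts.getValue(i)).append('"');
            }
            current.append('>');
        }
    }

    public void characters(char[] ch, int start, int length) {
        if (current != null) current.append(ch, start, length);
    }

    public void endElement(String uri, String local, String qName) {
        if (current != null) {
            current.append("</").append(qName).append('>');
            if (--depth == 0) {
                fragments.add(current.toString());  // one complete row
                current = null;
            }
        }
    }

    public static void main(String[] args) throws Exception {
        String xml = "<orders><order id=\"1\"><item>a</item></order>"
                   + "<order id=\"2\"><item>b</item></order></orders>";
        RowSplitter splitter = new RowSplitter();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")),
                       splitter);
        // Each fragment is a small, self-contained document that can be
        // handed to XML-DBMS (or anything else) one at a time.
        for (String f : splitter.fragments) System.out.println(f);
    }
}
```

A real splitter would also need to handle namespaces and escape text content, but the shape is the same: buffer only the current row, flush it, and move on.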
With respect to map sizes, the code that reads map documents is
SAX-based, so it has no limitations on document size. This code does,
however, construct map objects in memory and is therefore limited by
available memory. However, I have real trouble envisioning a map that
would cause memory problems. Anything that big would be associated with
much larger instance documents, and the real problem would be those
documents, not the map.
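To illustrate why SAX-based code has no inherent document-size limit (again a generic sketch, not the XML-DBMS map factory): the parser reports events one at a time and retains nothing itself, so memory use depends only on what the handler chooses to keep:

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxCounter extends DefaultHandler {
    int elements;

    public void startElement(String uri, String local, String qName,
                             Attributes atts) {
        // Each callback sees one event; the document itself is never
        // held in memory, only this running count.
        elements++;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical map-like document; element names are invented.
        String xml = "<map><table name=\"Orders\"/>"
                   + "<table name=\"Items\"/></map>";
        SaxCounter counter = new SaxCounter();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml.getBytes("UTF-8")),
                       counter);
        System.out.println(counter.elements); // prints 3
    }
}
```

The map objects XML-DBMS builds from these events still live in memory, which is the remaining (and in practice theoretical) limit described above.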
As to the DB2 XML Extender, I can't remember what size limitations it
has. I thought it was SAX-based, rather than DOM-based, so I'm not sure
why it has the restrictions you're finding. (That said, I'm not
surprised by them -- my general impression of the XML Extender was that
it was not the best piece of software IBM had ever developed ;)
As a final comment, if you're still in development, I'd suggest
upgrading to XML-DBMS version 2.0 alpha 3 (if possible). This has been
around for a couple of years now, is as stable as version 1.x, and has
lots more features. Should I ever find the time to do a final release of
version 2.0, you wouldn't be forced to upgrade, as I would continue to
support alpha 3.
Jo Fletcher wrote:
> In the web service project I'm working on at the moment there's the
> potential for some large XML documents to be generated, say, up to 5MB. My
> role is to take these documents and write the contents to a DB2 database on
> an AS/400, for which I've been using XMLDBMS version 1. But, since we're
> still in development, I haven't had one of the monster documents to deal
> with yet.
> To cover my backside, I've also been looking into the DB2 XML extender. But,
> if I'm reading the Redbooks and other documentation correctly, the XML
> extender has some reasonably small size limits, such as a payload of no more
> than 1MB and a DAD file of no more than 100K.
> I've read elsewhere in the group that XMLDBMS can come up against
> out-of-memory errors. Would the sort of documents I'm looking at cause this
> sort of problem?
> Craig Caulfield.