Re: [xml-doc] XML database vs. file based repository

  • Werner Donné
    Message 1 of 7 , Nov 10, 2006
      Grant,

      My answer to Camille's question has perhaps reduced the issue
      to mere storage and retrieval aspects. Camille did indeed
      mention XML databases, which implies a conversion step if the
      document source is not in XML. In that area I have less
      experience, but if the input is not structured it is bound to
      be difficult. The only experience I have is with a system I
      made where the documents, which are in a particular XML
      vocabulary, are converted to WordML and back on the fly. This
      is possible because the generation of WordML is under control.
      Some "structure" can be introduced through Word styles. It is
      not a great success, however, because Word is only a visual
      tool, i.e. it may transform the document internally in a
      visually compatible way, which easily breaks certain conversions.

      I'm not a FrameMaker user, but I know there is a way to migrate
      from unstructured to structured FrameMaker (http://www.adobe.com/products/framemaker/pdfs/migrationguide.pdf).
      As described in this document, it can be easy or painful depending
      on how consistent the document is. If you have used formatting
      templates, these will be used to propose conversions, a bit like my
      Word example above. Formatting will probably be lost, but you have
      to create new style sheets anyway when you start in XML.

      Regards,

      Werner.

      Grant Hogarth wrote:
      >
      >
      > Werner --
      > I've been trying to decide the same thing as Camille, but after a number
      > of experiments, I'm less sure.
      > I've got a reference document (currently in unstructured FrameMaker)
      > that would seem perfect for XML, but migrating it has proven to be a
      > royal pain. The closest I've gotten to an effective translation has been
      > to hand-code every entry (which is neither time-efficient, nor likely to
      > be error-proof).
      >
      > Suggestions?
      >
      > Grant
      >
      > -----Original Message-----
      > >From: Werner Donné <werner.donne@...>
      > >Sent: Nov 10, 2006 2:02 AM
      > >To: xml-doc@yahoogroups.com
      > >Subject: Re: [xml-doc] XML database vs. file based repository
      > >
      > >Hi Camille,
      > >
      > >Your document sources deserve to be in a database, because
      > >then they are protected by a transactional environment.
      > >While such a database should be equipped with the standard
      > >XML functionality, I doubt it should be an XML database.
      > >A document is rarely composed of XML entities alone.
      > >
      > >With respect to speed it is important to know what you do
      > >with the documents. There is no reason for a database to
      > >be much slower than an ordinary file system, certainly not
      > >in terms of raw I/O. Look-ups of files might be slower, but
      > >this largely depends on the additional business logic that
      > >may be in the system. If you are simply interacting with a
      > >repository in a database through explorer-type navigation,
      > >fetching and saving files, you shouldn't notice much
      > >difference. When accessing thousands of files in a short
      > >time (many look-ups) you might.
      > >
      > >The compatibility of a repository is, in my opinion, not
      > >affected by the medium the documents are stored in. This
      > >should not be visible for client software. The central
      > >compatibility issue is the interface of the repository.
      > >A repository in a database with a WebDAV interface, for
      > >example, is quite compatible.
      > >
      > >Regards,
      > >
      > >Werner.
      > >--
      > >Werner Donné -- Re
      > >Engelbeekstraat 8
      > >B-3300 Tienen
      > >tel: (+32) 486 425803 e-mail: werner.donne@...
      > >====================================
      > >Camille Bégnis wrote:
      > >>
      > >> Bonjour,
      > >>
      > >> with the increased availability of serious XML databases around, one
      > >> might be tempted to store one's documentation sources in such a database
      > >> instead of a traditional file-based repository.
      > >>
      > >> Have any of you already tested this?
      > >>
      > >> What are the pros and cons in terms of speed, accessibility,
      > >> compatibility, etc.?
      > >>
      > >> Thanks for your feedback,
      > >>
      > >> Camille.
      >
      >

      --
      Werner Donné -- Re
      Engelbeekstraat 8
      B-3300 Tienen
      tel: (+32) 486 425803 e-mail: werner.donne@...
    • Camille Bégnis
      Message 2 of 7 , Nov 11, 2006
        Hi,

        and thanks all for your answers. To better focus my question:

        We are in the process of designing an XML CMS from scratch. It is meant
        to be as generic as possible, so there is actually no migration to worry
        about so far.

        Many options arise for the repository:
        - Subversion
        - JSR 170-compliant repository
        - XML database
        - [Your choice here]

        If you were given the choice (within FLOSS), what would you choose?

        Best regards,

        Camille.
      • Michael(tm) Smith
        Message 3 of 7 , Nov 11, 2006
          Camille Bégnis <camille@...>, 2006-11-11 13:16 +0100:

          > We are in the process of designing an XML CMS from scratch. Meant to be
          > as generic as possible

          If you haven't already, you might want to take some time to look
          at XIRUSS-T, the demo ("toy") XML CMS system that Eliot Kimber has
          put together:

          http://xiruss-t.sourceforge.net/

          A toy XML-aware (but otherwise generic and extensible) content
          management system demonstrating how to do sophisticated
          management of versioned hyperdocuments with a focus on issues of
          import and export of compound documents (e.g., XInclude-based).

          The XInclude-based Re-Use Support System, Toy (XIRUSS-T) is an
          experimental system intended to both demonstrate basic
          techniques in managing compound documents and provide a sandbox
          for exploring various content management and link management
          problems and their solutions.
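
To make "export of compound documents" concrete, here is a minimal sketch (not XIRUSS-T's own code) of expanding XInclude references with Python's standard library. The in-memory `REPOSITORY` dictionary, the file names, and the helper names are assumptions made up for illustration; a real system would resolve each href against its versioned storage.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical in-memory "repository": href -> XML source of a component.
REPOSITORY = {
    "intro.xml": "<section><title>Introduction</title></section>",
    "usage.xml": "<section><title>Usage</title></section>",
}

# A compound (master) document that pulls in its parts via XInclude.
MASTER = """\
<book xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="intro.xml"/>
  <xi:include href="usage.xml"/>
</book>"""


def load_from_repository(href, parse, encoding=None):
    """Loader callback for ElementInclude: resolve href against REPOSITORY."""
    if parse != "xml":
        raise ValueError("this sketch only handles parse='xml'")
    return ET.fromstring(REPOSITORY[href])


def export_compound(master_source):
    """Return the master document with all XInclude references expanded."""
    root = ET.fromstring(master_source)
    ElementInclude.include(root, loader=load_from_repository)
    return root


book = export_compound(MASTER)
titles = [title.text for title in book.iter("title")]
print(titles)
```

Keeping the resolution step behind a loader callback means the same export code works whether the components live in files, a database, or a versioned repository.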

          I think he put it together a couple of years ago, but it has perhaps
          since evolved a bit from just being a toy system. Or maybe not.
          Anyway, I think he's been doing some new things with it again this
          summer and fall, and blogging about it:

          http://drmacros-xml-rants.blogspot.com/

          To pull up the individual entries, search his blog for the strings
          "XCMTDMW" (="XML Content Management the Dr. Macro Way") and
          "XIRUSS-T" (btw, does blogger.com not support real tags?)

          Reading it all ought to keep you busy for a while :)

          --Mike

          We will call you XIRUSS-T
          The god of Reuse you shall be
        • Werner Donné
          Message 4 of 7 , Nov 13, 2006
            Hi Camille,

            I would recommend a WebDAV-compliant repository, but one that
            also implements DeltaV (the versioning extension) and access
            control lists. If you standardise on a protocol, you have more
            freedom in choosing the technology for the software that
            interacts with the repository.

            Best regards,

            Werner.
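
As an illustration of the point about standardising on a protocol, the following minimal sketch shows a client listing a collection with a WebDAV PROPFIND request, using only Python's standard library. The repository URL and the sample multistatus response are assumptions invented for the example, and the request is only built, not sent; any WebDAV-compliant server could sit behind it.

```python
import urllib.request
import xml.etree.ElementTree as ET

DAV = "{DAV:}"  # WebDAV XML namespace in ElementTree notation

# Ask for two standard WebDAV properties of each resource.
PROPFIND_BODY = (
    '<?xml version="1.0" encoding="utf-8"?>'
    '<D:propfind xmlns:D="DAV:"><D:prop>'
    "<D:displayname/><D:getlastmodified/>"
    "</D:prop></D:propfind>"
)


def build_propfind(url, depth="1"):
    """Build (but do not send) a PROPFIND request for a collection."""
    return urllib.request.Request(
        url,
        data=PROPFIND_BODY.encode("utf-8"),
        method="PROPFIND",
        headers={
            "Depth": depth,  # "1" = the collection and its direct members
            "Content-Type": "application/xml; charset=utf-8",
        },
    )


def parse_multistatus(xml_text):
    """Return the href of every resource in a 207 Multi-Status body."""
    root = ET.fromstring(xml_text)
    return [resp.findtext(DAV + "href") for resp in root.iter(DAV + "response")]


# Hypothetical server reply, shortened to the parts this sketch reads.
SAMPLE_RESPONSE = """\
<D:multistatus xmlns:D="DAV:">
  <D:response><D:href>/docs/</D:href></D:response>
  <D:response><D:href>/docs/guide.xml</D:href></D:response>
</D:multistatus>"""

# The host name below is a placeholder, not a real repository.
request = build_propfind("http://repository.example/docs/")
hrefs = parse_multistatus(SAMPLE_RESPONSE)
print(request.get_method(), hrefs)
```

Nothing in the client depends on whether the server keeps its documents in a file system, a relational database, or an XML database.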

