Re: [xml-doc] best way to express this in XML

  • Paul Tremblay
    Message 1 of 3 , Mar 12, 2003
      Thanks. This is a good suggestion. If I remember correctly, someone on an
      XSLT mailing list was complaining about the type of XML your suggestion
      would generate. His complaint was that his XSLT stylesheet had to look
      up a value somewhere else to make a transformation.

      However, I don't know that such a lookup should be that difficult.
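
      To give a rough idea of how small the lookup can be, here is a minimal
      sketch in Python. The sample document, the <doc> wrapper element, and
      the function names are assumptions made for illustration; only the
      para/pgf layout comes from the quoted suggestion. In XSLT 1.0 the same
      job is normally done with xsl:key and the key() function.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document following the layout in the quoted message.
SAMPLE = """
<doc>
  <para pgf="pgf123">Some text</para>
  <paragraph-formattings>
    <pgf id="pgf123">
      <border side="bottom" shadow="true" hairline="true" width=".5pt"/>
      <border side="left"/>
      <margin side="left" width="10cm"/>
    </pgf>
  </paragraph-formattings>
</doc>
"""


def build_format_index(root):
    """Map each pgf id to its <pgf> element so a lookup is one dict access."""
    return {pgf.get("id"): pgf for pgf in root.iter("pgf")}


def borders_for(para, index):
    """Return {side: attributes} for the format a paragraph references."""
    pgf = index.get(para.get("pgf"))
    if pgf is None:
        return {}  # paragraph points at an unknown format
    return {b.get("side"): dict(b.attrib) for b in pgf.findall("border")}


root = ET.fromstring(SAMPLE)
index = build_format_index(root)
for para in root.iter("para"):
    print(para.text, borders_for(para, index))
```

      The index is built once, so each paragraph costs a single dictionary
      lookup rather than a search through the formatting section.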

      At any rate, your method keeps the XML more readable, and it forces
      certain styles to become structural. In other words, all paragraphs with
      bold letters and with 14 point font would look like:

      <pg123>Text</pg123>

      The tag "pg123" might really transform to <title>. The actual specifics
      of the style (bold, borders, etc.) get subordinated to the content--or at
      least in theory.
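
      As an illustration of that last step, the renaming from generated style
      names to structural names could look something like the sketch below.
      The mapping table and the <doc> and <pg7> elements are hypothetical; a
      real mapping would have to be written by whoever knows what each style
      means in a given document.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from generated style names to structural names.
STYLE_TO_STRUCTURE = {"pg123": "title"}


def restructure(root, mapping):
    """Rename every element whose tag appears in the mapping; leave the rest."""
    for elem in root.iter():
        elem.tag = mapping.get(elem.tag, elem.tag)
    return root


doc = ET.fromstring("<doc><pg123>Text</pg123><pg7>Body</pg7></doc>")
restructure(doc, STYLE_TO_STRUCTURE)
result = ET.tostring(doc, encoding="unicode")
print(result)
```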

      On the other hand, the script will involve a little more work to do it
      your way.
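
      The extra work is mostly bookkeeping: the script has to notice when two
      paragraphs share the same formatting and hand out one id per distinct
      combination. A minimal sketch, assuming the converter has already
      reduced each paragraph to its text plus a dictionary of border
      properties (that input shape, the <doc> wrapper, and the function name
      are all assumptions, not part of the actual script):

```python
import xml.etree.ElementTree as ET


def build_document(paragraphs):
    """Build a document from (text, borders) pairs.

    borders maps a side name to a dict of border attributes. Paragraphs with
    identical borders share one <pgf> definition.
    """
    root = ET.Element("doc")
    formattings = ET.Element("paragraph-formattings")
    ids = {}  # canonical formatting key -> generated id
    for text, borders in paragraphs:
        # Sort sides and attributes so equal formattings produce equal keys.
        key = tuple(sorted(
            (side, tuple(sorted(attrs.items())))
            for side, attrs in borders.items()
        ))
        if key not in ids:
            ids[key] = f"pgf{len(ids) + 1}"
            pgf = ET.SubElement(formattings, "pgf", id=ids[key])
            for side, attrs in key:
                ET.SubElement(pgf, "border", dict(attrs, side=side))
        para = ET.SubElement(root, "para", pgf=ids[key])
        para.text = text
    root.append(formattings)
    return root


sample = [
    ("First", {"bottom": {"hairline": "true", "width": ".5pt"}}),
    ("Second", {}),
    ("Third", {"bottom": {"width": ".5pt", "hairline": "true"}}),
]
print(ET.tostring(build_document(sample), encoding="unicode"))
```

      In this sketch a paragraph with no borders still gets its own (empty)
      definition, which keeps every <para> pointing at a valid id.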

      Thanks

      Paul

      On Sun, Mar 09, 2003 at 11:24:38PM +0300, Oleg A. Paraschenko wrote:
      >
      > Hello!
      >
      > I suggest that you separate content from presentation and express formatting in tags. Instead of
      >
      > <paragraph-definition border-paragraph-bottom="shadowed-border|hairline|line-width:0.5">
      > or
      > <paragraph-definition border-paragraph-bottom-shadowed-border = 'true' border-paragraph-bottom-hairline = 'true' border-paragraph-bottom-line-width = ".5" >
      >
      > use something like
      >
      > ...
      > <para pgf="pgf123">....text content goes here</para>
      > ...
      > <paragraph-formattings>
      > ...
      > <pgf id="pgf123">
      > <border side="bottom" shadow="true" hairline="true" width=".5pt" />
      > <border side="left" />
      > ...
      > <margin side="left" width="10cm" />
      > </pgf>
      > ...
      > </paragraph-formattings>
      >
      > Properties which are expressed as tags are easier to process with XSLT. A lot of elements will make your DTD large, but you can again split it into a content part and a presentation part, and import the presentation part as an entity into the main DTD.
      >
      > Regards, Oleg
      >
      > On Wed, 5 Mar 2003 23:22:39 -0500
      > Paul Tremblay <phthenry@...> wrote:
      >
      > > I am writing a script that converts Microsoft RTF to XML, and would like
      > > to know the best way to express data that deals with borders.
      > >
      > > The problem is that borders in RTF are very redundant. For example, the
      > > top border of a paragraph might be described as having a shadowed
      > > border, a hairline border, and a width of .5.
      > >
      > > Here are the two ways I can express this in my XML:
      > >
      > >
      > > <paragraph-definition border-paragraph-bottom="shadowed-border|hairline|line-width:0.5">
      > >
      > >
      > > or
      > >
      > > <paragraph-definition border-paragraph-bottom-shadowed-border = 'true' border-paragraph-bottom-hairline = 'true' border-paragraph-bottom-line-width = ".5" >
      > >
      > > In the first example I grouped all of the attributes in a single
      > > attribute. My thinking was that there is a function in XSLT that can
      > > split strings. However, after I leafed through the Wrox bible, I
      > > discovered there was no such function.
      > >
      > > The second example seems easier to transform. However, there are
      > > something like 30 different types of border in RTF. There are at least 4
      > > border positions for a paragraph: top, bottom, right, and left. This
      > > means that my DTD has to contain about 120 attributes--just to deal with
      > > borders! I need about 120 more attributes for borders in cells, and 120
      > > more for borders in rows.
      > >
      > > As I stated above, border types tend to be redundant. So a top border
      > > might be described as 'double' and 'double-thickness'. The first
      > > description is for older RTF readers, which don't recognize
      > > 'double-thickness'; the second for newer ones. In transforming the XML,
      > > a user needs one or the other, but really not both.
      > >
      > > I am also thinking that most transformations won't want all of the
      > > border information. For example, a transformation will want to know if
      > > the border exists, and if it does, what its line width is.
      > >
      > > Any thoughts?
      > >
      > > Paul
      > >
      > >
      > > --
      > >
      > > ************************
      > > *Paul Tremblay *
      > > *phthenry@...*
      > > ************************
      > >
      > > -------------------------------------------------------------------
      > > Post a message: mailto:xml-doc@yahoogroups.com
      > > Unsubscribe: mailto:xml-doc-unsubscribe@yahoogroups.com
      > > Switch to digest: mailto:xml-doc-digest@yahoogroups.com
      > > Put mail on hold: mailto:xml-doc-nomail@yahoogroups.com
      > > Contact administrator: mailto:xml-doc-owner@yahoogroups.com
      > > Make changes via Web: http://groups.yahoo.com/subscribe/xml-doc/
      > > Read archived messages: http://groups.yahoo.com/messages/xml-doc/
      > > -------------------------------------------------------------------
      > >
      > > Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
      > >
      > >
      >
      >
      > --
      > Oleg Paraschenko olpa@ http://bitplant.de/ - IT Services company
      > SGML/XML/Content management/WWW/Databases/Win32/Plug-ins/Scripts
      >
      >

      --

      ************************
      *Paul Tremblay *
      *phthenry@...*
      ************************