[OT] Searchable RSRDs (was Re: [pcgen-xml] Re: Progress report...)

  • Frugal
    Message 1 of 10, Aug 11, 2004
      <quote who="andargor">
      > It's straight HTML with a custom javascript search engine. There's a
      > Windows Help File version. I'm working on hyperlinking with context.
      > It's got some "dumb" hyperlinking right now, but it's far from
      > perfect (e.g. the difference between the Air subtype and the Air
      > domain, the word Air is linked where?)

      I ended up writing a Java app to read in the content and use the names
      of all of the nodes as the items to link to. I did not do any linguistic
      analysis, so I have phrases like "First aid" where the 'aid' is a link
      to the 'Aid' spell ;O)
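
The name-based linking described above can be sketched roughly as follows. This is a minimal illustration in Python, not the Java app from the post: the function name `link_node_names` and the `[[...]]` link markup are assumptions made for the example. It matches node names as whole words, ignoring case, with no linguistic context.

```python
import re

def link_node_names(text, node_names):
    """Wrap every whole-word occurrence of a node name in [[...]] markup.

    Hypothetical helper: the [[...]] markup stands in for whatever link
    syntax the target format uses.
    """
    if not node_names:
        return text
    # Try longer names first so a multi-word name wins over a shorter
    # name that it contains.
    names = sorted(node_names, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(n) for n in names) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: "[[" + m.group(1) + "]]", text)

linked = link_node_names("Apply First aid, then cast Aid.", ["Aid", "Air"])
print(linked)  # Apply First [[aid]], then cast [[Aid]].
```

The "First [[aid]]" in the output is the same kind of false positive mentioned in the post: matching on names alone cannot tell the spell from the ordinary word.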

      > I am a very lazy person, hence I hate to do manual editing of
      > documents. So, I have a bunch of tools to convert the RSRD from
      > RTF->XML->HTML. If wizards correct it or issue other files, I push it
      > through the grinder and out pops the HTML version.

      Those I would like to see. I thought about doing something similar, but
      the inconsistency in the RTF made it too problematic.

      > As usual, projects are a learning opportunity, so I learned a heck of
      > a lot about search engines, and linguistics (for the context-based
      > hyperlinking)

      Are you going to make the tools you used to do this available? I would be
      interested in seeing how you performed the hyperlinking.

      > I'd like to see your version, if it's available anywhere? Might give
      > me ideas.

      http://www.purplewombat.co.uk/~frugal/srd_treepad.zip

      You will need the treepad viewer as well:

      http://www.nlsoftware.com/download/a/?fl=tpviewer.zip

      I ended up doing a lot of it by hand as the RSRD is very badly laid out.
      They have just dumped the 6 books to RTF, so the same information is in
      several places, e.g. the details of the Concentration skill are in the
      Skill, Epic Skills, Divine Skills and Psionic Skills sections. So I
      combined all of that together to make one cohesive document. If you want
      to know about the Concentration skill you only need to look in one place...
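
As a rough sketch of the consolidation described above (done largely by hand in the post), entries for the same topic can be grouped by name across source sections. The `(section, name, text)` tuple layout and the function name `merge_topics` are assumptions for illustration only.

```python
def merge_topics(entries):
    """Group (section, name, text) entries by topic name.

    Names are compared case-insensitively; each topic keeps its pieces
    in the order they were encountered.
    """
    merged = {}
    for section, name, text in entries:
        key = name.strip().lower()
        merged.setdefault(key, []).append((section, text))
    return merged

entries = [
    ("Skills", "Concentration", "Base rules..."),
    ("Epic Skills", "Concentration", "Epic uses..."),
    ("Skills", "Listen", "Base rules..."),
    ("Psionic Skills", "Concentration", "Psionic uses..."),
]
merged = merge_topics(entries)
print([section for section, _ in merged["concentration"]])
```

Each merged topic then becomes one page, so there is a single node name to link to.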

      --
      regards,
      Frugal
      -OS Chimp

    • andargor
      Message 2 of 10, Aug 11, 2004
        --- In pcgen-xml@yahoogroups.com, "Frugal" <frugal@p...> wrote:
        > Are you going to make the tools you used to do this available ? I
        > would be interested in seeing how you performed the hyperlinking.
        >

        That's what I'm working on now. The version on my site has about the
        same technique you used. A lot of regexp's. Not very bright.

        The text is broken down into topics, which may be embedded. I had to
        use minimal manual stuff here: a "boundary" file so that I can cut
        the RTF into pieces, and establish a hierarchy. Again, lots of
        regexp's...

        The rest of the tools are a mish mash of perl and C tools. I do
        intend to release them when I'm done (and when I've cleaned up the
        terrible hacks in there...)


        > > I'd like to see your version, if it's available anywhere? Might
        > > give me ideas.
        >
        > http://www.purplewombat.co.uk/~frugal/srd_treepad.zip
        >
        > You will need the treepad viewer as well:
        >
        > http://www.nlsoftware.com/download/a/?fl=tpviewer.zip


        Forbidden :) Nothing a little chmod a+r might not fix...



        > I ended doing a lot of it by hand as the RSRD is very badly laid
        > out. They have just dumped the 6 books to RTF so the same
        > information is in several places. i.e. the details of the
        > concentration skill are in the Skill, Epic Skills, Divine Skills
        > and Psionic Skills sections. So I combined all of that together to
        > make one cohesive document. If you want to know about the
        > concentration skill you only need to look in one place...


        Argh, I intended to do that eventually, but I just realized my
        categorization woes might be less if I combined similar topics (e.g.
        Listen in Skills and in Epic Skills). So stupid, gotta do that
        _before_ I hyperlink.

        Andargor
      • Frugal
        Message 3 of 10, Aug 11, 2004
          On Wednesday 11 August 2004 16:47, andargor wrote:
          > --- In pcgen-xml@yahoogroups.com, "Frugal" <frugal@p...> wrote:
          > > Are you going to make the tools you used to do this available ? I
          > > would be interested in seeing how you performed the hyperlinking.
          >
          > That's what I'm working on now. The version on my site has about the
          > same technique you used. A lot of regexp's. Not very bright.
          >
          > The text is broken down into topics, which may be embedded. I had to
          > use minimal manual stuff here: a "boundary" file so that I can cut
          > the RTF into pieces, and establish a hierarchy. Again, lots of
          > regexp's...

          Ah! Because I broke the topics down completely (1 class per page, 1 feat per
          page, etc.) I could give each node its own name and then search through all of
          the pages for that node name.

          I think I have something like 4000 nodes and 27000 links ;O) I am currently
          scanning in all of the other books that I own so that I can add them in as
          well. I intend to have all of the magazines, books, articles, PDFs, etc. in
          one place.

          > > http://www.purplewombat.co.uk/~frugal/srd_treepad.zip

          > Forbidden :) Nothing a little chmod a+r might not fix...

          Doh! Fixed.

          --
          regards,
          Frugal
          -OS Chimp
        • andargor
          Message 4 of 10, Aug 11, 2004
            --- In pcgen-xml@yahoogroups.com, Frugal <frugal@p...> wrote:
            > Ah! Because I broke the topics down completely (1 class per page, 1
            > feat per page etc) I could give each node it's own name and then
            > search through all of the pages for that node name

            If you want to take a look at what the XML looks like, once it's
            been "scrubbed" clean:

            http://www.andargor.com/files/srd35-xml.tar.gz

            That's my base for generating the keyword links and then XSLT it into
            HTML.

            Andargor
          • Keith Davies
            Message 5 of 10, Aug 16, 2004
              On Wed, Aug 11, 2004 at 07:03:06PM +0000, andargor wrote:
              > --- In pcgen-xml@yahoogroups.com, Frugal <frugal@p...> wrote:
              > > Ah! Because I broke the topics down completely (1 class per page, 1
              > > feat per page etc) I could give each node it's own name and then
              > > search through all of the pages for that node name
              >
              > If you want to take a look at what the XML looks like, once it's
              > been "scrubbed" clean:
              >
              > http://www.andargor.com/files/srd35-xml.tar.gz
              >
              > That's my base for generating the keyword links and then XSLT it into
              > HTML.

              Wow. Is it entirely presentational?

              I've been working with text files derived directly from the RTF. Tables
              get entirely munged and they weren't *quite* as consistent as I'd like with
              formatting and layout. It would make it a lot easier to hack with Perl
              if they had been.

              I'm trying to get it so it's actually context-aware rather than
              presentational. I've done the feats and got a start on spells (I don't
              have all the XML -> HTML stuff done). Haven't done anything with
              classes yet. Skills I have a start, but -- and this is the problem with
              almost everything -- I have to fix tables. Word exported a table

              Level  BAB  Fort  Ref  Will
              1      +1   +2    +0   +0
              2      +2   +3    +0   +0
              3      +3   +3    +1   +1

              Thus:

              Level
              BAB
              Fort
              Ref
              Will
              1
              +1
              +2
              +0
              +0
              2
              +2
              +3
              +0
              +0
              3
              +3
              +3
              +1
              +1

              This makes it a little challenging sometimes :/

              If anyone has any suggestions, they'd be more than welcome.
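
One possible approach to the flattened table above, sketched in Python rather than Perl: if the column count is known (or can be guessed from the header), the one-cell-per-line export can be regrouped into rows. The helper names `clean`, `guess_columns` and `rebuild_rows` are hypothetical, and the header heuristic assumes that header cells are words while data cells start with a digit or a sign, which will not hold for every SRD table.

```python
def clean(lines):
    """Strip whitespace and drop blank lines."""
    return [ln.strip() for ln in lines if ln.strip()]

def guess_columns(cells):
    """Guess the column count as the number of cells before the first
    cell that starts with a digit or a sign (heuristic, may be wrong)."""
    for i, cell in enumerate(cells):
        if cell[0].isdigit() or cell[0] in "+-":
            return i
    return len(cells)

def rebuild_rows(cells, columns):
    """Group a flat list of cells into rows of `columns` cells each."""
    if columns <= 0 or len(cells) % columns != 0:
        raise ValueError("cell count does not fit the column count")
    return [cells[i:i + columns] for i in range(0, len(cells), columns)]

flat = """Level
BAB
Fort
Ref
Will
1
+1
+2
+0
+0
2
+2
+3
+0
+0
3
+3
+3
+1
+1""".splitlines()

cells = clean(flat)
rows = rebuild_rows(cells, guess_columns(cells))
for row in rows:
    print("  ".join(row))
```

Tables with empty cells or text-only columns would need the column count supplied by hand, since the export gives no row boundaries.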


              Keith
              --
              Keith Davies          I gave my 2yo daughter a strawberry
              keith.davies@...      Naomi: "Strawberry!"
                                    me: "What do you say?"
                                    Naomi: "*MY* strawberry!"
            • andargor
              Message 6 of 10, Aug 16, 2004
                --- In pcgen-xml@yahoogroups.com, Keith Davies <keith.davies@k...>
                wrote:
                >
                > Wow. Is it entirely presentational?
                >
                (snip...)

                Mostly presentational. The topics are in right now to help automatic
                hyperlinking (which is hard as hell). This needs some
                context-awareness.

                >
                > I'm trying to get it so it's actually context-aware rather than
                > presentational. I've done the feats and got a start on spells (I
                > don't have all the XML -> HTML stuff done). Haven't done anything
                > with classes yet. Skills I have a start, but -- and this is the
                > problem with almost everything -- I have to fix tables. Word
                > exported a table
                >

                Here's the XSLT I use to convert those XML files to HTML:

                http://www.andargor.com/files/rtfx2html.gz

                I use xsltproc (from libxml2), but other processors should work.
                Tables get formatted correctly about 98% of the time (there are
                exceptions).

                Andargor
              • Keith Davies
                Message 7 of 10, Aug 16, 2004
                  On Mon, Aug 16, 2004 at 03:51:54PM +0000, andargor wrote:
                  > --- In pcgen-xml@yahoogroups.com, Keith Davies <keith.davies@k...>
                  > wrote:
                  > >
                  > > Wow. Is it entirely presentational?
                  > >
                  > (snip...)
                  >
                  > Mostly presentational. The topics are in right now to help automatic
                  > hyperlinking (which is hard as hell). This needs some
                  > context-awareness.
                  >
                  > >
                  > > I'm trying to get it so it's actually context-aware rather than
                  > > presentational. I've done the feats and got a start on spells (I
                  > > don't have all the XML -> HTML stuff done). Haven't done anything
                  > > with classes yet. Skills I have a start, but -- and this is the
                  > > problem with almost everything -- I have to fix tables. Word
                  > > exported a table
                  > >
                  >
                  > Here's the XSLT I use to convert those XML files to HTML:
                  >
                  > http://www.andargor.com/files/rtfx2html.gz
                  >
                  > I use xsltproc (from libxml2), but other processors should work.
                  > Tables get formatted correctly about 98% of the time (there are
                  > exceptions).

                  If nothing else that could save me a boatload of time. It wouldn't be
                  so difficult to write an XSLT that'll dump the tables in files named
                  something like item-$num.xml.
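
The idea of dumping each table into its own numbered file could look roughly like this. It is sketched in Python with the standard library instead of XSLT, and it assumes the tables are elements named `table`; the actual element names in the srd35 XML are not given in this thread, so treat `tag="table"` and the function name `dump_tables` as placeholders.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def dump_tables(xml_text, out_dir, tag="table"):
    """Write every <tag> element to out_dir/item-N.xml, numbered from 1.

    Nested tables, if any, are written as separate files as well.
    """
    root = ET.fromstring(xml_text)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for num, table in enumerate(root.iter(tag), start=1):
        table.tail = None  # do not carry trailing text into the new file
        path = out / f"item-{num}.xml"
        ET.ElementTree(table).write(
            str(path), encoding="utf-8", xml_declaration=True
        )
        written.append(path)
    return written

sample = (
    "<doc><topic><table><row>Level</row></table></topic>"
    "<table><row>BAB</row></table></doc>"
)
with tempfile.TemporaryDirectory() as tmp:
    names = [p.name for p in dump_tables(sample, tmp)]
print(names)  # ['item-1.xml', 'item-2.xml']
```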

                  You can see some of my results at http://www.kjdavies.org/rpg/reference

                  It's far from complete, of course, but I think that your XML will help.

                  The feats and whatnot are making their way toward a functional XML. The
                  prereqs, for instance, are <prereq>s, not text. The ones that could be
                  easily translated, at least.


                  Keith
                  --
                  Keith Davies          I gave my 2yo daughter a strawberry
                  keith.davies@...      Naomi: "Strawberry!"
                                        me: "What do you say?"
                                        Naomi: "*MY* strawberry!"