Okay - getting down to business

  • merton_monk
    Message 1 of 24 , Aug 23, 2002
      I expect to have the OGL/D20 stuff finished off this weekend. We may
      have to go through a couple of iterations through the D20 acceptance
      process, but I expect that to go fairly smoothly. From the comments
      I've had about a List Editor (as opposed to having everyone use a
      text editor) and ideas about making PCGen's data more available to
      other apps, moving to xml quickly would seem to kill an awful lot of
      birds at once. :)

      Keith is in charge of this project, so whether we use
      Schema/DTD/combination/whatever I'll let him decide. I haven't had
      much time to pay much attention to where things are right now with
      xml, so could Keith or someone fill me in? At the same time that
      would let everyone else know where things stand so we're all on the
      same page.

      Going forward I'd like to discuss how we implement xml.
      Convert all the files at once, or one at a time? Obviously there's a
      lot of work going on in the lst files, so we would need an easy way
      to convert them to xml. Ideally we convert all the files (.lst
      and .pcc) at once, though doing it in stages would be acceptable as
      well.

      Any thoughts on whether or not we should change the pcg files to xml
      format or not? That actually might be kind of nice since xml is
      fairly self-documenting. It's much lower in priority to the data
      conversion.

      I've done a ton of html stuff, though I've only done a couple of xml
      tutorials on the web. I haven't looked at the xml parsers, I'm
      assuming that the parser we'd use would depend on whether we went dom
      or sax, right?

      My thanks to everyone who has volunteered to help on this! I think
      this will take PCGen to a new level of user-friendliness, not to
      mention being much better at making sure that the data being supplied
      is properly done. Getting this done well would be a huge feather in
      our cap! I'll let you xml-gurus decide on the actual format of
      everything, I'll just pipe in with stuff from a Dictator's
      standpoint. :) I'll also make sure any java coding requirements
      receive a high priority to get this done in a timely manner.

      If anyone here is not an xml-guru and doesn't code in java, well...
      you can help document and test! Those are also very important tasks!

      -Bryan
    • Paul M. Lambert
      Message 2 of 24 , Aug 23, 2002
        [snip]

        >
        > If anyone here is not an xml-guru and doesn't code in java, well...
        > you can help document and test! Those are also very important tasks!
        >

        I am not an xml-guru and I don't code in Java (and I am not soup and I
        do not come from a store), but I have this little server running FreeBSD
        that is available for use by the converters.

        So once the project gets moving, if someone wants to write a Java or
        Perl lst->xml converter that needs to be tested, I can ensure it is run
        against current sources on a regular basis...

        --plambert
      • Scott Ellsworth
        Message 3 of 24 , Aug 23, 2002
          On Friday, August 23, 2002, at 01:12 PM, merton_monk wrote:

          > Going forward I'd like to discuss how we implement xml.
          > Convert all the files at once, or one at a time?

          Keith had dreams of changing the internal data format at the same time
          that we move to XML, as that move will require us to look at our schema
          anyway. Once we have done that work, the innards of pcgen can benefit
          a lot from it, so I fully support that dream.

          There is a good reason for this. Right now, we index and parse a LOT
          of strings, often with n^2 time. This means that some tabs take a
          visible amount of time to build, when they really should not. (If you
          see a code block that grabs an uppercased string, then compares it to
          every string in a collection with caseless equal, that is an area where
          this happens.)

          By making the hard and fast rule that things are referred to by keys,
          and that those keys are never, ever parsed more than once, we can get a
          LOT of performance. In essence, we should never need a string
          tokenizer once the program starts up, and the presence of one should
          tell us that we are parsing data that should already be in cooked
          format. One benefit of the XML conversion is that we will be creating
          those unique identifiers.
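A minimal sketch of the keyed-lookup idea, in C++ to match the other snippets in this thread. The Skill type, the key strings, and the SkillIndex class are all made up for illustration; nothing here is taken from the PCGen source. The point is only that the case-folding happens once, at load time, and every later lookup is a hash probe rather than a caseless scan of the whole collection.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical record type; the real objects would carry far more data.
struct Skill {
    std::string key;   // unique identifier, assigned once in the data file
    std::string name;  // display name
};

// Fold a key to upper case. Intended to run once per key, at load time.
std::string normalizeKey(std::string raw) {
    std::transform(raw.begin(), raw.end(), raw.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return raw;
}

class SkillIndex {
public:
    explicit SkillIndex(const std::vector<Skill>& skills) {
        for (const Skill& s : skills) {
            byKey_.emplace(normalizeKey(s.key), s);
        }
    }

    // Average O(1) lookup; returns nullptr when the key is unknown.
    // The caller passes a key that has already been normalized.
    const Skill* find(const std::string& normalizedKey) const {
        const auto it = byKey_.find(normalizedKey);
        return it == byKey_.end() ? nullptr : &it->second;
    }

private:
    std::unordered_map<std::string, Skill> byKey_;
};
```

Building the index is linear in the number of skills, so looking up n items costs roughly n hash probes instead of the n^2 string comparisons described above.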

          Note - we may choose to add those keys for the XML task into the
          current lst files as just another tag during the conversion. That way,
          xml files will be able to refer precisely to something defined in a
          current lst file, and nothing will break when converted.

          Based on work I am doing for a client, I see no reason why loading the
          ten megabytes or so of full data we now possess into internal maps
          should take more than a few seconds. A display of a tenth level
          character, hitting perhaps a thousand or so data items should be well
          under a second as well, even on a Pentium 400. (This is what our stuff
          does, at least, and we are using xerces, jdom, and oodles of carefully
          designed hashmaps/treemaps on databases with tens of thousands of
          entries per table.)

          So, the consequence I see is this - we will end up converting classes
          of things, and we will want to have an auto-conversion script for the
          list files, as otherwise, we get to write both a converter _and_ a
          parser for the same data. Less fun than it could be.
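A rough sketch of what such an auto-conversion step might look like. It ASSUMES a line layout of "Name<TAB>TAG:value<TAB>TAG:value", which is a simplification, and the output element and attribute names (field, tag, name) are invented here, since the real schema has not been decided in this thread. Treat it as an illustration of the mechanics, not a proposal for the format.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Escape the five characters that are special in XML text and attributes.
std::string xmlEscape(const std::string& in) {
    std::string out;
    for (char c : in) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&apos;"; break;
            default:   out += c;
        }
    }
    return out;
}

// Split on a single delimiter character, dropping empty fields.
std::vector<std::string> split(const std::string& line, char delim) {
    std::vector<std::string> parts;
    std::istringstream stream(line);
    std::string part;
    while (std::getline(stream, part, delim)) {
        if (!part.empty()) parts.push_back(part);
    }
    return parts;
}

// Convert one line of the ASSUMED layout into a small XML fragment.
// Fields that are not of the form TAG:value are skipped.
std::string lstLineToXml(const std::string& line, const std::string& element) {
    const std::vector<std::string> fields = split(line, '\t');
    if (fields.empty()) return "";

    std::string xml = "<" + element + " name=\"" + xmlEscape(fields[0]) + "\">";
    for (std::size_t i = 1; i < fields.size(); ++i) {
        const std::size_t colon = fields[i].find(':');
        if (colon == std::string::npos) continue;
        xml += "<field tag=\"" + xmlEscape(fields[i].substr(0, colon)) + "\">"
             + xmlEscape(fields[i].substr(colon + 1)) + "</field>";
    }
    xml += "</" + element + ">";
    return xml;
}
```

A real converter would map each tag to a properly typed element rather than a generic field, but a generic pass like this could be a starting point for running against the current sources on a schedule.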

          If there is a file type that we have not looked at, we might keep the
          data in the old format until we get to it, with the commensurate slow
          access. For example, let us assume that spells will be a pain.
          Further, they are pretty separate from the other files, as they do not
          grant bonuses. They would thus not have to be a bottleneck for bonus
          conversion. Conversely, we could convert the spell reading/writing
          code first, without having to do a lot of work on the bonuses, to see
          if we get the performance we want.

          In summary, I expect a large number of files to go as a block, with
          other noninterdependent things connecting up as we get a chance. By
          deciding on our idref schema early and getting it into the lst files,
          we insulate ourselves from breakage.

          > Any thoughts on whether or not we should change the pcg files to xml
          > format or not? That actually might be kind of nice since xml is
          > fairly self-documenting. It's much lower in priority to the data
          > conversion.

          I suspect we will have to visit the format anyway, just because we will
          want to have the pcg files using the keys defined in the xml file,
          rather than strings that need quadratic-time lookup. That said, I would
          recommend doing the characters shortly after the main files, just so
          that we can remove the older code as soon as possible.

          > I've done a ton of html stuff, though I've only done a couple of xml
          > tutorials on the web. I haven't looked at the xml parsers, I'm
          > assuming that the parser we'd use would depend on whether we went dom
          > or sax, right?

          Not as much as you might think. With JAXP, most parsers have ended up
          supporting both worlds. I tend to write my program's interface to
          support JDOM, which uses an underlying SAX parser for speed, but which
          exposes a DOM-like interface tuned to Java collections.

          For most of my clients, that parser has been Xerces, but a few have
          moved to Saxon. By writing to either the JAXP SAX interface, or to the
          generic JDOM interface, we are insulated from the underlying parser
          issue.

          Scott
        • Amedeo Paglione
          Message 4 of 24 , Aug 23, 2002
            On Friday 23 August 2002 22:12, merton_monk wrote:

            >
            > If anyone here is not an xml-guru and doesn't code in java, well...
            > you can help document and test! Those are also very important tasks!
            >

            Hi,
            I would not define myself as an 'xml-guru', but I have good experience in
            XML/XSLT and Java programming. I would like to be of help for this great
            project.

            Is there any plan to have an i18n version of PCGen once the migration to
            XML is completed? I'm Italian, and it would be great to have this
            application working with meters, kilograms, and Italian (or other-language)
            descriptions.


            Thanks,
            Amedeo
          • merton_monk
            Message 5 of 24 , Aug 24, 2002
              --- In pcgen-xml@y..., Amedeo Paglione <apaglione@i...> wrote:
              > On Friday 23 August 2002 22:12, merton_monk wrote:
              >
              > >
              > > If anyone here is not an xml-guru and doesn't code in java, well...
              > > you can help document and test! Those are also very important tasks!
              > >
              >
              > Hi,
              > I would not define myself as an 'xml-guru' but I have a good experience in
              > XML/XSLT and Java programming. I would like to be of help for this great
              > project.
              >
              > Is there any plan to have an i18n version of the PCGen once the migration to
              > the xml based will be completed? I'm Italian and would be great to have this
              > application working with meter, Kg and Italian (or other languages)
              > descriptions.

              There's actually a group that has been working on converting our
              existing data files into Italian. If we can plan on
              internationalization (is that what i18n means?) from the get-go with
              xml, that would make their efforts much easier. Thanks for bringing
              that up!

              -Bryan


              >
              >
              > Thanks,
              > Amedeo
            • Keith Davies
              Message 6 of 24 , Aug 26, 2002
                On Fri, Aug 23, 2002 at 08:12:35PM +0000, merton_monk wrote:
                >
                > Keith is in charge of this project, so whether we use
                > Schema/DTD/combination/whatever I'll let him decide. I haven't had
                > much time to pay much attention to where things are right now with
                > xml, so could Keith or someone fill me in? At the same time that
                > would let everyone else know where things stand so we're all on the
                > same page.

                re: DTD vs. Schema.

                I use DTD for discussion purposes. There are reasons to use Schema when
                it goes to production, though, and I'm leaning that way.

                Status:

                I've got a framework figured out for laying out the data in a consistent
                and flexible manner. It will also fix the dependency problems we've
                had, to the point where it should be impossible to *not* find everything
                needed[1] when it is needed. From here I'd like to start filling in
                some of the simpler items (such as skills) and work our way up to the
                complex (classes and characters).

                [1] assuming someone doesn't manually mess with the files containing the
                required information and that all required files are identified by
                the dependent files. And that someone doesn't try to put a circular
                dependency in.
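One way the two caveats in that footnote could be enforced at load time is a depth-first ordering of the declared file dependencies that fails loudly on a missing file or a cycle. This is only a sketch under assumptions: the DependencyMap shape, the LoadPlanner class, and the idea that each file lists its requirements by name are all hypothetical, not the framework described above.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// file name -> names of the files it requires (hypothetical layout)
using DependencyMap = std::map<std::string, std::vector<std::string>>;

class LoadPlanner {
public:
    explicit LoadPlanner(const DependencyMap& deps) : deps_(deps) {}

    // Returns the files in an order where every file follows its requirements.
    // Throws std::runtime_error on an undeclared file or a circular dependency.
    std::vector<std::string> order() {
        order_.clear();
        done_.clear();
        active_.clear();
        for (const auto& entry : deps_) visit(entry.first);
        return order_;
    }

private:
    void visit(const std::string& file) {
        if (done_.count(file)) return;
        if (active_.count(file))
            throw std::runtime_error("circular dependency involving " + file);
        const auto it = deps_.find(file);
        if (it == deps_.end())
            throw std::runtime_error("required file not declared: " + file);

        active_.insert(file);  // on the current path, not finished yet
        for (const std::string& required : it->second) visit(required);
        active_.erase(file);
        done_.insert(file);
        order_.push_back(file);  // all requirements are already in order_
    }

    DependencyMap deps_;             // private copy, so temporaries are safe
    std::set<std::string> done_;
    std::set<std::string> active_;
    std::vector<std::string> order_;
};
```

Loading files in the returned order means everything a file needs has been read before the file itself is processed.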

                > Going forward I'd like to discuss how we implement xml.
                > Convert all the files at once, or one at a time? Obviously there's a
                > lot of work going on in the lst files, so we would need an easy way
                > to convert them to xml. Ideally we convert all the files (.lst
                > and .pcc) at once, though doing it in stages would be acceptable as
                > well.

                IIRC Chuck and... Eric, I think it was, were working on a Perl script
                that will take apart LST files and rebuild them in another format; that
                will probably go a long way toward getting the conversion actually done.
                I'd want a cutoff, though -- on X day we go from LST to XML. If we have
                to support both formats we'd be looking at a lot of pain, I think,
                though providing a tool to import the old LST files might be an
                acceptable workaround for those with custom files that don't get
                converted when we flip.

                Delays things, I know, but means that we won't be trying to maintain
                data in two disparate systems concurrently.

                > Any thoughts on whether or not we should change the pcg files to xml
                > format or not? That actually might be kind of nice since xml is
                > fairly self-documenting. It's much lower in priority to the data
                > conversion.

                Of course -- as far as I'm concerned, *everything* will go to XML.

                > I've done a ton of html stuff, though I've only done a couple of xml
                > tutorials on the web. I haven't looked at the xml parsers, I'm
                > assuming that the parser we'd use would depend on whether we went dom
                > or sax, right?

                Pretty much. For our purposes here I lean toward SAX -- use
                well-designed internal structures (not DOM) and you can probably expect
                better performance than if you do everything in DOM, and a *lot* smaller
                memory footprint[2]. It also lets us specially process certain
                elements as we find them, which I understand can't be done with DOM.
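To make the event-driven style concrete, here is a small sketch of a handler that builds its own objects as elements arrive, without keeping a document tree. Big caveat: this is NOT a parser and not the API of any real SAX library. ElementHandler is a hand-rolled stand-in for a SAX-style callback interface, and the events would be fed to it by whatever parser is chosen. The weapon element is hypothetical; the roll attributes num and size follow the roll format mentioned elsewhere in this thread.

```cpp
#include <map>
#include <string>
#include <vector>

using Attributes = std::map<std::string, std::string>;

// Minimal callback interface in the spirit of SAX; not a real library API.
class ElementHandler {
public:
    virtual ~ElementHandler() = default;
    virtual void startElement(const std::string& name, const Attributes& attrs) = 0;
    virtual void endElement(const std::string& name) = 0;
};

struct Weapon {
    std::string name;
    int diceCount = 0;
    int diceSize = 0;
};

// Builds Weapon objects straight from events; no document tree is kept.
class WeaponLoader : public ElementHandler {
public:
    void startElement(const std::string& name, const Attributes& attrs) override {
        if (name == "weapon") {
            current_ = Weapon{};
            current_.name = valueOr(attrs, "name", "");
            inWeapon_ = true;
        } else if (name == "roll" && inWeapon_) {
            // Text is converted to numbers here, once, as the element is seen.
            current_.diceCount = std::stoi(valueOr(attrs, "num", "0"));
            current_.diceSize = std::stoi(valueOr(attrs, "size", "0"));
        }
    }

    void endElement(const std::string& name) override {
        if (name == "weapon" && inWeapon_) {
            weapons_.push_back(current_);
            inWeapon_ = false;
        }
    }

    const std::vector<Weapon>& weapons() const { return weapons_; }

private:
    static std::string valueOr(const Attributes& attrs, const std::string& key,
                               const std::string& fallback) {
        const auto it = attrs.find(key);
        return it == attrs.end() ? fallback : it->second;
    }

    Weapon current_;
    bool inWeapon_ = false;
    std::vector<Weapon> weapons_;
};
```

The memory held at any moment is the finished objects plus one weapon in progress, which is where the footprint argument for this style comes from.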

                I don't know enough about JDOM to offer a meaningful comment; I've been
                told that JDOM can do some of the things that SAX allows us to do.

                [2] One XML book I read mentioned that internal memory requirements
                have, in some instances, been about 15 times greater than the XML
                file size. Given the number of references (IDREF) we'll have
                floating around the system, being able to snap them and just hold
                the pointers instead will save us a *whole lotta* space alone, let
                alone what happens when we replace the lists of attributes with just
                the strings required to hold them.


                Keith
                --
                Keith Davies
                keith.davies@...

                PCGen: <reaper/>, smartass
                "While information might or might not want to be free, it definitely
                doesn't want to live under a DRM" -- Jonas, on PCGen
              • Keith Davies
                Message 7 of 24 , Aug 26, 2002
                  On Fri, Aug 23, 2002 at 03:09:18PM -0700, Scott Ellsworth wrote:
                  >
                  > On Friday, August 23, 2002, at 01:12 PM, merton_monk wrote:
                  >
                  > > Going forward I'd like to discuss how we implement xml.
                  > > Convert all the files at once, or one at a time?
                  >
                  > Keith had dreams of changing the internal data format at the same time
                  > that we move to XML, as that move will require us to look at our schema
                  > anyway. Once we have done that work, the innards of pcgen can benefit
                  > a lot from it, so I fully support that dream.
                  >
                  > There is a good reason for this. Right now, we index and parse a LOT
                  > of strings, often with n^2 time. This means that some tabs take a
                  > visible amount of time to build, when they really should not. (If you
                  > see a code block that grabs an uppercased string, then compares it to
                  > every string in a collection with caseless equal, that is an area where
                  > this happens.)
                  >
                  > By making the hard and fast rule that things are referred to by keys,
                  > and that those keys are never, ever parsed more than once, we can get a
                  > LOT of performance. In essence, we should never need a string
                  > tokenizer once the program starts up, and the presence of one should
                  > tell us that we are parsing data that should already be in cooked
                  > format. One benefit of the XML conversion is that we will be creating
                  > those unique identifiers.

                  Off the top of my head, we should never need to parse things apart once
                  they're loaded. Damage rolls (well, rolls in general) will have a
                  fully-parsed format like

                  <roll num="1" size="6" adj="2"/>

                  which would (internally) look something like

                  struct Roll {
                      int num;   // number of dice
                      int size;  // sides per die
                      int adj;   // flat adjustment added to the total
                  };

                  There'd be no string->number conversions, and an operation like bumping
                  the roll (like a large longsword) is pretty straightforward, no having
                  to parse and change the string.
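A slightly fuller sketch of the same idea, showing where the one unavoidable text-to-number step would sit. The helper names (rollFromAttributes, adjustedBy, toString) are made up for this example. Attribute values arrive from an XML parser as text, so they are converted once at load time; after that, everything is integer arithmetic.

```cpp
#include <string>

// Mirrors <roll num="1" size="6" adj="2"/>: "num" dice with "size" sides
// each, plus a flat adjustment.
struct Roll {
    int num;
    int size;
    int adj;

    int minimum() const { return num + adj; }
    int maximum() const { return num * size + adj; }

    // Returns a copy with the flat adjustment changed; no text is re-parsed.
    Roll adjustedBy(int delta) const { return Roll{num, size, adj + delta}; }
};

// The only string->number step, done once while loading.
Roll rollFromAttributes(const std::string& num, const std::string& size,
                        const std::string& adj) {
    return Roll{std::stoi(num), std::stoi(size), std::stoi(adj)};
}

// Needed only for output, for example on a character sheet.
std::string toString(const Roll& r) {
    std::string text = std::to_string(r.num) + "d" + std::to_string(r.size);
    if (r.adj > 0) text += "+" + std::to_string(r.adj);
    if (r.adj < 0) text += std::to_string(r.adj);
    return text;
}
```

How a size bump should change num and size is a rules question this sketch leaves alone; adjustedBy only shows that a change is a plain field update.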

                  IDs *might* even be optional once the data's loaded -- internally it'd
                  use pointers to the identified object rather than the ID. So,

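                  #include <string>
                  #include <vector>

                  class Race;  // defined elsewhere

                  class Class_Level {
                      // effects of taking this level, Effects not defined yet
                  };

                  class Class {
                  public:
                      std::string name;
                      std::vector<Class_Level> levels;  // one entry per class level
                  };

                  class Character_Level {
                  public:
                      Class *cls;  // 'class' is a reserved word, so it cannot be a member name
                      int hit_points;
                  };

                  class Character {
                  public:
                      Race *race;
                      std::vector<Character_Level> levels;
                  };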

                  I'm a C++ type normally, so I tend to think 'pointer'; substitute
                  references where appropriate, and ignore crappy style above 'cause I
                  don't normally write like that.

                  Anyway, once all the files are loaded the ID references wouldn't even be
                  needed any more, so it's not inconceivable that we just... throw the
                  map<>s away when we're done loading. I don't think I'd do that, myself,
                  in case the user decides to load something else, but it *could* be done.
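A sketch of the resolve-once step this describes: raw records name a class by ID, and a pass after loading swaps each ID for a direct pointer. The type names and ID strings are invented for illustration. One difference from the description above, stated plainly: in this sketch the registry OWNS the class objects, so it cannot be thrown away while the pointers are in use. Discarding the map would only be safe if the objects were owned somewhere else.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

struct CharacterClass {
    std::string id;
    std::string name;
};

// As read from a file: the class is still only named by its ID.
struct RawLevel {
    std::string classId;
    int hitPoints;
};

// After resolution: a direct pointer, so no ID lookup is needed later.
struct CharacterLevel {
    const CharacterClass* characterClass;
    int hitPoints;
};

class ClassRegistry {
public:
    void add(const CharacterClass& c) { byId_[c.id] = c; }

    // Throws if the ID was never defined, which surfaces dangling
    // references at load time instead of during play.
    const CharacterClass* resolve(const std::string& id) const {
        const auto it = byId_.find(id);
        if (it == byId_.end())
            throw std::runtime_error("unknown class id: " + id);
        return &it->second;
    }

private:
    // std::map is node-based, so element addresses stay valid as it grows.
    std::map<std::string, CharacterClass> byId_;
};

std::vector<CharacterLevel> resolveLevels(const std::vector<RawLevel>& raw,
                                          const ClassRegistry& registry) {
    std::vector<CharacterLevel> levels;
    levels.reserve(raw.size());
    for (const RawLevel& r : raw) {
        levels.push_back(CharacterLevel{registry.resolve(r.classId), r.hitPoints});
    }
    return levels;
}
```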

                  > Note - we may choose to add those keys for the XML task into the
                  > current lst files as just another tag during the conversion. That way,
                  > xml files will be able to refer precisely to something defined in a
                  > current lst file, and nothing will break when converted.
                  >
                  > Based on work I am doing for a client, I see no reason why loading the
                  > ten megabytes or so of full data we now possess into internal maps
                  > should take more than a few seconds. A display of a tenth level
                  > character, hitting perhaps a thousand or so data items should be well
                  > under a second as well, even on a Pentium 400. (This is what our stuff
                  > does, at least, and we are using xerces, jdom, and oodles of carefully
                  > designed hashmaps/treemaps on databases with tens of thousands of
                  > entries per table.)

                  Yep, I expect internal data structures will be seriously impacted. Ever
                  seen the movie 'Armageddon'? <g> It should all be for the good, though,
                  whichever way it's implemented internally.

                  > So, the consequence I see is this - we will end up converting classes
                  > of things, and we will want to have an auto-conversion script for the
                  > list files, as otherwise, we get to write both a converter _and_ a
                  > parser for the same data. Less fun than it could be.

                  The actual conversion exercise does not strike me as interesting,
                  exciting, or challenging enough that I'd want to do it more than once.
                  When we get there, let's just flip it over and be done with it.

                  > If there is a file type that we have not looked at, we might keep the
                  > data in the old format until we get to it, with the commensurate slow
                  > access. For example, let us assume that spells will be a pain.

                  Actually, spells should be pretty simple. *Weapons* are more complex
                  than spells -- spells have more fields, but they're *simpler*. Weapons
                  can actually get fairly complex. However, it'll do for your example <g>

                  > Further, they are pretty separate from the other files, as they do not
                  > grant bonuses. They would thus not have to be a bottleneck for bonus
                  > conversion. Conversely, we could convert the spell reading/writing
                  > code first, without having to do a lot of work on the bonuses, to see
                  > if we get the performance we want.
                  >
                  > In summary, I expect a large number of files to go as a block, with
                  > other noninterdependent things connecting up as we get a chance. By
                  > deciding on our idref schema early and getting it into the lst files,
                  > we insulate ourselves from breakage.

                  Sounds reasonable.

                  > > Any thoughts on whether or not we should change the pcg files to xml
                  > > format or not? That actually might be kind of nice since xml is
                  > > fairly self-documenting. It's much lower in priority to the data
                  > > conversion.
                  >
                  > I suspect we will have to visit the format anyway, just because we will
                  > want to have the pcg files using the keys defined in the xml file,
                  > rather than strings that need quadratic-time lookup. That said, I would
                  > recommend doing the characters shortly after the main files, just so
                  > that we can remove the older code as soon as possible.

                  Yep.

                  > > I've done a ton of html stuff, though I've only done a couple of xml
                  > > tutorials on the web. I haven't looked at the xml parsers, I'm
                  > > assuming that the parser we'd use would depend on whether we went dom
                  > > or sax, right?
                  >
                  > Not as much as you might think. With JAXP, most parsers have ended up
                  > supporting both worlds. I tend to write my program's interface to
                  > support JDOM, which uses an underlying SAX parser for speed, but which
                  > exposes a DOM-like interface tuned to Java collections.
                  >
                  > For most of my clients, that parser has been Xerces, but a few have
                  > moved to Saxon. By writing to either the JAXP SAX interface, or to the
                  > generic JDOM interface, we are insulated from the underlying parser
                  > issue.

                  FWIW, I plan to build a lot of my code in C++ simply because I'm more
                  accustomed to it and can get more work done (let's see... design XML and
                  build little tools to prove it in a language I'm not familiar with, or
                  design XML and use one of my l33t languages...). I use a cross-platform
                  library (fltk, 'cause I'm lazy) so I could even get to the point of
                  releasing my little tools under Linux and Win32 (not Mac because, well,
                  I don't have one. The library is supposed to work with it, though).


                  Keith
                  --
                  Keith Davies
                  keith.davies@...

                  PCGen: <reaper/>, smartass
                  "While information might or might not want to be free, it definitely
                  doesn't want to live under a DRM" -- Jonas, on PCGen
                • Keith Davies
                  Message 8 of 24 , Aug 26, 2002
                    On Sat, Aug 24, 2002 at 03:59:29AM +0200, Amedeo Paglione wrote:
                    > On Friday 23 August 2002 22:12, merton_monk wrote:
                    >
                    > >
                    > > If anyone here is not an xml-guru and doesn't code in java, well...
                    > > you can help document and test! Those are also very important tasks!
                    > >
                    >
                    > Hi,
                    > I would not define myself as an 'xml-guru' but I have a good experience in
                    > XML/XSLT and Java programming. I would like to be of help for this great
                    > project.
                    >
                    > Is there any plan to have an i18n version of the PCGen once the migration to
                    > the xml based will be completed? I'm Italian and would be great to have this
                    > application working with meter, Kg and Italian (or other languages)
                    > descriptions.

                    I think the numeric qualities of the items are, well, numbers. I plan
                    to support multiple names for things, and XML supports multibyte
                    characters (accents and stuff), so as far as that goes then i18n will be
                    supported. I don't know that the data files would go so far as to
                    describe multiple ratings for things like weight or length -- it strikes
                    me as a hassle waiting to happen. I think measurements and the like
                    would be stored in a single measurement system (pounds, feet, and inches
                    most likely, because that's what the majority of the books and rules
                    use), though it's not impossible that they could be automatically
                    converted for you.

                    Supposing a game comes out where the measures do use SI, though, perhaps
                    a flag indicating the unit of measurement would be appropriate. I'll
                    think on that for a while.
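A small sketch of what a unit flag plus automatic conversion might look like. The Length type, the enum, and the function names are hypothetical. The 0.3048 factor is the exact definition of the foot in meters. The roundToStep helper is an ASSUMPTION about presentation: translated rule sets may print rounded figures (for example on a 1.5 m grid) rather than exact conversions, and that choice would have to come from the data or the locale, not from this code.

```cpp
#include <cmath>

enum class LengthUnit { Feet, Meters };

// A value stored together with the flag saying which unit it is in.
struct Length {
    double value;
    LengthUnit unit;
};

// Exact physical factor: 1 ft = 0.3048 m.
constexpr double kMetersPerFoot = 0.3048;

// Converts between the two units; returns the input unchanged if it is
// already in the target unit.
Length convert(const Length& in, LengthUnit target) {
    if (in.unit == target) return in;
    if (target == LengthUnit::Meters) {
        return Length{in.value * kMetersPerFoot, target};
    }
    return Length{in.value / kMetersPerFoot, target};
}

// Rounds to the nearest multiple of `step`. Whether a game wants this, and
// with what step, is an assumption left to the caller.
double roundToStep(double value, double step) {
    return std::round(value / step) * step;
}
```

With this, a 30 ft value converts to 9.144 m exactly, and rounding to a 1.5 m step gives 9 m, so the stored data can stay in one system while the display varies.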


                    Keith
                    --
                    Keith Davies
                    keith.davies@...

                    PCGen: <reaper/>, smartass
                    "While information might or might not want to be free, it definitely
                    doesn't want to live under a DRM" -- Jonas, on PCGen
                  • ortheri
                    Message 9 of 24 , Aug 29, 2002
                      I'm the one leading/doing the i18n of the code. From what I
                      understand (and I'll look into it to make sure), as long as the proper
                      language is selected with the correct lst files (with the numbers in
                      metric, for instance), the program will recognize this and calculate
                      appropriately. So a number written like 1,000 or 1.000 would be read
                      correctly by the i18n parsers for its language.

                      I plan on getting back to this as soon as the compliance issues are
                      finished.


                      Mario

                      --- In pcgen-xml@y..., Keith Davies <keith.davies@k...> wrote:
                      > On Sat, Aug 24, 2002 at 03:59:29AM +0200, Amedeo Paglione wrote:
                      > > On Friday 23 August 2002 22:12, merton_monk wrote:
                      > >
                      > > >
                      > > > If anyone here is not an xml-guru and doesn't code in java, well...
                      > > > you can help document and test! Those are also very important tasks!
                      > > >
                      > >
                      > > Hi,
                      > > I would not define myself as an 'xml-guru' but I have a good experience in
                      > > XML/XSLT and Java programming. I would like to be of help for this great
                      > > project.
                      > >
                      > > Is there any plan to have an i18n version of the PCGen once the migration to
                      > > the xml based will be completed? I'm Italian and would be great to have this
                      > > application working with meter, Kg and Italian (or other languages)
                      > > descriptions.
                      >
                      > I think the numeric qualities of the items are, well, numbers. I plan
                      > to support multiple names for things, and XML supports multibyte
                      > characters (accents and stuff), so as far as that goes then i18n will be
                      > supported. I don't know that the data files would go so far as to
                      > describe multiple ratings for things like weight or length -- it strikes
                      > me as a hassle waiting to happen. I think measurements and the like
                      > would be stored in a single measurement system (pounds, feet, and inches
                      > most likely, because that's what the majority of the books and rules
                      > use), but it's not impossible that measurements and the like could be
                      > automatically converted for you, though.
                      >
                      > Supposing a game comes out where the measures do use SI, though, perhaps
                      > a flag indicating the unit of measurement would be appropriate. I'll
                      > think on that for a while.
                      >
                      >
                      > Keith
                      > --
                      > Keith Davies
                      > keith.davies@k...
                      >
                      > PCGen: <reaper/>, smartass
                      > "While information might or might not want to be free, it definitely
                      > doesn't want to live under a DRM" -- Jonas, on PCGen
                    • Amedeo Paglione
                      Message 10 of 24 , Aug 29, 2002
                        Thanks for your reply.
                        Actually, I was not so much worried about the "format" of numbers etc.,
                        but rather about the different measurement system and figures for some
                        rules. Just to give you an example, how would the "base movement" be
                        handled, which is 30 ft in the original version and 9 m in the Italian one?

                        Amedeo


                        On Thursday 29 August 2002 23:12, orther wrote:
                        > I'm the one leading/doing the i18n of the code. From what I
                        > understand (and I'll look into it to make sure), as long as the proper
                        > language is selected with the correct lst files, having the numbers in
                        > metric, for instance, the program will recognize this and calculate
                        > appropriately. So a number written like 1,000 or 1.000 would be read
                        > correctly by the i18n parsers for its language.
                        >
                        > I plan on getting back to this as soon as the compliance issues are
                        > finished.
                        >
                        >
                        > Mario
                        >
                      • David
                        Message 11 of 24 , Oct 3, 2002
                          So...for instance...

                          input engine --> linked lists ???

                          Items within a linked list are easily indexed and referenced.
                          Depending on the efficiency of the indexing, you have a virtual
                          database running internally. I've done such a thing with control
                          arrays and other lists in VB as a home-grown web server. Very, very
                          fast (yes, even in VB). ;]

                          dB


                          --- In pcgen-xml@y..., Scott Ellsworth <scott@a...> wrote:
                          >
                          > By making the hard and fast rule that things are referred to by keys,
                          > and that those keys are never, ever parsed more than once, we can get a
                          > LOT of performance. In essence, we should never need a string
                          > tokenizer once the program starts up, and the presence of one should
                          > tell us that we are parsing data that should already be in cooked
                          > format. One benefit of the XML conversion is that we will be creating
                          > those unique identifiers.
                          >
                        • David
                          Message 12 of 24 , Oct 3, 2002
                            Has there been anything done on building or creating a utility to
                            convert the .pcc/.lst files? I've looked at files extensively and
                            concluded after a while that the time it takes to build a converter
                            may well take longer than simply converting the files manually. I
                            know in this bizarre world of technocrats like us, this is a rather
                            absurd idea. As tedious as it may be, it may be one of the most
                            feasible options.

                            dB



                            --- In pcgen-xml@y..., Scott Ellsworth <scott@a...> wrote:
                            > So, the consequence I see is this - we will end up converting classes
                            > of things, and we will want to have an auto-conversion script for the
                            > list files, as otherwise, we get to write both a converter _and_ a
                            > parser for the same data. Less fun than it could be.
                            >
                          • David
                            Message 13 of 24 , Oct 3, 2002
                              Star Wars...all metric.

                              dB



                              --- In pcgen-xml@y..., Keith Davies <keith.davies@k...> wrote:
                              > Supposing a game comes out where the measures do use SI, though, perhaps
                              > a flag indicating the unit of measurement would be appropriate. I'll
                              > think on that for a while.
                            • Damian
                              Message 14 of 24 , Oct 3, 2002
                                From: "David" <cyboarg@...>
                                > Star Wars...all metric.

                                The new d20 Mecha game in the latest Dragon also uses metric. Although they
                                simply use a rough 2m=5ft convention.
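
A rough sketch of how a unit flag plus a per-game conversion convention might fit together. Everything named here (`MeasureSketch`, `Unit`, `Distance`, `convert`, the "metres per 5 feet" parameter) is hypothetical and not part of PCGen; the two conventions are only the ones mentioned in this thread (30 ft = 9 m, i.e. 5 ft = 1.5 m, and the rough 2 m = 5 ft of d20 Mecha).

```java
import java.util.Locale;

/** Hypothetical sketch: distances stored in one unit, converted by a per-game convention. */
class MeasureSketch {
    enum Unit { FEET, METRES }

    /** A stored distance plus the flag saying which unit the data file used. */
    static final class Distance {
        final double value;
        final Unit unit;

        Distance(double value, Unit unit) {
            this.value = value;
            this.unit = unit;
        }
    }

    /**
     * Convert using a game convention expressed as "metres per 5 feet":
     * 1.5 matches 30 ft = 9 m, 2.0 matches the rough 2 m = 5 ft rule.
     */
    static Distance convert(Distance d, Unit target, double metresPerFiveFeet) {
        if (d.unit == target) {
            return d; // already in the requested unit
        }
        double metresPerFoot = metresPerFiveFeet / 5.0;
        double v = (target == Unit.METRES)
                ? d.value * metresPerFoot
                : d.value / metresPerFoot;
        return new Distance(v, target);
    }

    public static void main(String[] args) {
        Distance baseMove = new Distance(30, Unit.FEET);
        Distance a = convert(baseMove, Unit.METRES, 1.5);
        Distance b = convert(baseMove, Unit.METRES, 2.0);
        System.out.println(String.format(Locale.ROOT, "%.1f m vs %.1f m", a.value, b.value));
    }
}
```

The point of passing the convention in, rather than using the exact 0.3048 m per foot, is that the published translations round to game-friendly figures, so the "right" answer depends on the rule set and not on physics.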

                                -Damian
                              • Keith Davies
                                Message 15 of 24 , Oct 3, 2002
                                  On Thu, Oct 03, 2002 at 02:38:43PM +0000, David wrote:
                                  > So...for instance...
                                  >
                                  > input engine --> linked lists ???
                                  >
                                  > Items within a linked list are easily indexed and referenced.
                                  > Depending on the efficiency of the indexing, you have a virtual
                                  > database running internally. I've done such a thing with control
                                  > arrays and other lists in VB as a home-grown web server. Very, very
                                  > fast (yes, even in VB). ;]

                                  probably maps (associative arrays) rather than linked lists. Way much
                                  faster for lookups, and table scans don't take significantly longer
                                  because you're looking at everything anyway.

                                  There's not a lot of sense to storing things ordered, either. Whenever
                                  we display a list it is subject to almost arbitrary ordering by the
                                  user, so maintaining the list in any particular order doesn't have much
                                  point. OTOH, if there are common orderings that are used, or if such
                                  are specified by the user ("always show me spells in name alpha order")
                                  we might cache and maintain a list of references to spells, ordered by
                                  name of the spells... but this is an optimization I think shouldn't be
                                  dealt with right now.


                                  Keith
                                  --
                                  Keith Davies
                                  keith.davies@...

                                  PCGen: <reaper/>, smartass
                                  "You just can't argue with a moron. It's like handling Nuclear
                                  waste. It's not good, it's not evil, but for Christ's sake, don't
                                  get any on you!!" -- Chuck, PCGen mailing list
                                • Keith Davies
                                  Message 16 of 24 , Oct 3, 2002
                                    On Thu, Oct 03, 2002 at 02:42:18PM +0000, David wrote:
                                    > Has there been anything done on building or creating a utility to
                                    > convert the .pcc/.lst files? I've looked at files extensively and
                                    > concluded after a while that the time it takes to build a converter
                                    > may well take longer than simply converting the files manually. I
                                    > know in this bizarre world of technocrats like us, this is a rather
                                    > absurd idea. As tedious as it may be, it may be one of the most
                                    > feasible options.

                                    It's something I have considered, especially given the scope of the
                                    changes we'll be making. However, even a simple script that *tries* --
                                    does a first pass making decent guesses -- can go a long way to helping
                                    get the data converted.

                                    The actual conversion methods used are something we'll deal with later.
                                    What we need first is something to convert *to*.


                                    Keith
                                    --
                                    Keith Davies
                                    keith.davies@...

                                    PCGen: <reaper/>, smartass
                                    "You just can't argue with a moron. It's like handling Nuclear
                                    waste. It's not good, it's not evil, but for Christ's sake, don't
                                    get any on you!!" -- Chuck, PCGen mailing list
                                  • Scott Ellsworth
                                    Message 17 of 24 , Oct 3, 2002
                                      On Thursday, October 3, 2002, at 08:27 AM, Keith Davies wrote:

                                      > On Thu, Oct 03, 2002 at 02:38:43PM +0000, David wrote:
                                      >> So...for instance...
                                      >>
                                      >> input engine --> linked lists ???
                                      >>
                                      >
                                      > probably maps (associative arrays) rather than linked lists. Way much
                                      > faster for lookups, and table scans don't take significantly longer
                                      > because you're looking at everything anyway.

                                      This matches my experience. Without fail, converting to maps saves
                                      oodles of time. We could do that now, save that our keys are in
                                      arbitrary case, and there is resistance to changing that.

                                      By having our new data model have a distinct, case sensitive key, but
                                      one that a user never sees, we have fast code, and a good user
                                      experience. In other words, there is only one entry for the stock
                                      fighter class - wotc.dnd.class.fighter or some such. Fighters from
                                      Killer Games would be killergames.class.fighter, and thus we could tell
                                      immediately which is which. If the user wants to see them in the lists
                                      as "Fighter" and "Fighter", we lose nothing, as _we_ and the program
                                      know them as two different things. Further, our UI code is not slowed
                                      down having to see if perhaps the user meant "WoTc.DnD.Class.Fighter".
                                      (Lowercasing on the way in is an option, but I prefer to catch primary
                                      key errors immediately, and deliver an error message.)
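
A minimal sketch of the strict-key idea, assuming a plain `HashMap` as the data store. The class `KeyedStore`, its `Entry` type, and the method names are made up for illustration; only the two key strings come from the message above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a data store keyed by exact, case-sensitive IDs. */
class KeyedStore {
    /** Minimal stand-in for a game object: a hidden key plus the label the user sees. */
    static final class Entry {
        final String key;
        final String label;

        Entry(String key, String label) {
            this.key = key;
            this.label = label;
        }
    }

    private final Map<String, Entry> byKey = new HashMap<>();

    /** Register an entry; a duplicate key is a data error, reported immediately. */
    void add(String key, String label) {
        if (byKey.containsKey(key)) {
            throw new IllegalArgumentException("duplicate key: " + key);
        }
        byKey.put(key, new Entry(key, label));
    }

    /** Exact-match lookup: no tokenizing, no case folding, fail fast on a bad key. */
    Entry get(String key) {
        Entry e = byKey.get(key);
        if (e == null) {
            throw new IllegalArgumentException("unknown key: " + key);
        }
        return e;
    }

    public static void main(String[] args) {
        KeyedStore store = new KeyedStore();
        store.add("wotc.dnd.class.fighter", "Fighter");
        store.add("killergames.class.fighter", "Fighter");
        // Same label in the UI, two distinct entries in the data model.
        System.out.println(store.get("wotc.dnd.class.fighter").label);
        System.out.println(store.get("killergames.class.fighter").label);
    }
}
```

With this shape, a lookup for "WoTc.DnD.Class.Fighter" throws instead of silently matching, which is the "catch primary key errors immediately" behaviour described above.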

                                      > There's not a lot of sense to storing things ordered, either.

                                      When I need to store something in order, I get the keyset from the map,
                                      order that by whatever sort keys matter (usually a single scan, and
                                      very, very fast as I am just swapping a key in an array list), then use
                                      that in my table model.

                                      The code is not that rough - the key is that we NEVER, EVER, EVER need
                                      to resort unless someone adds or subtracts an item from the data model,
                                      or they change the sort key. Even then, that is pretty fast.

                                      I suspect we could do something like this fairly easily for pcgen -
                                      lists become lists of keys into the data store, and sort orders are
                                      just lists of integers in the keyset, or lists of keys in order.
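
Roughly what that could look like: the map holds the data, and a cached list of keys holds the display order, re-sorted only after the data changes. `SortedView` and its methods are invented names, and the "spell.*" keys in the usage note are just examples in the style discussed in this thread.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: unordered map for the data, cached key list for the order. */
class SortedView {
    private final Map<String, String> labels = new HashMap<>(); // key -> display label
    private List<String> orderedKeys; // null means "stale, re-sort on next request"

    void put(String key, String label) {
        labels.put(key, label);
        orderedKeys = null; // data model changed, so drop the cached order
    }

    void remove(String key) {
        labels.remove(key);
        orderedKeys = null;
    }

    /** Keys ordered by label (ties broken by key); sorts only when the cache is stale. */
    List<String> keysByLabel() {
        if (orderedKeys == null) {
            List<String> keys = new ArrayList<>(labels.keySet());
            keys.sort((a, b) -> {
                int c = labels.get(a).compareTo(labels.get(b));
                return c != 0 ? c : a.compareTo(b);
            });
            orderedKeys = keys;
        }
        return orderedKeys;
    }

    String label(String key) {
        return labels.get(key);
    }
}
```

A table model would call `keysByLabel()` on every repaint and pay for a sort only after a `put` or `remove`. For example, after adding "spell.magicmissile" and "spell.fireball" with their labels, the list comes back with "spell.fireball" first, and repeated calls return the same cached list.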

                                      Scott
                                    • STILES, BRAD
                                      Message 18 of 24 , Oct 3, 2002
                                        >
                                        > By having our new data model have a distinct, case sensitive key, but
                                        > one that a user never sees, we have fast code, and a good user
                                        > experience. In other words, there is only one entry for the stock
                                        > fighter class - wotc.dnd.class.fighter or some such.

                                        Are you talking about a permanently defined key, that is part of the data on
                                        disk, or one that is derived from the data at runtime? I gather the former,
                                        but just want to make sure.

                                        Brad
                                      • Scott Ellsworth
                                        Message 19 of 24 , Oct 3, 2002
                                          On Thursday, October 3, 2002, at 11:12 AM, STILES, BRAD wrote:

                                          >> By having our new data model have a distinct, case sensitive key, but
                                          >> one that a user never sees, we have fast code, and a good user
                                          >> experience. In other words, there is only one entry for the stock
                                          >> fighter class - wotc.dnd.class.fighter or some such.
                                          >
                                          > Are you talking about a permanently defined key, that is part of the
                                          > data on disk, or one that is derived from the data at runtime? I
                                          > gather the former, but just want to make sure.

                                          The former. Said key does not have to be english text, but I suspect
                                          it should be. It lives in the same space as a database table's primary
                                          key - a way to ensure that when something refers to a table entry, it
                                          gets exactly the entry it expected.

                                          Again, it does not really show up in user space - it is a data space
                                          thing that file editors or other storage systems see.

                                          Scott
                                        • STILES, BRAD
                                          Message 20 of 24 , Oct 3, 2002
                                            > >
                                            > > Are you talking about a permanently defined key, that is
                                            > > part of the data on disk, or one that is derived from the
                                            > > data at runtime?
                                            >
                                            > In the file. Derived at runtime is basically how it works now.

                                            Don't you run the risk of the key changing then, if the data upon which it's
                                            based changes? That means that some other piece of data that refers to it,
                                            such as a character, won't be able to find it.

                                            Brad
                                          • Keith Davies
                                            Message 21 of 24 , Oct 3, 2002
                                              On Thu, Oct 03, 2002 at 02:12:43PM -0400, STILES, BRAD wrote:
                                              > >
                                              > > By having our new data model have a distinct, case sensitive key, but
                                              > > one that a user never sees, we have fast code, and a good user
                                              > > experience. In other words, there is only one entry for the stock
                                              > > fighter class - wotc.dnd.class.fighter or some such.
                                              >
                                              > Are you talking about a permanently defined key, that is part of the data on
                                              > disk, or one that is derived from the data at runtime? I gather the former,
                                              > but just want to make sure.

                                              In the file. Derived at runtime is basically how it works now.


                                              Keith
                                              --
                                              Keith Davies
                                              keith.davies@...

                                              PCGen: <reaper/>, smartass
                                              "You just can't argue with a moron. It's like handling Nuclear
                                              waste. It's not good, it's not evil, but for Christ's sake, don't
                                              get any on you!!" -- Chuck, PCGen mailing list
                                            • Keith Davies
                                              Message 22 of 24 , Oct 3, 2002
                                                On Thu, Oct 03, 2002 at 11:30:03AM -0700, Scott Ellsworth wrote:
                                                >
                                                > On Thursday, October 3, 2002, at 11:12 AM, STILES, BRAD wrote:
                                                >
                                                > >> By having our new data model have a distinct, case sensitive key, but
                                                > >> one that a user never sees, we have fast code, and a good user
                                                > >> experience. In other words, there is only one entry for the stock
                                                > >> fighter class - wotc.dnd.class.fighter or some such.
                                                > >
                                                > > Are you talking about a permanently defined key, that is part of the
                                                > > data on disk, or one that is derived from the data at runtime? I
                                                > > gather the former, but just want to make sure.
                                                >
                                                > The former. Said key does not have to be english text, but I suspect
                                                > it should be. It lives in the same space as a database table's primary
                                                > key - a way to ensure that when something refers to a table entry, it
                                                > gets exactly the entry it expected.

                                                The main IDs -- those of game-level objects -- would be more or less
                                                human text. Such as 'class.fighter' and 'spell.fireball'. There would
                                                be others, though, that wouldn't (such as a character's possession list
                                                -- ID's may be used to ensure that multiple equipment sets refer to the
                                                same things). OTOH, that could be done by just checking the totals
                                                against the possessions. I own four potions of healing; as long as no
                                                equipment set contains more than four, I'm fine. That might end up
                                                simpler, and greatly reduce the indexes.
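
The totals check could be as small as the sketch below. It assumes possessions and equipment sets are both stored as item-key-to-count maps; `EquipmentCheck`, `fits`, and the key "item.potion.healing" are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: validate an equipment set against owned totals, no per-item IDs. */
class EquipmentCheck {
    /** True if the set never asks for more of any item than the character owns. */
    static boolean fits(Map<String, Integer> owned, Map<String, Integer> equipmentSet) {
        for (Map.Entry<String, Integer> wanted : equipmentSet.entrySet()) {
            int have = owned.getOrDefault(wanted.getKey(), 0);
            if (wanted.getValue() > have) {
                return false; // the set uses more than the possession list holds
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Integer> owned = new HashMap<>();
        owned.put("item.potion.healing", 4);

        Map<String, Integer> travelSet = new HashMap<>();
        travelSet.put("item.potion.healing", 3);

        Map<String, Integer> greedySet = new HashMap<>();
        greedySet.put("item.potion.healing", 5);

        System.out.println(fits(owned, travelSet));
        System.out.println(fits(owned, greedySet));
    }
}
```

Each equipment set is checked on its own against the possession list, so two sets may both include all four potions without needing a separate ID per potion.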

                                                > Again, it does not really show up in user space - it is a data space
                                                > thing that file editors or other storage systems see.

                                                They'd be available for view during data entry -- when I add a weapon,
                                                the software might guess at the ID but a human should be the one to
                                                decide (or at least confirm it). After that, though, normal users should
                                                never need to see it.


                                                Keith
                                                --
                                                Keith Davies
                                                keith.davies@...

                                                PCGen: <reaper/>, smartass
                                                "You just can't argue with a moron. It's like handling Nuclear
                                                waste. It's not good, it's not evil, but for Christ's sake, don't
                                                get any on you!!" -- Chuck, PCGen mailing list
                                              • Keith Davies
                                                Message 23 of 24 , Oct 3, 2002
                                                  On Thu, Oct 03, 2002 at 02:32:18PM -0400, STILES, BRAD wrote:
                                                  > > >
                                                  > > > Are you talking about a permanently defined key, that is
                                                  > > > part of the data on disk, or one that is derived from the
                                                  > > > data at runtime?
                                                  > >
                                                  > > In the file. Derived at runtime is basically how it works now.
                                                  >
                                                  > Don't you run the risk of the key changing then, if the data upon which it's
                                                  > based changes? That means that some other piece of data that refers to it,
                                                  > such as a character, won't be able to find it.

                                                  The IDs, once assigned, should never change. 'weapon.dagger' always
                                                  refers to the same thing. Given that we're talking about having a
                                                  formal check-in for data files, they should only be changing (by us) anyway
                                                  -- if a campaign needs a minor change to something, it should load the
                                                  original and MOD it.

                                                  IDs should never change.


                                                  Keith
                                                  --
                                                  Keith Davies
                                                  keith.davies@...

                                                  PCGen: <reaper/>, smartass
                                                  "You just can't argue with a moron. It's like handling Nuclear
                                                  waste. It's not good, it's not evil, but for Christ's sake, don't
                                                  get any on you!!" -- Chuck, PCGen mailing list
                                                • Keith Davies
                                                  Message 24 of 24 , Oct 8, 2002
                                                    On Thu, Oct 03, 2002 at 10:53:04AM -0700, Scott Ellsworth wrote:
                                                    >
                                                    > On Thursday, October 3, 2002, at 08:27 AM, Keith Davies wrote:
                                                    >
                                                    > > On Thu, Oct 03, 2002 at 02:38:43PM +0000, David wrote:
                                                    > >> So...for instance...
                                                    > >>
                                                    > >> input engine --> linked lists ???
                                                    > >>
                                                    > >
                                                    > > probably maps (associative arrays) rather than linked lists. Way much
                                                    > > faster for lookups, and table scans don't take significantly longer
                                                    > > because you're looking at everything anyway.
                                                    >
                                                    > This matches my experience. Without fail, converting to maps saves
                                                    > oodles of time. We could do that now, save that our keys are in
                                                    > arbitrary case, and there is resistance to changing that.

                                                    simple workaround - smash case on load. Everything gets keyed by upper
                                                    or lower case value, but displays the label in original case. Thus,
                                                    when loading weapons.lst you add to the Weapons map, and whenever you
                                                    look for something you first smash its case. 'course, you'd probably
                                                    implement the Map interface, then use the specified Map subtype
                                                    internally. That is,

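                                                    import java.util.AbstractMap;
                                                    import java.util.HashMap;
                                                    import java.util.Locale;
                                                    import java.util.Map;
                                                    import java.util.Set;

                                                    // Map that smashes (lower-cases) every key on the way in and on lookup.
                                                    // remove() would need the same treatment; left out to keep this short.
                                                    class SmashMap<V> extends AbstractMap<String, V> {
                                                        private final Map<String, V> mymap = new HashMap<>();
                                                        SmashMap() { }
                                                        SmashMap(Map<String, V> map) {
                                                            putAll(map); // copies the entries, smashing each key via put()
                                                        }
                                                        private static String smash(String key) {
                                                            return key.toLowerCase(Locale.ROOT);
                                                        }
                                                        @Override public V put(String key, V value) {
                                                            return mymap.put(smash(key), value);
                                                        }
                                                        @Override public V get(Object key) {
                                                            return (key instanceof String) ? mymap.get(smash((String) key)) : null;
                                                        }
                                                        @Override public boolean containsKey(Object key) {
                                                            return (key instanceof String) && mymap.containsKey(smash((String) key));
                                                        }
                                                        @Override public Set<Map.Entry<String, V>> entrySet() {
                                                            return mymap.entrySet();
                                                        }
                                                    }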

                                                    (note: Java syntax and standard class structure may be incorrect; I'm
                                                    making this up as I go.... It should be clear what I meant, though)

                                                    This allows the files to remain in their current state, but internally
                                                    the program can use better values. A possible improvement would be to
                                                    modify the read code so that when a reference value is found (such as
                                                    PREFERREDCLASS in RACE -- again making it up as I go) you smash case
                                                    *there*, instead of when you try to resolve it.

                                                    Between the two changes, you could switch to a Map-based internal
                                                    structure and save yourself a bunch of slow lookups.
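
To make the second change concrete, here is a minimal sketch of smashing a reference at read time rather than at resolve time. All the names (Keys, Race, ClassStore, preferredClassKey) are made up for illustration and are not PCGen's actual classes; lower case is an arbitrary choice.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class Keys {
    // Smash case in exactly one place so load and lookup always agree.
    static String smash(String raw) {
        return raw.trim().toLowerCase(Locale.ROOT);
    }
}

// Hypothetical stand-in for a race entry; the field names are assumptions.
class Race {
    final String label;              // shown to the user, original case
    final String preferredClassKey;  // smashed once, when the line is read

    Race(String label, String preferredClass) {
        this.label = label;
        this.preferredClassKey = Keys.smash(preferredClass);
    }
}

class ClassStore {
    // smashed key -> label exactly as written in the file
    private final Map<String, String> labels = new HashMap<>();

    void load(String nameFromFile) {
        labels.put(Keys.smash(nameFromFile), nameFromFile);
    }

    // The key is already smashed, so resolving is a plain map hit.
    // Returns null when nothing matches.
    String resolve(String smashedKey) {
        return labels.get(smashedKey);
    }
}

class SmashDemo {
    public static void main(String[] args) {
        ClassStore classes = new ClassStore();
        classes.load("Fighter");
        Race dwarf = new Race("Dwarf", "FIGHTER");
        System.out.println(classes.resolve(dwarf.preferredClassKey)); // prints "Fighter"
    }
}
```

The files stay as they are; only the in-memory keys change, and the label the user sees keeps its original case.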

                                                    > By having our new data model have a distinct, case sensitive key, but
                                                    > one that a user never sees, we have fast code, and a good user
                                                    > experience. In other words, there is only one entry for the stock
                                                    > fighter class - wotc.dnd.class.fighter or some such. Fighters from
                                                    > Killer Games would be killergames.class.fighter, and thus we could tell
                                                    > immediately which is which. If the user wants to see them in the lists
                                                    > as "Fighter" and "Fighter", we lose nothing, as _we_ and the program
                                                    > know them as two different things. Further, our UI code is not slowed
                                                    > down having to see if perhaps the user meant "WoTc.DnD.Class.Fighter".
                                                    > (Lowercasing on the way in is an option, but I prefer to catch primary
                                                    > key errors immediately, and deliver an error message.)

                                                    Ah, you did mention it. It's not my preferred answer either -- trapping
                                                    errors and giving warnings as soon as possible makes for better
                                                    happiness.

                                                    So, I agree entirely with that last paragraph.
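
For comparison, a rough sketch of the distinct-key approach from the quoted paragraph: exact, case-sensitive primary keys, with bad keys reported immediately. KeyedStore and its methods are hypothetical names, and the key strings are only the examples from the quote, not a settled naming scheme.

```java
import java.util.HashMap;
import java.util.Map;

class KeyedStore {
    // Primary key (never shown to the user) -> display name.
    private final Map<String, String> displayNames = new HashMap<>();

    // Duplicate primary keys are rejected at load time.
    void register(String key, String displayName) {
        if (displayNames.containsKey(key)) {
            throw new IllegalArgumentException("duplicate key: " + key);
        }
        displayNames.put(key, displayName);
    }

    // Exact, case-sensitive match; an unknown key is reported right away.
    String displayName(String key) {
        String name = displayNames.get(key);
        if (name == null) {
            throw new IllegalArgumentException("unknown key: " + key);
        }
        return name;
    }
}
```

Registering "wotc.dnd.class.fighter" and "killergames.class.fighter" gives two separate entries that both display as "Fighter", while looking up "WoTc.DnD.Class.Fighter" throws instead of quietly matching.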

                                                    > > There's not a lot of sense to storing things ordered, either.
                                                    >
                                                    > When I need to store something in order, I get the keyset from the map,
                                                    > order that by whatever sort keys matter (usually a single scan, and
                                                    > very, very fast as I am just swapping a key in an array list), then use
                                                    > that in my table model.
                                                    >
                                                    > The code is not that rough - the key is that we NEVER, EVER, EVER need
                                                    > to resort unless someone adds or subtracts an item from the data model,
                                                    > or they change the sort key. Even then, that is pretty fast.

                                                    Once the load is complete, this shouldn't happen too often unless the
                                                    user is specifically modifying that information (for instance, working
                                                    on a file of weapon descriptions). Lazy evaluation should work well
                                                    enough -- keep keying new items in as fast as possible, then sort it
                                                    when the list gets looked at again. Thus,

                                                    user clicks 'add weapons' button
                                                    system displays 'add weapon form'
                                                    until (close form) {
                                                        while (user keys weapon description and doesn't press button) {
                                                            system enable/disable commit button based on validation results
                                                        }
                                                        if (user pressed commit) {
                                                            system stores weapon description
                                                            system clears form
                                                        } elsif (user pressed cancel) {
                                                            system prompts for confirmation[1]
                                                            if (confirmed) {
                                                                discard entered values
                                                                close form = true
                                                            }
                                                        }
                                                    }
                                                    system closes/frees 'add weapon form'

                                                    Or something similar. It probably isn't necessary to sort the data each
                                                    time, just when the form closes or the list is otherwise displayed in
                                                    full.
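
The lazy part might look something like the sketch below: adds only touch the map and append a key, and the key list is sorted the next time somebody asks for it. LazySortedStore, sortedKeys and sortCount are invented names, and natural String ordering stands in for whatever sort key actually matters.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LazySortedStore {
    private final Map<String, String> items = new HashMap<>();
    private final List<String> order = new ArrayList<>();
    private boolean dirty = false;
    int sortCount = 0; // only here to make the laziness visible

    // Adding is a map put plus an append; no sorting happens here.
    // Assumes descriptions are never null, so null means "new key".
    void add(String key, String description) {
        if (items.put(key, description) == null) {
            order.add(key);
            dirty = true;
        }
    }

    // Sort only when the list is looked at, and only if it changed.
    List<String> sortedKeys() {
        if (dirty) {
            Collections.sort(order);
            sortCount++;
            dirty = false;
        }
        return Collections.unmodifiableList(order);
    }
}
```

Keying in a dozen weapons and then opening the list costs one sort, not a dozen, and viewing it again without changes costs none.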

                                                    And damn, but it's awkward to try to write event-driven behavior
                                                    algorithmically....

                                                    > I suspect we could do something like this fairly easily for pcgen -
                                                    > lists become lists of keys into the data store, and sort orders are
                                                    > just lists of integers in the keyset, or lists of keys in order.

                                                    Sounds right.


                                                    Keith
                                                    --
                                                    Keith Davies
                                                    keith.davies@...

                                                    PCGen: <reaper/>, smartass
                                                    "You just can't argue with a moron. It's like handling Nuclear
                                                    waste. It's not good, it's not evil, but for Christ's sake, don't
                                                    get any on you!!" -- Chuck, PCGen mailing list