
Re: [pcgen_international] PCGen 6.0 I18N Requirements Gathering

  • Tom Parker
    Message 1 of 11 , Jun 27, 2007
      boomer70 <boomer70@...> wrote:
      > Hi all,

      Hey! Good to hear from you.

      > By Constants you mean they are "hardcoded" in the
      > code? Obviously these do need to be translated.

      Something like that (although they aren't technically hardcoded, but dynamically built lists). By constants, I mean that in the CDOM branch they are actually typesafe constants in some fashion (see pcgen.cdom.enumeration).

      > You can also add some information that is stored in
      > the various TYPE: tags. For example Fighter feats
      > should be translated.

      So the question here is whether the TYPE is ever actually output to the user. I'll have to go check, because I honestly don't recall a situation where that happens. Of course, that's probably not the only thing I missed.

      >> Note that PObjects include...
      >
      > Also includes CHECKS (saves).

      Ah, yes, missed that.

      > I am not sure we really want to assume that OUTPUTNAME
      > will be used for translation.
      > I think since we are looking to gather requirements we
      > shouldn't make any assumptions at this point.

      Good point. I guess the intent would be to indicate that the base name of a PObject must also be translated, even though there isn't an associated "token" to go along with the primary name.

      >> Equipment "CONTAINS"
      >
      > Not sure what you mean here.

      CONTAINS:100|Potion=10 (or whatever the syntax is)

      Does "Potion" need translation, or is it simply a rule? I'm unsure whether CONTAINS is ever output in human-readable form, so that's why I was listing it as "unsure".



      --
      Tom Parker
      thpr@... and tppublic@...

    • Alvise Nicoletti
      Message 2 of 11 , Jun 27, 2007
        > I do not believe we can dynamically translate everything in the sense that we are not
        > Babelfish. There will have to be some form of listing that X translates into Y. If a Skill
        > changes from "Dodge" to "Get out of the Way", we can't expect a translation system to
        > automatically make that update. (If we were doing that, we'd be building a translator, not
        > a character generator)

        Ok, and why should a skill change? We can't translate things if
        someone changes their names.

        WotC doesn't write its manuals with a "skill ID" as a primary key, but we can
        identify everything by the "common word" used for it.


        > I *do* believe we should have the ability to:
        >
        > 1) Define that Skill "Dodge" is actually "Detour" (or whatever it is in your local language).

        That is what I just wrote.

        >
        > 2) Have that translation work in future versions of PCGen (meaning the translations are detached in some fashion from the rules & associations defined in the data files, so as the rules change, the translations work properly)

        Yes, and "abstracting" the dataset concept with a dictionary database
        is the only way you will do this without building a dataset for
        every language yourself.

        >
        > 3) Have the datasets and PCs language-independent enough that a PC could be built in an Italian instance of PCGen and displayed in a French instance and be displayed fully in French

        That depends on which datasets PCGen loads, and someone already said
        that translating the SOURCES is a different matter from translating
        the IDE.

        If PCGen loads French datasets, the result will be in French for a
        viewer (provided, of course, that a "label" variable is defined for the objects).

        >
        > 4) Be able to detect and report any items which were not provided a translation, but still fail gracefully back to the default (American English?)
        >
        > 5) Be able to detect changes in the base language (effectively American English) between versions of PCGen to "flag" areas where translation may be required due to naming changes
        >
        > I believe those 5 items summarize how I would interpret "translate dynamically the dataset". Does that meet with the intent of your statement/concern?
        >
        > Thanks.
        >
        > TP.
        >
        > --
        > Tom Parker
        > thpr@... and tppublic@...
      • Tom Parker
        Message 3 of 11 , Jun 27, 2007
          Alvise Nicoletti <alvise.nicoletti@...> wrote:
          > Ok, and why should a skill change? We can't translate things if
          > someone changes their names.

          Well, technically, you can, it just makes it really hard. The change is something that needs to be detected and handled.

          However, it's just an example. Things *shouldn't* change, but that doesn't mean one doesn't consider what happens in case it does change. I spent many years trying to break software, and some of my education was from a professor that programmed heart/lung machines, so I tend to be very methodical about considering "what ifs", even if they are unlikely to occur.
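
To make the "detected and handled" part concrete, here is a minimal sketch of what such a check might look like. It is written in Python purely for brevity (PCGen itself is Java), and every name in it (`diff_base_names`, the sample skill lists, the French strings) is hypothetical and for illustration only, not taken from PCGen.

```python
def diff_base_names(old_names, new_names, translations):
    """Compare base-language names across two versions and flag translation work.

    old_names / new_names: names found in the previous and current data.
    translations: mapping from base-language name to a translated name.
    """
    old_set, new_set = set(old_names), set(new_names)
    return {
        # Names that disappeared from the base language (possibly renamed).
        "removed": sorted(old_set - new_set),
        # Names that are new in this version.
        "added": sorted(new_set - old_set),
        # Translations that point at a name that no longer exists.
        "orphaned": sorted(n for n in translations if n not in new_set),
        # Current names that have no translation yet.
        "untranslated": sorted(n for n in new_set if n not in translations),
    }


# Illustrative data only: "Dodge" is renamed between versions.
old = ["Dodge", "Tumble", "Spot"]
new = ["Get out of the Way", "Tumble", "Spot"]
french = {"Dodge": "Esquive", "Tumble": "Acrobaties"}

report = diff_base_names(old, new, french)
for category, names in report.items():
    print(category, names)
```

A report like this cannot tell that "Dodge" became "Get out of the Way"; it can only flag that one translation is orphaned and one name is untranslated, and leave the pairing to a human.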
          >> 2) Have that translation work in future versions of PCGen (meaning the translations are detached in some fashion from the rules & associations defined in the data files, so as the rules change, the translations work properly)
          >
          > Yes, and "abstracting" the dataset concept with a dictionary database
          > is the only way you will do this without building a dataset for
          > every language yourself.

          I agree it needs to be abstracted, but there are multiple ways of implementing such a dictionary. Thus I am trying to ensure we are capturing the requirements, so that the system that is built meets the requirements.
          >> 3) Have the datasets and PCs language-independent enough that a PC could be built in an Italian instance of PCGen and displayed in a French instance and be displayed fully in French
          >
          > That depends on which datasets PCGen loads, and someone already said
          > that translating the SOURCES is a different matter from translating
          > the IDE.
          >
          > If PCGen loads French datasets, the result will be in French for a
          > viewer (provided, of course, that a "label" variable is defined for the objects).

          Yes and no. If the datasets are properly built, then the datasets could use the same localization as the UI (or different localization if someone wanted preferences to do that). The point is that I shouldn't have to load French datasets to open a PC that was built with French datasets. I should be able to load the American English datasets and debug it in American English (or you could share an NPC with me and I could print it entirely in English). [Of course, internationalization items may not be debuggable like that, but non-i18n stuff would be]

          That's why I want to get the requirements properly stated - I do not believe that building datasets for each language is the answer, and I also don't want it to be simply creating multiple dependent datasets (which could be loaded separately) by applying a database to a master data file (because that prevents sharing characters across language boundaries).
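
One way to picture the character-sharing requirement is a saved PC that holds only internal keys, with display names resolved when the PC is shown. The sketch below is Python for brevity and rests on assumptions: the JSON layout, the `DISPLAY_NAMES` table, and both function names are hypothetical, and PCGen's real character file format is not JSON.

```python
import json

# Hypothetical per-language display names, keyed by the internal (English) key.
DISPLAY_NAMES = {
    "it": {"Dodge": "Schivare", "Longsword": "Spada lunga"},
    "fr": {"Dodge": "Esquive", "Longsword": "Épée longue"},
}


def save_character(name, keys):
    """Serialize a character using internal keys only, never display text."""
    return json.dumps({"name": name, "items": list(keys)})


def render_character(saved, language):
    """Render a saved character in the viewer's language.

    Any key without a translation (or any unknown language) falls back
    to the internal key itself.
    """
    data = json.loads(saved)
    names = DISPLAY_NAMES.get(language, {})
    return [names.get(key, key) for key in data["items"]]


# Built in an "Italian" session, then viewed in French and in English.
saved = save_character("Aldo", ["Dodge", "Longsword"])
french_view = render_character(saved, "fr")
english_view = render_character(saved, "en")
print(french_view)
print(english_view)
```

Because the saved data never contains Italian text, nothing about the file depends on the language it was created in.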

          TP.


          --
          Tom Parker
          thpr@... and tppublic@...

        • Ludovic Fierville
          Message 4 of 11 , Jun 27, 2007
            We have a not-so-bad solution right now, though it needs some polishing.

            IMHO, the best solution is to keep the "master" datasets in English,
            because the data team is mainly English-speaking, and English is
            fairly well understood by RPG people (I think).

            Keeping these datasets, we could create "translating" datasets in
            which we would .MOD the existing elements. I actually created such an
            (incomplete) set in French by using .MOD and adding an OUTPUTNAME in
            French. Works quite well, though I detected some small bugs in the
            UI, and the OS is still in English (for "hardcoded" parts).

            I don't think creating whole translated datasets is a good idea. Even
            by using an intermediate database as Alvise suggests, we would have to
            recreate the sets each time a change is made (in the "original"
            dataset or in the translating database).

            OUTPUTNAME is not perfect, but it's not so bad. If we want to improve
            it, we could create a new tag, for example TRANSLATE. This tag would
            be used in the UI and on the OS, "replacing" the English name with
            the translated one. The main point here is to let PCGen use the
            English name internally, and only use the translated one where the
            user can see the result.

            What we would need:
            - a translation file for the UI. Quite simple, there aren't many
            strings to translate (some are already done, though they are hardcoded AFAIK)
            - translation datasets; I have written one (you can find it in the
            Files section of this group). These datasets are loaded with the other
            datasets and .MOD them as needed. We could add an option to tell
            PCGen we want French or Italian or Esperanto or whatever and it would
            load the necessary files (they would have to be put in a special place)

            If we want to go all the way, we would need to partially translate
            the docs, but I don't think we are ready for this tough cookie.

            So, to summarize :
            - a way to translate the UI, for example by using variables in the UI
            code which would be replaced by values extracted from a file
            - a way to translate dataset info that the end user can see

            It seems natural to fall back to english if something fails to load
            or is incomplete (lack of translating data -> use original data,
            seems quite simple)
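
The fallback rule can be sketched roughly as follows. This is Python for brevity, not PCGen code, and it makes assumptions: a simple `key=value` text format for the translation file, and the hypothetical names `parse_translations` and `TranslationOverlay`. The sample strings are illustrative only.

```python
def parse_translations(text):
    """Parse 'key=value' lines; blank lines and '#' comments are skipped."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            entries[key.strip()] = value.strip()
    return entries


class TranslationOverlay:
    """Looks up display names and remembers which keys had no translation."""

    def __init__(self, translations):
        self._translations = dict(translations)
        self.missing = set()

    def display_name(self, key):
        translated = self._translations.get(key)
        if translated is None:
            # No translation: record it and fall back to the original name.
            self.missing.add(key)
            return key
        return translated


sample_file = """
# French names (illustrative only)
Dodge = Esquive
Tumble=Acrobaties
"""

overlay = TranslationOverlay(parse_translations(sample_file))
labels = [overlay.display_name(k) for k in ["Dodge", "Power Attack"]]
print(labels)
print(sorted(overlay.missing))
```

The internal key is never changed, so the program keeps working on English names while the `missing` set gives translators a list of what still needs attention.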

            Tom, I think you have summarized the tags which would need a
            translation. I'm not a coder, so I'm not sure if the list is
            complete, but we could give it a try and complete the list while the
            betas go.

            Ludo



            Ludovic Fierville
            lfierville<at>gmail<dot>com

            ~ Tengoku de omachisi te imasu ~
          • Tom Parker
            Message 5 of 11 , Jun 27, 2007
              Ludovic Fierville <lfierville@...> wrote:
              > IMHO, the best solution is to keep the "master" datasets in English,
              > because the data team is mainly English-speaking, and English is
              > fairly well understood by RPG people (I think).

              I agree with that general concept.

              > Keeping these datasets, we could create "translating" datasets in
              > which we would .MOD the existing elements. I actually created such an
              > (incomplete) set in French by using .MOD and adding an OUTPUTNAME in
              > French. Works quite well, though I detected some small bugs in the
              > UI, and the OS is still in English (for "hardcoded" parts).

              This works for some items, but will not work well for others. We need something with a bit more power than .MOD in order to handle things like Equipment Qualities, DESC tokens, and the Constants (e.g. Spell Schools). Given that, it probably makes sense to keep everything handled in a common fashion.

              > I don't think creating whole translated datasets is a good idea. Even
              > by using an intermediate database as Alvise suggests, we would have to
              > recreate the sets each time a change is made (in the "original"
              > dataset or in the translating database).

              I agree. The "creation" should occur when the code imports the LST files into PCGen.

              > The main point here is to let PCGen use the
              > English name internally, and only use the translated one where the
              > user can see the result.

              Something to that effect is my suspicion on how it will work (and what is currently in my mind), but I'm trying to validate requirements to ensure that will actually work :)





              --
              Tom Parker
              thpr@... and tppublic@...

            • Aubé Philippe
              Message 6 of 11 , Jun 27, 2007
                Hi all,

                I think we should bear in mind that :

                1. All that is to be displayed should be translated.
                2. Everything that is to be displayed should not be used internally.
                3. There should be as few files as possible and the link between initial
                file and translation "holder" should be clear enough for everyone to be
                able to modify the translation easily.
                4. Changing tags should not affect the translation.
                5. Sorting should take into account accents.

                Let me be clearer on points 1 and 2.

                Spells have an awful long list of factors, most need to be displayed AND
                are used internally. There should be no such thing. The duration of a
                spell should be in two ratings : the duration as used internally and the
                duration as displayed. Am I clear ?

                Maybe it would be a good idea to split the files into several files: one
                file holding internal information, another one holding displayed
                information along with the translations (maybe one file per language).
                Each new PCGen version should simply modify the "internal information"
                part, not the translation one. When the translation doesn't exist, then
                the english displayed information should be used.
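
The "two ratings" idea for a spell duration could look roughly like the Python sketch below: one structured value used for calculations, and per-language templates used only for display, with English as the fallback. All names here (`Duration`, `TEMPLATES`, `ROUNDS_PER_UNIT`) are hypothetical, and the conversion of 10 rounds per minute is an assumption taken from d20-style rules rather than from PCGen's data.

```python
from dataclasses import dataclass

# Assumed conversion factors (d20-style: a round is 6 seconds).
ROUNDS_PER_UNIT = {"round": 1, "minute": 10, "hour": 600}

# Display templates per language; "fr" is deliberately incomplete
# so that the English fallback is exercised.
TEMPLATES = {
    "en": {
        "round": "{n} round",
        "minute": "{n} minute",
        "hour": "{n} hour",
        "per_level": "{base} per level",
    },
    "fr": {"hour": "{n} heure", "per_level": "{base} par niveau"},
}


@dataclass(frozen=True)
class Duration:
    """Internal form of a duration: never shown to the user directly."""
    amount: int
    unit: str          # internal unit key: "round", "minute" or "hour"
    per_level: bool = False


def duration_in_rounds(duration, caster_level):
    """Internal use: total duration in rounds for a given caster level."""
    levels = caster_level if duration.per_level else 1
    return duration.amount * ROUNDS_PER_UNIT[duration.unit] * levels


def display_duration(duration, language):
    """Displayed form: translated if possible, English otherwise."""
    english = TEMPLATES["en"]
    local = TEMPLATES.get(language, {})
    text = local.get(duration.unit, english[duration.unit]).format(n=duration.amount)
    if duration.per_level:
        text = local.get("per_level", english["per_level"]).format(base=text)
    return text


sample = Duration(amount=1, unit="hour", per_level=True)
print(duration_in_rounds(sample, 5))
print(display_duration(sample, "fr"))
print(display_duration(sample, "de"))
```

Here the displayed text is derived from the internal value rather than stored beside it, so the two cannot drift apart when the data changes.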
              • Tom Parker
                Message 7 of 11 , Jun 29, 2007
                  --- In pcgen_international@yahoogroups.com, Aubé Philippe
                  <tymophil@...> wrote:
                  > I think we should bear in mind that :
                  >
                  > 1. All that is to be displayed should be translated.

                  That is why I'm trying to gather a list of what needs to be displayed
                  or output. While I can take a pass from what I see in the code and
                  character sheets, an exhaustive code search is time consuming, and a
                  few more sets of eyes, plus some experience with translation, are
                  help I don't want to ignore.

                  > 2. Everything that is to be displayed should not be used internally.

                  I disagree. There are some items (such as height) that could be
                  stored internally in arbitrary units, but converted to meters or
                  feet/inches only when displayed. This would require a database of
                  translations for meter, etc., but is likely preferable to forcing each
                  locality to translate "2 meters" into their local dialect.
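
A minimal sketch of that idea, in Python for brevity: the height is stored once in a single internal unit and converted only at display time. The choice of centimeters as the internal unit, the `UNIT_LABELS` table, and the function name are all assumptions made for illustration, and the French unit abbreviations are examples only.

```python
CM_PER_INCH = 2.54

# Hypothetical table of translated unit labels.
UNIT_LABELS = {
    "en": {"meter": "m", "foot": "ft", "inch": "in"},
    "fr": {"meter": "m", "foot": "pi", "inch": "po"},
}


def format_height(height_cm, language, system):
    """Format an internally stored height for display.

    height_cm: height in the assumed internal unit (centimeters).
    language:  key into UNIT_LABELS; unknown languages fall back to English.
    system:    "metric" or "imperial".
    """
    labels = UNIT_LABELS.get(language, UNIT_LABELS["en"])
    if system == "metric":
        return f"{height_cm / 100:.2f} {labels['meter']}"
    # Imperial: round to the nearest inch, then split into feet and inches.
    total_inches = round(height_cm / CM_PER_INCH)
    feet, inches = divmod(total_inches, 12)
    return f"{feet} {labels['foot']} {inches} {labels['inch']}"


print(format_height(183, "en", "imperial"))
print(format_height(183, "fr", "imperial"))
print(format_height(200, "fr", "metric"))
```

Only the handful of unit labels needs translating, not every height string. Number formatting (for example a decimal comma) is left out of this sketch and would need its own handling.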

                  > 3. There should be as few files as possible and the link between initial
                  > file and translation "holder" should be clear enough for everyone to be
                  > able to modify the translation easily.

                  ...while at the same time not creating so much confusion for someone
                  creating a custom dataset that they have to do contortions to get a
                  personal dataset in only their home language.

                  > 4. Changing tags should not affect the translation.

                  Agreed

                  > 5. Sorting should take into account accents.

                  A GUI and Output Sheet item, already FREQed, and outside the scope of
                  my current inquiry, but I agree.
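
For anyone unsure what the accent problem looks like in practice, here is a small Python sketch. It uses a deliberately simplified key (Unicode decomposition with the accent marks dropped) rather than real locale-aware collation, so treat it as an illustration of the symptom and not as the fix that was requested. The function name is hypothetical.

```python
import unicodedata


def accent_insensitive_key(text):
    """Sort key that ignores accents and case.

    The text is decomposed (NFD) so that accents become separate combining
    characters, which are then dropped. The original text is kept as a
    tie-breaker so that the ordering stays deterministic.
    """
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (stripped.casefold(), text)


names = ["Zone", "Épée", "Arc", "Elfe"]
plain = sorted(names)                               # code-point order
aware = sorted(names, key=accent_insensitive_key)   # accents ignored
print(plain)
print(aware)
```

With plain code-point ordering "Épée" lands after "Zone"; with the accent-insensitive key it sorts between "Elfe" and "Zone". Real collation rules differ by language, so a proper implementation would use a locale-aware collator instead of this shortcut.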

                  > Let me be clearer on points 1 and 2.
                  >
                  > Spells have an awful long list of factors, most need to be displayed AND
                  > are used internally. There should be no such thing. The duration of a
                  > spell should be in two ratings : the duration as used internally and the
                  > duration as displayed. Am I clear ?

                  I get the point; however, many of the spell items at the moment are
                  just Strings, so this isn't a concern.

                  > Maybe it would be a good idea to split the files into several files: one
                  > file holding internal information, another one holding displayed
                  > information along with the translations (maybe one file per language).
                  > Each new PCGen version should simply modify the "internal information"
                  > part, not the translation one. When the translation doesn't exist, then
                  > the english displayed information should be used.

                  I think this is a bad idea, because it makes simple home-brew data
                  unnecessarily complicated by introducing multiple files. I believe
                  there are alternative solutions that do not require this complexity.
                  I'll explain more once I get a chance to dig up Aaron's previous
                  suggestions and make sure I'm not missing anything.

                  TP.