
Re: [JSX] JSX on Java 1.7: issue with java.util.Locale

  • Brendan Macmillan
    Message 1 of 12 , Oct 3, 2012
      Hi Joël,

      > I also tried to serialize the result and then deserialize it back, and
      > it looks fine. If you need me to make additional trials, let me know!

      Great, thanks for that. I'm pretty sure that that fix is right; I've
      gone through the JOS specification now; and tried several test cases;
      and both confirm it. I still need to check for interactions with the
      rest of the code base.

      Just to confirm - this isn't urgent for you? It might be best for me
      to spend a few days on it.


      cheers,
      Brendan

      > On Wed, 2012-10-03 at 02:48 +1000, Brendan Macmillan wrote:
      >>
      >> Hi Joël,
      >>
      >> I haven't thoroughly checked the JOS spec yet, but I have a trial
      >> bugfix, which works for my reproduction of the issue - would you mind
      >> getting some info from your codebase please, as a preliminary step?
      >> The jar is on this page:
      >>
      >> http://tech.groups.yahoo.com/group/JSX-ideas/files/Locale_test_bugfix/
      >>
      >> It should print out the 6 fields involved in Locale when it
      >> deserializes (and only once), like this:
      >> 3/10/2012 BUGFIX virtual field: language
      >> 3/10/2012 BUGFIX virtual field: country
      >> 3/10/2012 BUGFIX virtual field: variant
      >> 3/10/2012 BUGFIX virtual field: hashcode
      >> 3/10/2012 BUGFIX virtual field: script
      >> 3/10/2012 BUGFIX virtual field: extensions
      >>
      >> Please let me know if there are any other lines with BUGFIX.
      >> And of course, hopefully there aren't any exceptions.
      >>
      >> Note: this is just a temp experiment, I still need to do more tests
      >> and go through the JOS spec to confirm it is correct.
      >>
      >> cheers,
      >> Brendan
    • Joël Bourquard
      Message 2 of 12 , Oct 4, 2012
        Hi Brendan,

        It's a pleasure! I'm glad that the first fix is working. Well done :-)

        That's not urgent because we need to fix the other issues we have with
        Java 7 as well, but if you could look into it some time within the next
        10 days or so, it would be awesome.

        Thank you!

        Joël



      • Brendan Macmillan
        Message 3 of 12 , Oct 4, 2012
          Hi Joël,

          Thanks for being so understanding, and having a specific time frame.
          That's helpful for scheduling.

          I thought it would be useful to record the problem and cause below, to
          help clarify it.
          You might find it interesting - but please don't feel any pressure to
          read it. It's long.

          cheers,
          Brendan


          I'd like to explain the issue, in these steps:
          1. the evolution of Locale between Java 1.6 and 1.7
          2. how JOS handles that class evolution
          3. how JSX handled it differently from JOS - that is, the bug
          4. finally, why this problem hasn't come up before in the 10
          commercial years of JSX


          1. Class evolution of Locale:
          Two fields were added to Locale (script and extensions) - but not in
          the way you would expect...

          The old Locale had 4 serializable fields (that is, non-static and
          non-transient):
          language
          country
          variant
          hashcode

          But the new Locale doesn't have any such fields! Instead, it defines 6
          serializable fields, like this:
              private static final ObjectStreamField[] serialPersistentFields = {
                  new ObjectStreamField("language", String.class),
                  new ObjectStreamField("country", String.class),
                  new ObjectStreamField("variant", String.class),
                  new ObjectStreamField("hashcode", int.class),
                  new ObjectStreamField("script", String.class), // <-- new
                  new ObjectStreamField("extensions", String.class), // <-- new
              };

          I think of these as "virtual fields" (meaning fake, faux, pretend),
          but that's not an official name. They are serialized by JOS and JSX
          exactly as if they were actual fields, so you can't tell they are
          virtual just by looking at the XML. They enable a class to pretend to
          have different fields. They are a compatibility layer, enabling you to
          change the actual fields of a class, while still making it act as if it
          had the old fields. In other words, the serialized form is a kind of
          public API or interface; these virtual fields give you a loose
          coupling between that public interface and the implementation details.
          It's a kind of "information hiding", as in encapsulation, Abstract
          Data Types, and Parnas' paper on module decomposition
          http://en.wikipedia.org/wiki/Information_hiding#History
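
As an illustration (not part of the original message), here is a minimal sketch of a class whose serialized form is declared by spf rather than by its instance fields. The class Tag and the helper serialFieldNames are hypothetical names made up for this example; only ObjectStreamClass and ObjectStreamField are standard JDK classes. It asks JOS which fields it considers serializable for the class:

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.io.Serializable;
import java.util.Set;
import java.util.TreeSet;

public class VirtualFieldsDemo {

    /** Hypothetical class: its only real state is transient, as in the 1.7 Locale. */
    static class Tag implements Serializable {
        private static final long serialVersionUID = 1L;

        // The serialized form is declared here, independently of the instance fields.
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("language", String.class),
            new ObjectStreamField("hashcode", int.class),
        };

        // The actual state; transient, so it is never serialized directly.
        private transient String[] parts = {"en"};
    }

    /** Returns the names of the serializable fields that JOS sees for Tag. */
    static Set<String> serialFieldNames() {
        Set<String> names = new TreeSet<>();
        for (ObjectStreamField f : ObjectStreamClass.lookup(Tag.class).getFields()) {
            names.add(f.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Neither name is an actual instance field of Tag.
        System.out.println(serialFieldNames()); // prints [hashcode, language]
    }
}
```

The point of the sketch is only that the names reported by JOS come from spf, even though Tag has no instance fields called language or hashcode.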

          [ BTW: So, if Locale lacks these fields, how does it store its
          information? It uses a BaseLocale object (sun.util.locale.BaseLocale)
          in a transient field, so it's not serialized. Being in the "sun"
          hierarchy, I thought source wasn't available, but here's the openJDK
          implementation:
          http://www.docjar.com/html/api/sun/util/locale/BaseLocale.java.html ]


          2. JOS and class evolution
          The problem occurs when XML serialized from the 1.6 Locale, is
          deserialized into the 1.7 Locale.

          Although the new Locale doesn't have any actual instance fields, it
          has specified its virtual fields with spf (above). That is, Locale is
          pretending that it has those 6 fields listed above. Note that these 6
          fields are stating what fields Locale has - not necessarily the fields
          in the serialized data.

          Before looking at evolution, let's consider the simpler case, where
          all 6 expected fields were serialized. How can Locale deserialize
          them, when it doesn't actually have those fields, to deserialize those
          values into? It has a special "readObject" method that JOS calls, in
          which these lines read in the serialized values:
              ObjectInputStream.GetField fields = in.readFields();
              String language = (String)fields.get("language", "");
              String script = (String)fields.get("script", "");
              String country = (String)fields.get("country", "");
              String variant = (String)fields.get("variant", "");
              String extStr = (String)fields.get("extensions", "");

          For example, the first "get()" call retrieves the "language" value
          from the serialized data, and puts it into a local variable, also
          named language.

          The second argument of "get()" is a default value, which "get()" will
          return if that field was not present in the serialized data.
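
To make the mechanics concrete, here is a small round-trip sketch using plain JOS (illustrative only, not JSX and not the real Locale code). The class Tag, its "joined" field, and the helper roundTrip are assumptions invented for this example; putFields(), writeFields(), readFields() and GetField.get(name, default) are the standard java.io calls:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class GetFieldDemo {

    /** Hypothetical class that stores "language_country" in one transient field. */
    static class Tag implements Serializable {
        private static final long serialVersionUID = 1L;

        // Virtual fields: the serialized form pretends there are two String fields.
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("language", String.class),
            new ObjectStreamField("country", String.class),
        };

        private transient String joined;

        Tag(String language, String country) {
            this.joined = language + "_" + country;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            String[] p = joined.split("_", 2);
            ObjectOutputStream.PutField fields = out.putFields();
            fields.put("language", p[0]);
            fields.put("country", p[1]);
            out.writeFields();
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            // readFields() parses all serialized fields up front.
            ObjectInputStream.GetField fields = in.readFields();
            // The second argument is returned if the field is absent from the stream.
            String language = (String) fields.get("language", "");
            String country = (String) fields.get("country", "");
            this.joined = language + "_" + country;
        }
    }

    /** Serializes a Tag to bytes, reads it back, and returns its internal state. */
    static String roundTrip(String language, String country) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Tag(language, country));
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return ((Tag) in.readObject()).joined;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("en", "AU")); // prints en_AU
    }
}
```

Here both fields are present in the stream, so the defaults are never used; they only matter once the stream was written by a version of the class with fewer fields.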

          [ BTW: You might notice there are only 5 fields here, not 6 as in the
          spf. Although both new fields are present ("script" and "extensions"),
          "hashcode" is not. This is because the 1.7 Locale doesn't use
          "hashcode", and so just ignores it. This does no harm, because
          "readFields()" has already parsed all the serialized fields into
          fields (which acts like a Map). So the get()s are just getting from a
          Map, and not parsing, thus it doesn't matter whether it reads all or
          any of the values. (The reason "hashcode" is in spf is so it is
          written out in serialization. This enables old versions of Locale to
          get the data they need - it's maintaining the contract of the public
          API, for back-compatibility). ]

          Now let's consider evolution. In this case, the evolution is that the
          new Locale has two fields ("script" and "extensions") that aren't
          present in the serialized data. It is exactly as you would expect:
          "get()" instead returns the default value of its second argument.


          3. JSX's bug
          JSX implements most of this correctly, so that "get()" will retrieve
          the serialized fields. It also implements the default correctly: if
          the requested field isn't present in the serialized data, the default
          is returned instead. The problem is in an extra check JSX does.

          In this situation, the JOS specification requires that if you attempt
          to "get()" a field that is not in the spf, an exception is thrown.
          Although this is a bit fussy, it does ensure that the spf accurately
          represents the fields that you can read - it enforces the field API.
          JSX also checks this correctly.

          The bug is one step further back: JSX wrongly determines what the
          fields are. From my comments, I thought that all fields in the spf had
          to *also* be actual instance fields - which, having read this far,
          you'll know is incorrect. JSX actually explicitly tested each virtual
          field in spf, and if it's not an actual field, JSX ignores it.

          So, when the 1.7 Locale tries to "get()" the serialized "script"
          field, JSX sees that it's not present, but before returning the
          default value, it checks that "script" is in spf. But, alas, because
          "script" is not an actual field, it was ignored - and so the check
          fails, and the exception is thrown.
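
The sequence above can be modelled in a few lines. This is a deliberately simplified sketch, not JSX's actual code: the class Fields and the helpers declaredBuggy, declaredFixed and readScript are hypothetical names, and a plain Map stands in for the parsed serialized data.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SpfCheckDemo {

    /** Hypothetical stand-in for GetField: parsed values plus the declared field names. */
    static final class Fields {
        private final Map<String, Object> parsed;
        private final Set<String> declared;

        Fields(Map<String, Object> parsed, Set<String> declared) {
            this.parsed = parsed;
            this.declared = declared;
        }

        Object get(String name, Object defaultValue) {
            if (parsed.containsKey(name)) {
                return parsed.get(name);
            }
            // Field absent from the data: check it is declared before defaulting.
            if (!declared.contains(name)) {
                throw new IllegalArgumentException("no such field: " + name);
            }
            return defaultValue;
        }
    }

    /** Buggy rule: keep only the spf names that are also actual instance fields. */
    static Set<String> declaredBuggy(List<String> spf, Set<String> actualFields) {
        Set<String> names = new LinkedHashSet<>(spf);
        names.retainAll(actualFields);
        return names;
    }

    /** Fixed rule: every name listed in spf counts as declared. */
    static Set<String> declaredFixed(List<String> spf) {
        return new LinkedHashSet<>(spf);
    }

    /** Reads "script" from data that only has the four 1.6 fields. */
    static String readScript(Set<String> declared) {
        Map<String, Object> oldData = new HashMap<>();
        oldData.put("language", "en");
        oldData.put("country", "AU");
        oldData.put("variant", "");
        oldData.put("hashcode", 0);
        return (String) new Fields(oldData, declared).get("script", "");
    }

    public static void main(String[] args) {
        List<String> spf = Arrays.asList(
            "language", "country", "variant", "hashcode", "script", "extensions");

        // Fixed rule: "script" is declared, so the default "" comes back.
        System.out.println("fixed: [" + readScript(declaredFixed(spf)) + "]");

        // Buggy rule: the 1.7 Locale has no matching instance fields, so nothing
        // is declared and the lookup of the absent "script" field throws.
        try {
            readScript(declaredBuggy(spf, Collections.<String>emptySet()));
        } catch (IllegalArgumentException e) {
            System.out.println("buggy: " + e.getMessage());
        }
    }
}
```

Note that in this model "language" is still readable under the buggy rule, because the declared-names check only runs when the value is missing from the data. That matches why the problem only shows up for the two new fields.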

          "How could I make such a mistake?" you ask. Well, I think it's because
          spf are also used in another way: in simple default deserialization,
          when you don't have any special custom code, spf is used to determine
          which subset of actual fields have values read into them. It is an
          alternative to transient: instead of marking which fields should not
          be serialized with transient, you specify which fields should be
          serialized. This is convenient if you have many instance fields but
          only a few are serialized. In that case, JSX's check is sensible.
          However, the way JOS seems to work is that it just tries to do simple
          default deserialization. If a field is not present in the serialized
          data, it just ignores the issue. It's the same behaviour as when spf
          is not used, and the evolution is a simple adding of actual fields:
          those fields are not present in the serialized data, but JOS just
          silently moves on, leaving the instance field at its default value
          for its type (e.g. null for reference fields; 0 for ints etc).
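
For comparison, here is a sketch of that other use of spf in plain JOS: selecting a subset of actual instance fields for default serialization, as an alternative to marking the rest transient. The class Account and the helper describeAfterRoundTrip are hypothetical names for this example only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class SpfSubsetDemo {

    /** Hypothetical class using spf to pick which actual fields are serialized. */
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;

        // Only these two actual fields are written; no custom writeObject/readObject.
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("owner", String.class),
            new ObjectStreamField("balance", int.class),
        };

        String owner;
        int balance;
        String cachedLabel; // not transient, but not listed in spf either

        Account(String owner, int balance) {
            this.owner = owner;
            this.balance = balance;
            this.cachedLabel = owner + ":" + balance;
        }
    }

    /** Round-trips an Account through JOS and describes the resulting copy. */
    static String describeAfterRoundTrip(String owner, int balance) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(new Account(owner, balance));
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                Account copy = (Account) in.readObject();
                return copy.owner + "/" + copy.balance + "/" + copy.cachedLabel;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // cachedLabel was never written, so it comes back as the type default (null).
        System.out.println(describeAfterRoundTrip("joel", 42)); // prints joel/42/null
    }
}
```

In this usage every name in spf does correspond to an actual field, which is the situation where a check like JSX's looks reasonable.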



          4. Why hasn't this bug come up before?
          It seems to be a rare case. There's no bug for class evolution where
          spf isn't used. And there's no bug for class evolution where spf is
          used, but the fields it specifies have not changed. This is an
          important use of spf: the specification talks about spf being used to
          keep the serial format the same, despite changes in the internal
          implementation. That's also how it is illustrated in the
          specification's example (File.java on the last page). This is
          information hiding, where the public interface of an API doesn't
          change, but the implementation does.

          The bug only occurs when spf is used, and the fields it specifies
          differ from those that were serialized - that is, it's an evolution of
          the "API" or interface that spf defines, rather than only an evolution
          of the internal details. This is reasonable - it's analogous to a
          public API having new methods added to it. It's certainly a valid and
          important use of the JOS specification that JSX should handle; it's
          just that it doesn't seem to be that commonly used.

          Locale in Java 1.7 might be the only class to use spf to evolve its
          fields, and so trigger this bug.

          Note that JSX has no problem deserializing to a 1.7 Locale, when it
          was also serialized by 1.7 Locale - the bug only affects data
          serialized from a previous version of Locale. Thus, transient
          persistence and transmission to another copy of the same version would
          not have triggered this bug.

          Still, JSX has been used commercially for 10 years (since Oct 2002),
          so it does seem a little surprising that it hasn't come up before.