
Re: [JSX] JSX on Java 1.7: issue with java.util.Locale

  • Joël Bourquard
    Message 1 of 12 , Oct 4, 2012
      Hi Brendan,

      It's a pleasure! I'm glad that the first fix is working. Well done :-)

      That's not urgent because we need to fix the other issues we have with
      Java 7 as well, but if you could look into it some time within the next
      10 days or so, it would be awesome.

      Thank you!

      Joël



      On Thu, 2012-10-04 at 06:34 +1000, Brendan Macmillan wrote:
      >
      > Hi Joël,
      >
      > > I also tried to serialize the result and then deserialize it back,
      > and
      > > it looks fine. If you need me to make additional trials, let me
      > know!
      >
      > Great, thanks for that. I'm pretty sure that that fix is right; I've
      > gone through the JOS specification now; and tried several test cases;
      > and both confirm it. I still need to check for interactions with the
      > rest of the code base.
      >
      > Just to confirm - this isn't urgent for you? It might be best for me
      > to spend a few days on it.
      >
      > cheers,
      > Brendan
      >
      > > On Wed, 2012-10-03 at 02:48 +1000, Brendan Macmillan wrote:
      > >>
      > >> Hi Joël,
      > >>
      > >> I haven't thoroughly checked the JOS spec yet, but I have a trial
      > >> bugfix, which works for my reproduction of the issue - would you
      > mind
      > >> getting some info from your codebase please, as a preliminary step?
      > >> The jar is on this page:
      > >>
      > >>
      > http://tech.groups.yahoo.com/group/JSX-ideas/files/Locale_test_bugfix/
      > >>
      > >> It should print out the 6 fields involved in Locale when it
      > >> deserializes (and only once), like this:
      > >> 3/10/2012 BUGFIX virtual field: language
      > >> 3/10/2012 BUGFIX virtual field: country
      > >> 3/10/2012 BUGFIX virtual field: variant
      > >> 3/10/2012 BUGFIX virtual field: hashcode
      > >> 3/10/2012 BUGFIX virtual field: script
      > >> 3/10/2012 BUGFIX virtual field: extensions
      > >>
      > >> Please let me know if there are any other lines with BUGFIX.
      > >> And of course, hopefully there aren't any exceptions.
      > >>
      > >> Note: this is just a temp experiment, I still need to do more tests
      > >> and go through the JOS spec to confirm it is correct.
      > >>
      > >> cheers,
      > >> Brendan
      > >>
      > >>
      > >>
      > >>
    • Brendan Macmillan
      Message 2 of 12 , Oct 4, 2012
        Hi Joël,

        Thanks for being so understanding, and having a specific time frame.
        That's helpful for scheduling.

        I thought it would be useful to record the problem and cause below, to
        help clarify it.
        You might find it interesting - but please don't feel any pressure to
        read it. It's long.

        cheers,
        Brendan


        I'd like to explain the issue, in these steps:
        1. the evolution of Locale between Java 1.6 and 1.7
        2. how JOS handles that class evolution
        3. how JSX handled it differently from JOS - that is, the bug
        4. finally, why this problem hasn't come up before in the 10
        commercial years of JSX


        1. Class evolution of Locale:
        Two fields were added to Locale (script and extensions) - but not in
        the way you would expect...

        The old Locale had 4 serializable fields (that is, non-static and
        non-transient):
        language
        country
        variant
        hashcode

        But the new Locale doesn't have any such fields! Instead, it defines 6
        serializable fields, like this:
        private static final ObjectStreamField[] serialPersistentFields = {
        new ObjectStreamField("language", String.class),
        new ObjectStreamField("country", String.class),
        new ObjectStreamField("variant", String.class),
        new ObjectStreamField("hashcode", int.class),
        new ObjectStreamField("script", String.class), // <-- new
        new ObjectStreamField("extensions", String.class), // <-- new
        };
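
        To check this on your own JVM, here is a small sketch that asks JOS
        which serializable fields it sees for a class. It only uses
        ObjectStreamClass from java.io; the class name LocaleSerialFields and
        the helper serialFieldNames are made up for this illustration. On a
        Java 7 or later runtime I would expect the list for Locale to include
        "script" and "extensions".

```java
import java.io.ObjectStreamClass;
import java.io.ObjectStreamField;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

public class LocaleSerialFields {

    // Names of the fields that JOS treats as the serialized form of a class.
    // When the class declares serialPersistentFields, these come from that array.
    static Set<String> serialFieldNames(Class<?> type) {
        Set<String> names = new TreeSet<>();
        ObjectStreamClass desc = ObjectStreamClass.lookup(type);
        if (desc == null) {
            return names; // lookup returns null for classes that are not serializable
        }
        for (ObjectStreamField field : desc.getFields()) {
            names.add(field.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(serialFieldNames(Locale.class));
    }
}
```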

        I think of these as "virtual fields" (meaning fake, faux, pretend),
        but that's not an official name. They are serialized by JOS and JSX
        exactly as if they were actual fields, so you can't tell they are
        virtual just by looking at the XML. They enable a class to pretend to
        have different fields. They are a compatibility layer, enabling you to
        change the actual fields of a class, while still making it act as if it
        had the old fields. In other words, the serialized form is a kind of
        public API or interface; these virtual fields give you a loose
        coupling between that public interface and the implementation details.
        It's a kind of "information hiding", as in encapsulation, Abstract
        Data Types, and Parnas' paper on module decomposition
        http://en.wikipedia.org/wiki/Information_hiding#History
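
        To make the idea concrete, here is a minimal sketch of a class with
        virtual fields. Tag is a made-up class (it is not from the JDK or from
        JSX): its only real state is one transient String, but its serialized
        form claims two fields, "language" and "country". The custom
        writeObject and readObject translate between the two.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class VirtualFieldsDemo {

    /** Hypothetical class whose serialized fields are not actual instance fields. */
    static class Tag implements Serializable {
        private static final long serialVersionUID = 1L;

        // The serialized form: two "virtual" String fields.
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("language", String.class),
            new ObjectStreamField("country", String.class),
        };

        // The actual storage. It is transient, so it is never serialized directly.
        private transient String combined; // e.g. "fr_CH"

        Tag(String language, String country) {
            this.combined = language + "_" + country;
        }

        String combined() {
            return combined;
        }

        private void writeObject(ObjectOutputStream out) throws IOException {
            // Fill in the virtual fields from the real state, then write them.
            ObjectOutputStream.PutField fields = out.putFields();
            String[] parts = combined.split("_", 2);
            fields.put("language", parts[0]);
            fields.put("country", parts[1]);
            out.writeFields();
        }

        private void readObject(ObjectInputStream in)
                throws IOException, ClassNotFoundException {
            // Read the virtual fields and rebuild the real state from them.
            ObjectInputStream.GetField fields = in.readFields();
            String language = (String) fields.get("language", "");
            String country = (String) fields.get("country", "");
            combined = language + "_" + country;
        }
    }

    // Serialize to a byte array with JOS and read it straight back.
    static Tag roundTrip(Tag tag) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(tag);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Tag) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(new Tag("fr", "CH")).combined()); // prints fr_CH
    }
}
```

        The storage could later change again (say, to two separate fields)
        without changing the serialized form, as long as the two methods keep
        producing and consuming "language" and "country".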

        [ BTW: So, if Locale lacks these fields, how does it store its
        information? It uses a BaseLocale object (sun.util.locale.BaseLocale)
        in a transient field, so it's not serialized. Being in the "sun"
        hierarchy, I thought source wasn't available, but here's the openJDK
        implementation:
        http://www.docjar.com/html/api/sun/util/locale/BaseLocale.java.html ]


        2. JOS and class evolution
        The problem occurs when XML serialized from the 1.6 Locale is
        deserialized into the 1.7 Locale.

        Although the new Locale doesn't have any actual instance fields, it
        has specified its virtual fields with spf (above). That is, Locale is
        pretending that it has those 6 fields listed above. Note that these 6
        fields are stating what fields Locale has - not necessarily the fields
        in the serialized data.

        Before looking at evolution, let's consider the simpler case, where
        all 6 expected fields were serialized. How can Locale deserialize
        them, when it doesn't actually have those fields, to deserialize those
        values into? It has a special "readObject" method that JOS calls, in
        which these lines read in the serialized values:
        ObjectInputStream.GetField fields = in.readFields();
        String language = (String)fields.get("language", "");
        String script = (String)fields.get("script", "");
        String country = (String)fields.get("country", "");
        String variant = (String)fields.get("variant", "");
        String extStr = (String)fields.get("extensions", "");

        For example, the first line get()s the "language" value from the
        serialized data, and puts it into a local variable, also named
        language.

        The second argument of "get()" is a default value, which "get()" will
        return if that field was not present in the serialized data.
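
        As a rough mental model only (this is not how JOS or JSX is
        implemented), the default behaviour of "get()" can be sketched with a
        plain Map standing in for the parsed fields. The class name, the
        helper methods and the sample values are all made up. The sample data
        deliberately lacks "script", as data written by a 1.6 Locale would.

```java
import java.util.HashMap;
import java.util.Map;

public class GetWithDefaultSketch {

    // Stand-in for get(name, default): return the value that was in the
    // serialized data, or the default if the field was not there at all.
    static Object get(Map<String, Object> streamFields, String name, Object defaultValue) {
        return streamFields.containsKey(name) ? streamFields.get(name) : defaultValue;
    }

    // The four fields a 1.6 Locale would have written (values are invented).
    static Map<String, Object> oldLocaleData() {
        Map<String, Object> data = new HashMap<>();
        data.put("language", "fr");
        data.put("country", "CH");
        data.put("variant", "");
        data.put("hashcode", -1);
        return data;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = oldLocaleData();
        String language = (String) get(fields, "language", ""); // present: "fr"
        String script = (String) get(fields, "script", "");     // absent: default ""
        System.out.println("language=" + language + ", script=[" + script + "]");
        // prints: language=fr, script=[]
    }
}
```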

        [ BTW: You might notice there are only 5 fields here, not 6 as in the
        spf. Although both new fields are present ("script" and "extensions"),
        "hashcode" is not. This is because the 1.7 Locale doesn't use
        "hashcode", and so just ignores it. This does no harm, because
        "readFields()" has already parsed all the serialized fields into
        "fields" (which acts like a Map). So the get()s are just getting from
        a Map, not parsing, and it doesn't matter whether the code reads all,
        some, or none of the values. (The reason "hashcode" is in spf is so
        that it is written out in serialization. This enables old versions of
        Locale to get the data they need - it's maintaining the contract of
        the public API, for back-compatibility.) ]

        Now let's consider evolution. In this case, the evolution is that the
        new Locale has two fields ("script" and "extensions") that aren't
        present in the serialized data. It is exactly as you would expect:
        "get()" instead returns the default value of its second argument.


        3. JSX's bug
        JSX implements most of this correctly, so that "get()" will retrieve
        the serialized fields. It also implements the default correctly: if
        the requested field isn't present in the serialized data, the default
        is returned instead. The problem is in an extra check JSX does.

        In this situation, the JOS specification requires that if you attempt
        to "get()" a field that is not in the spf, an exception is thrown.
        Although this is a bit fussy, it does ensure that the spf accurately
        represents the fields that you can read - it enforces the field API.
        JSX also checks this correctly.

        The bug is one step further back: JSX wrongly determined what the
        fields are. Going by my comments, I thought that all fields in the spf
        had to *also* be actual instance fields - which, having read this far,
        you'll know is incorrect. JSX explicitly tested each virtual field in
        spf, and if it wasn't an actual field, JSX ignored it.

        So, when the 1.7 Locale tries to "get()" the serialized "script"
        field, JSX sees that it's not present in the data, but before
        returning the default value, it checks that "script" is in spf. But,
        alas, because "script" is not an actual field, it was ignored - and so
        the check fails, and the exception is thrown.
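
        The failure can be sketched as follows. This is a reconstruction from
        the description above, not JSX's actual code: LocaleLike, the two
        "readable fields" helpers and the Map-based get() are all
        hypothetical. The only difference between the two variants is whether
        spf entries without a matching instance field are dropped.

```java
import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class SpfCheckSketch {

    /** Hypothetical stand-in for the 1.7 Locale: no instance field has an spf name. */
    static class LocaleLike {
        static final List<String> SPF_NAMES = Arrays.asList(
            "language", "country", "variant", "hashcode", "script", "extensions");
        transient Object baseLocale; // the only actual instance field
    }

    // Old behaviour as described: keep an spf entry only if an actual
    // field of the class has the same name.
    static Set<String> buggyReadableFields(Class<?> type, List<String> spfNames) {
        Set<String> actual = new LinkedHashSet<>();
        for (Field field : type.getDeclaredFields()) {
            actual.add(field.getName());
        }
        Set<String> readable = new LinkedHashSet<>(spfNames);
        readable.retainAll(actual);
        return readable;
    }

    // Corrected behaviour: every spf entry is readable, actual field or not.
    static Set<String> fixedReadableFields(List<String> spfNames) {
        return new LinkedHashSet<>(spfNames);
    }

    // get() with the extra check: asking for a name outside the readable
    // set is an error; otherwise a missing value falls back to the default.
    static Object get(Map<String, Object> data, Set<String> readable,
                      String name, Object defaultValue) {
        if (data.containsKey(name)) {
            return data.get(name);
        }
        if (!readable.contains(name)) {
            throw new IllegalArgumentException("no such field: " + name);
        }
        return defaultValue;
    }

    // Try to read "script" from data that lacks it, as 1.6 data would.
    static String outcome(Set<String> readable) {
        Map<String, Object> oldData = new HashMap<>();
        oldData.put("language", "fr");
        try {
            return "default: [" + get(oldData, readable, "script", "") + "]";
        } catch (IllegalArgumentException e) {
            return "exception";
        }
    }

    public static void main(String[] args) {
        // prints: exception
        System.out.println(outcome(buggyReadableFields(LocaleLike.class, LocaleLike.SPF_NAMES)));
        // prints: default: []
        System.out.println(outcome(fixedReadableFields(LocaleLike.SPF_NAMES)));
    }
}
```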

        "How could I make such a mistake?" you ask. Well, I think it's because
        spf is also used in another way: in simple default deserialization,
        when you don't have any special custom code, spf is used to determine
        which subset of actual fields have values read into them. It is an
        alternative to transient: instead of marking which fields should not
        be serialized with transient, you specify which fields should be
        serialized. This is convenient if you have many instance fields but
        only a few are serialized. In that case, JSX's check is sensible.

        However, the way JOS seems to work is that it just tries to do simple
        default deserialization. If a field is not present in the serialized
        data, it just ignores the issue. It's the same behaviour as when spf
        is not used and the evolution is a simple adding of actual fields:
        those fields are not present in the serialized data, but JOS just
        silently moves on, leaving the instance field at the default value for
        its type (e.g. null for reference fields; 0 for ints, etc.).
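
        Here is a small sketch of that other use of spf, as an alternative to
        transient with plain default serialization. Session is a made-up
        class: it has two actual fields, but its spf lists only "user", so
        "hits" is not written and comes back at its type's default.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamField;
import java.io.Serializable;

public class SpfSubsetDemo {

    /** Hypothetical class: spf selects a subset of the actual fields. */
    static class Session implements Serializable {
        private static final long serialVersionUID = 1L;

        // Only "user" belongs to the serialized form. "hits" is left out
        // without being marked transient.
        private static final ObjectStreamField[] serialPersistentFields = {
            new ObjectStreamField("user", String.class),
        };

        String user;
        int hits;

        Session(String user, int hits) {
            this.user = user;
            this.hits = hits;
        }
    }

    // Serialize with default JOS serialization and read it straight back.
    static Session roundTrip(Session session) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(session);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Session) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Session copy = roundTrip(new Session("joel", 42));
        System.out.println(copy.user + " " + copy.hits); // prints: joel 0
    }
}
```

        Here every spf entry does match an actual field, which is the
        situation the old JSX check assumed.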



        4. Why hasn't this bug come up before?
        It seems to be a rare case. There's no bug for class evolution where
        spf isn't used. And there's no bug for class evolution where spf is
        used, but the fields it specifies have not changed. This is an
        important use of spf: the specification talks about spf being used to
        keep the serial format the same, despite changes in the internal
        implementation. That's also how it is illustrated in the
        specification's example (File.java on the last page). This is
        information hiding, where the public interface of an API doesn't
        change, but the implementation does.

        The bug only occurs when spf is used, and the fields it specifies
        differ from those that were serialized - that is, it's an evolution of
        the "API" or interface that spf defines, rather than only an evolution
        of the internal details. This is reasonable - it's analogous to a
        public API having new methods added to it. It's certainly a valid and
        important use of the JOS specification that JSX should handle; it's
        just that it doesn't seem to be that commonly used.

        Locale in Java 1.7 might be the only class to use spf to evolve its
        fields, and so trigger this bug.

        Note that JSX has no problem deserializing to a 1.7 Locale, when it
        was also serialized by 1.7 Locale - the bug only affects data
        serialized from a previous version of Locale. Thus, transient
        persistence and transmission to another copy of the same version would
        not have triggered this bug.

        Still, JSX has been used commercially for 10 years (since Oct 2002),
        so it does seem a little surprising that it hasn't come up before.