Re: JSX and string serialization
- Hi Bent,
I think Xrayrivet might suit your needs better. It is based on the JSX
engine, but does not alias Strings. The XML is also much simpler. I would
appreciate if you would try it out, and tell me what problems you have. I
can then iterate quickly, to make it work for you. How does that sound?
http://www.jsx.org/xrayrivet/xrayrivet.html
It does not mention your usage, but it sounds like a good match.
Warning: It is *very* alpha, but it's easy for me to improve it quickly
because the completeness of JSX is right there under the hood. It's more a
question of what to do first.
You are right that Strings can't form circular references, but there can be
multiple references to the same String (as you saw). JSX needs to work this
way to be correct for serialization: Strings are aliased so that == works
correctly if there are multiple references to the same String.
BTW: here's how to tell if references are circular: keep a stack of the
objects above you (from root to the current node). A reference is circular
if and only if it refers to an object in that stack.
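A minimal sketch of that stack idea in Java. The Node class and the method names are made up for illustration; this is not how JSX does it internally. The "stack" is kept as an identity-based set of the objects on the path from the root to the current node, so a merely shared (multiple) reference is not mistaken for a circular one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class CycleCheck {
    /** Hypothetical graph node; stands in for any object holding references. */
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** True if some reference reachable from root points back at one of its own ancestors. */
    static boolean hasCycle(Node root) {
        // Identity-based set: objects are compared with ==, never equals().
        Set<Node> path = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        return visit(root, path);
    }

    private static boolean visit(Node node, Set<Node> path) {
        if (!path.add(node)) {
            return true; // already on the root-to-here path: circular reference
        }
        for (Node child : node.refs) {
            if (visit(child, path)) {
                return true;
            }
        }
        path.remove(node); // leaving this node, so pop it off the path
        return false;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node shared = new Node("shared");
        a.refs.add(b);
        a.refs.add(shared);
        b.refs.add(shared);               // two references to the same object, no cycle
        System.out.println(hasCycle(a));  // false
        shared.refs.add(a);               // now shared points back at the root
        System.out.println(hasCycle(a));  // true
    }
}
```

Note that this simple version revisits shared sub-graphs, which is fine for a sketch but would want a separate "already checked" set for large graphs.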
But I'm more interested in making useful products; and I think your usage of
JSX is a popular one.
> I have tried out JSX2 and found the following minor annoyance. While I
> realize that it works like this because of issues caused by circular
> data structures (and for all I know, perhaps JOS _requires_ it to be
> like this), I still wanted to mention it.
> If I serialize an object graph that has a number of empty strings in
> it, then the first string will be serialized with its value and the
> rest will just idref it. This proves inconvenient when I later want to
> edit the XML files and input real strings. In particular, if I want to
> edit the one string that everyone else is referencing, I have to
> remember to move the reference string to somewhere else or all other
> strings will also have changed.
> It would have been much more edit-friendly if all strings were always
> serialized with their full values regardless of whether or not other
> strings in the object graph happen to have the same value.
> While the same argumentation holds for all other data types as well, I
> imagine it is non-trivial to determine if any given object would cause
> a loop. Strings, however, are guaranteed not to cause loops since they
> don't actually refer to other objects and they can't be subclassed to
> do so either.
> Again, I'm not sure if this approach would break other aspects of
> serialization. That's for you to know and for me to speculate over :-)
> Bent D
> Bent Dalager - bcd@... - http://www.pvv.org/~bcd
> powered by emacs
- Hi Bent,
Note: I've cc'ed this to the list, because I think others may be interested.
> > I think Xrayrivet might suit your needs better. It is based on the JSX
> > engine, but does not alias Strings. The XML is also much simpler. I would
> > appreciate if you would try it out, and tell me what problems you have. I
> > can then iterate quickly, to make it work for you. How does that sound?
> > http://www.jsx.org/xrayrivet/xrayrivet.html
> I gave it a quick test run and found that I need more mappings than
> what it currently has. Specifically, it failed on primitives.
It should work on primitives (unless within an array - it doesn't handle
arrays at the moment).
> I am using JSX as a component in a development tool we're using
> internally, in which we work with moderately complex data structures
> (they're basically deeply nested structures generated from IDL struct
> declarations). As it is, it will at least need to support primitives,
> nested objects and nested arrays for it to be of any immediate use. I
> don't think our already pressed developers will have much patience
> with xrayrivet if it proves to have a lot of problems. (I'd say that
> we could try it out in a quiet period, but there aren't any :-)
It doesn't support arrays at the moment.
Primitives should be OK - unless, of course, they are within an array :-)
Nested objects should be OK, as well as primitive values.
I need to ask: do you have *any* cyclic or multiple references (apart from
the Strings), that you need preserved?
The goal of the project is to be able to map any Java object graph to any
XML document, and most XML documents don't provide a way to represent such
references, and introducing some mechanism (such as JSX's approach) would be
an error in terms of the XML document.
Arrays present two more problems for this goal:
(1). Arrays have a runtime length - but this can't always be recorded
explicitly, because many XML documents don't have an explicit length for
lists. In general, you also can't solve this by storing the length in a
separate mapping (or binding) document, because it can vary at runtime. The
"obvious" solution is to record the runtime length implicitly, in terms of
the components of the array. Just count them.
It's a little bit of work to implement this, because you have to do it for
each primitive type separately (mostly cut and paste tho, simulating
(2). Null values are needed by arrays of objects (for example, as <null/>) -
but many XML documents don't use a null element. The truth is, to map to
such a document, any null values found in the objects would be an error,
because there is nothing to map them to. Unfortunately, arrays of objects
quite commonly have an unused portion, of trailing nulls.
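To make point (1) concrete, here is a rough sketch of "just count them" for an int array, using the standard DOM parser. The element names <ints> and <int> and the class name are invented for the example; they are not xrayrivet's format. The length is never stored anywhere: it is recovered by counting the element children before allocating the array. The same two loops would have to be repeated for each other primitive type.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ImplicitLength {
    /** Reads <ints><int>..</int>..</ints> into an int[] sized by counting the children. */
    static int[] readInts(String xml) {
        NodeList children;
        try {
            children = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement()
                    .getChildNodes();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }

        // First pass: the array length is the number of element children
        // (whitespace between elements shows up as text nodes and is skipped).
        int length = 0;
        for (int i = 0; i < children.getLength(); i++) {
            if (children.item(i).getNodeType() == Node.ELEMENT_NODE) {
                length++;
            }
        }

        // Second pass: fill the array now that its size is known.
        int[] result = new int[length];
        int next = 0;
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                result[next++] = Integer.parseInt(child.getTextContent().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        int[] values = readInts("<ints><int>3</int> <int>1</int> <int>4</int></ints>");
        System.out.println(values.length); // 3
    }
}
```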
Of course, these considerations don't apply to your case, because you aren't
mapping to a target XML document. You just want to be able to enter Strings
by hand (IIUC)
> If, on the other hand, there is hope for getting it up to speed
> relatively easily, it is a somewhat more promising proposition. I am
> sure they _will_ appreciate the ease-of-edit they might be getting
> once it's ready for prime time. I've just finished a basic GUI-based
> editor for these data structures, though, and if they fall in love
> with that (I can only hope :-), they might not see the benefit of
> I would personally like to have the "just use emacs" fallback though,
> so I'll try to pitch it to them and see what they say. An
> easier-to-edit XML format would be very handy after all.
> How will licensing work for xrayrivet?
I'm not sure about this at the moment, but it would probably be the same as
> > It does not mention your usage, but it sounds like a good match.
> > Warning: It is *very* alpha, but it's easy for me to improve it quickly
> > because the completeness of JSX is right there under the hood. It's more a
> > question of what to do first.
> Is it "just" a question of writing the XSL scripts for it?
No - there is a declarative mapping that you write once, and which is used
for mapping in both directions (with XSL, you'd have to write two scripts).
Plus, the mapping is specifically for Java and XML, so it is much simpler
for this specific task. It's "XML databinding".
> Using XSL
> to morph JSX's output _does_ immediately strike me as a good idea, but
> I am somewhat wary of the complexity that might be involved in the XSL
> scripts. How readable do they become?
You can do it, but they aren't very readable. It depends on what you need to
do. There are example scripts in the JSX manual (towards the end) for
evolving classes; and other examples for XML databinding on the front page
(www.jsx.org). You can get some kind of a sense of the complexity.
> > JSX:
> > You are right that Strings can't form circular references, but there can be
> > multiple references to the same String (as you saw). JSX needs to work this
> > way to be correct for serialization: Strings are aliased so that == works
> > correctly if there are multiple references to the same String.
> I can see that you don't want the serialize->deserialize cycle to
> break the == operator. Many data structures may rely on it after
> all. Reading between the lines (and extrapolating a bit), I am
> guessing that I may get a long way if I put new String("") into my
> data structures rather than just "" ... I take it the refids only get
> inserted if string1==string2 and not necessarily if
> string1.equals(string2) ? (I build default instances of the IDL
> structs myself using reflection, so I control what initially goes into
Yes. I almost sent you a follow up last night, suggesting that; I'm glad my
explanation was clear enough for you to be able to put it to use right away.
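For anyone else on the list, a small illustration of the == versus equals() distinction being discussed. This is plain Java behaviour, not JSX code; the class and method names are invented. Two "" literals are the same interned object, while new String("") is a separate object with an equal value, so an identity-based serializer would see two distinct Strings here rather than one.

```java
import java.util.IdentityHashMap;
import java.util.Map;

class StringAliasing {
    /** Counts distinct String objects by identity (==), not by equals(). */
    static int distinctIdentities(String... values) {
        Map<String, Boolean> seen = new IdentityHashMap<>();
        for (String s : values) {
            seen.put(s, Boolean.TRUE);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        String a = "";
        String b = "";             // same interned literal as a
        String c = new String(""); // fresh object: equal value, different identity

        System.out.println(a == b);                      // true
        System.out.println(a == c);                      // false
        System.out.println(a.equals(c));                 // true
        System.out.println(distinctIdentities(a, b, c)); // 2
    }
}
```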
> I suppose I should just try it and see what happens :-)
Bottom line: if you do fit within the target goal above, then I estimate
1-2 weeks maximum until it's ready to go. But if you need some references
(cyclic or multiple), then it's a conflict with the above goal, and much as
I regret it, I can't do it as part of this particular project.
So, let me know. :-)
- Hi Bent,
I'll reply to your other comments in a separate email.
> > Perhaps a low-risk way for you to proceed is for neither of us to commit to
> > anything? If I implement something that you need, you could check it (which
> > I think is pretty quick?), iterating around this loop until it does what you
> > want.
> Yes, this seems quite doable.
Cool.
> > Questions:
> > 1. Why do you have arrays instead of collections? (are they of primitives?)
> I have a number of IDL files defining structs containing, among other
> things, IDL sequences. These get converted into Java classes by an
> IDL-to-Java compiler that we have no control over. The end result uses
> arrays and not collections to represent sequences (this may be
> required by the CORBA-to-Java mapping specification for all I know). I
> cannot change these classes. It is these IDL-originated structures
> that we want to edit in order to build arbitrary CORBA objects to send
> across the network for testing purposes.
> The arrays we use can hold primitives and they can hold objects.
> The primary reason I am using JSX in the first place is that the IDL
> compiler doesn't support making the generated classes Serializable and
> since I can't change the resulting code myself (well, I could, but it
> would be a nightmare) I needed something that could serialize any old
Interesting, thanks for the background!
JSX is non-intrusive, which is a great strength when you have to (or prefer
to) not change existing code.
> As we discussed previously, I believe that I can relax that particular
> requirement to "any old object with a non-cyclic member hierarchy".
> As a matter of interest, if you _do_ pass a cyclic hierarchy to
> xrayrivet, how will it react? Will it identify the problem and throw
> an exception?
Hmm... it is driven by JSX internally, so that these would be passed as
references. At the moment, xrayrivet just ignores references, but for a
final implementation, it should throw an exception (which would be
switchable on/off). IOW, this is polish, which is easy to deal with later.
> > 2. Do you have nulls in your object arrays? (even trailing)?
> There certainly can be. While our CORBA implementation doesn't support
> sending null values, there can be null values in the structures while
> the developer is building them and he might very well decide he wants
> to save such an unfinished structure to file to continue work on it
Just to be 100% clear and pedantic (because it makes a big difference later):
in structures, yes; but would there be nulls in *arrays*?
> Now, truth be told, I am somewhat ambivalent about letting the
> developer put nulls into the structures since it would be a mistake to
> have them there when trying to send the object over CORBA (and that is
> the whole point after all). While we may decide to remove this
> possibility in the future (after the developers have some experience
> with using the tool), it will likely stay in for at least several
So you only need it for development. OK - we'll see how this goes.
Additional Requirements Summary:
- arrays of primitives
- arrays of Objects
- structs with nulls in them
- arrays of Objects, with nulls in them (?)
- Hi Bent,
> > BTW: The Microsoft C# guy criticises Java's generics for being
> > inefficient in this way IIRC, but probably in many cases it just
> > doesn't matter. I mean, if you really want efficiency, use C. But
> > computers are just absurdly fast these days, so it usually makes
> > no discernable difference.
> I expect he's criticising the implicit casting that goes on in Java
> generics. In theory, casting is expensive, but I'm not convinced that
> this is the case in a single-inheritance system such as Java. Checking
> the correctness of a cast in Java should really be quite cheap if
> you're clever about it.
It's from the Artima article; he mentions casting, and also the
inefficiency of autoboxing:
# Anders Hejlsberg:
# For example, with Java generics, you don't actually get any of
# the execution efficiency that I talked about, because when you
# compile a generic class in Java, the compiler takes away the
# type parameter and substitutes Object everywhere. So the
# compiled image for List<T> is like a List where you use
# the type Object everywhere. Of course, if you now try
# to make a List<int>, you get boxing of all the ints.
# So there's a bunch of overhead there.
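A small sketch of the two effects he describes, as they show up in Java. The class and method names are invented for the example. Note that Java does not accept List<int> at all; you write List<Integer>, and each int is boxed into an Integer object when it is added.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    /** After erasure, differently parameterized lists share one runtime class. */
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> ints = new ArrayList<>();
        return strings.getClass() == ints.getClass();
    }

    /** An int stored in a List<Integer> is held as a boxed Integer object. */
    static boolean elementIsBoxed() {
        List<Integer> ints = new ArrayList<>();
        ints.add(42);               // autoboxing: int -> Integer
        Object first = ints.get(0); // what the list really holds is an Object reference
        return first instanceof Integer;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // true
        System.out.println(elementIsBoxed());   // true
    }
}
```

Whether that overhead matters in practice is a separate question that only measurement can answer.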
> Anyway, as you say, one really has to measure to find out for sure if
> it's an issue for any particular application. In your case, just
> copying the algorithm multiple times is probably a more efficient
> approach than profiling both solutions and then choosing one :-)
:-) Yes, I think so. I kind of like the possibility of discovering I'm
wrong, when I do profile it in future.
> > JSX and xrayrivet would be sold as separate products, so if you wanted both,
> > you would need two licenses. If you only wanted JSX or xrayrivet, then it
> > would be one license.
> As a practical issue, if you have xrayrivet won't you also effectively
> have JSX bundled with it? How would you prevent xrayrivet users from
> calling the JSX API directly? Does Java offer a solution for this in
> its security mechanisms (I don't think so, but am not entirely up to
> date on JAR features) or would you make a custom JSX (with everything
> having package scope instead of public, for instance) for bundling
> that was effectively uncallable from the outside?
I had the idea of making a single hidden version of JSX, that was
common to both, and having an extra wrapper class for JSX, with
public methods, that would only be present in the "JSX" jar.
This makes it easy to control at configuration time in ant. But it
does add an extra layer of complexity. Now that you raise it,
I think a runtime check is simpler and efficient (it's only checked
once per object graph). But I haven't given it much thought yet.
Thanks very much for thinking on this! :-)
hehehe I estimate that open source, without any worry at all
about security and business-related issues, is at least three
times easier than commercial software. If you aren't making
a reusable component, then it is three times easier again. And if
you forget ease of use, then it's yet another three times easier
(as esr claims often is the case). By this reckoning (3 x 3 x 3), such an
open source project is 27 times easier than commercial software.