Must multi-ref equivalence be preserved?

  • Sam Ruby
    Message 1 of 41 , Nov 28, 2001
      This question is motivated by the following test recently added to the
      SOAP4R client InteropResults:

      <?xml version="1.0" encoding="utf-8" ?>
      <env:Envelope xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                    xmlns:env="http://schemas.xmlsoap.org/soap/envelope/"
                    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
        <env:Body>
          <n2:echoStringArray env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
                              xmlns:n1="http://schemas.xmlsoap.org/soap/encoding/"
                              xmlns:n2="http://soapinterop.org/">
            <inputStringArray xsi:type="n1:Array" n1:arrayType="xsd:string[3]">
              <item href="#id538092000"></item>
              <item>SOAP4R</item>
              <item href="#id538092000"></item>
            </inputStringArray>
          </n2:echoStringArray>
          <item xsi:type="xsd:string" id="id538092000"
                env:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/"
                xmlns:n3="http://schemas.xmlsoap.org/soap/encoding/">SOAP4R</item>
        </env:Body>
      </env:Envelope>

      Note that all three elements contain the same string of characters, but the
      first and the third are expressed in a way that makes them actually
      equivalent (i.e., the same objects).
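
To make the distinction concrete, here is a minimal Java sketch of what "the same objects" would mean on the receiving side. It is not taken from SOAP4R or Axis; the class and method names (MultiRefIdentity, distinctIdentities) are hypothetical, and the array is built by hand to stand in for whatever a deserializer would produce.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical illustration only; not SOAP4R or Axis code.
class MultiRefIdentity {
    // Counts distinct object identities (not distinct values) in the array.
    static int distinctIdentities(Object[] items) {
        Map<Object, Boolean> seen = new IdentityHashMap<>();
        for (Object item : items) {
            seen.put(item, Boolean.TRUE);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        // "shared" stands in for the multi-ref item with id538092000,
        // "inline" for the second item, which carries its own copy.
        String shared = new String("SOAP4R");
        String inline = new String("SOAP4R");

        // A receiver that preserves multi-ref equivalence.
        String[] preserved = { shared, inline, shared };
        // A receiver that interns every string, collapsing all three.
        String[] interned = { shared.intern(), inline.intern(), shared.intern() };

        System.out.println(preserved[0] == preserved[2]);      // true: same object
        System.out.println(preserved[0] == preserved[1]);      // false: equal value only
        System.out.println(preserved[0].equals(preserved[1])); // true
        System.out.println(distinctIdentities(preserved));     // 2
        System.out.println(distinctIdentities(interned));      // 1
    }
}
```

All three elements compare equal by value in both arrays; only the identity relationships differ, and that is the information an echo would have to carry back to preserve the equivalence.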

      From an implementation point of view, some languages or platforms may treat
      strings as primitives, others as objects. Some may do automatic interning,
      others may not. These can affect how the input is canonicalized. Such
      canonicalization is done in many instances. For example, Axis will
      canonicalize certain Unicode characters into "&#nnnnn;" equivalents;
      the boolean "0" will become "false"; the float "1.0E+01" will become
      "10.0"; dates in other time zones will be adjusted to GMT; etc. And as
      has been discussed at length on this mailing list, some implementations
      will treat nulls and omissions as interchangeable.
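
A rough sketch of this kind of lexical canonicalization, written against the modern JDK rather than anything in Axis. The class name LexicalCanon and its methods are hypothetical, and the sample date is made up for illustration; the point is only that parsing into the value space and re-serializing loses the original lexical form.

```java
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; not the Axis implementation.
class LexicalCanon {
    // xsd:boolean allows "true", "false", "1" and "0"; emit "true"/"false".
    static String canonicalBoolean(String lexical) {
        switch (lexical.trim()) {
            case "1":
            case "true":
                return "true";
            case "0":
            case "false":
                return "false";
            default:
                throw new IllegalArgumentException("not an xsd:boolean: " + lexical);
        }
    }

    // Round-trips through a Java float, so the exponent form is lost.
    static String canonicalFloat(String lexical) {
        return Float.toString(Float.parseFloat(lexical.trim()));
    }

    // Re-expresses a timestamp with an explicit offset as the same instant in UTC.
    static String toUtc(String lexical) {
        OffsetDateTime utc = OffsetDateTime.parse(lexical.trim())
                .withOffsetSameInstant(ZoneOffset.UTC);
        return DateTimeFormatter.ISO_OFFSET_DATE_TIME.format(utc);
    }

    public static void main(String[] args) {
        System.out.println(canonicalBoolean("0"));            // false
        System.out.println(canonicalFloat("1.0E+01"));        // 10.0
        System.out.println(toUtc("2001-11-28T12:00:00-05:00")); // 2001-11-28T17:00:00Z
    }
}
```

In each case the echoed text differs from what was sent while the value is unchanged, which is why a strict character-for-character comparison of echo results would report false failures.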

      From an interop point of view, accommodation of alternate canonicalizations
      is a highly desirable feature.

      The implementations I am working on are Java based, so String is an object
      and I have control over canonicalization. This means that it would not be
      overly difficult for me to adjust my implementation to accommodate this.
      The reasons I am not currently choosing to do so are twofold. The first is
      performance, but I certainly wouldn't give that higher importance than
      correctness. The second is interoperability - it seems that not all
      implementations support hrefs equally well, so we choose to avoid using
      this feature when possible.

      In fact, the current Axis implementation's strategy is essentially to only
      use hrefs when the data is potentially cyclic. Our heuristics for this at
      the moment are pretty simple and fast, meaning that we use hrefs for the
      echoStruct and echoStructArray tests even though they are provably not
      cyclic. Our interoperability with the Phalanx and TclSOAP endpoints
      suffers as a result.
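
For comparison, an exact (rather than heuristic) check is possible at the cost of a full pre-pass over the data. The sketch below is not the Axis strategy described above; Node and HrefPlanner are hypothetical names, and Node is a stand-in for whatever object model a serializer walks. It reports only the nodes that are reached through more than one reference, which are the ones that would need an id/href pair.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical object model: a named node with child references.
class Node {
    final String name;
    final List<Node> children = new ArrayList<>();

    Node(String name) {
        this.name = name;
    }
}

// Hypothetical illustration only; not the Axis heuristic.
class HrefPlanner {
    // Returns the nodes reachable from root through more than one reference.
    static List<Node> multiReferenced(Node root) {
        Map<Node, Integer> refCounts = new IdentityHashMap<>();
        List<Node> shared = new ArrayList<>();
        visit(root, refCounts, shared);
        return shared;
    }

    private static void visit(Node node, Map<Node, Integer> refCounts, List<Node> shared) {
        int count = refCounts.merge(node, 1, Integer::sum);
        if (count == 2) {
            shared.add(node); // record once, on the second sighting
        }
        if (count > 1) {
            return; // already expanded; this also terminates cycles
        }
        for (Node child : node.children) {
            visit(child, refCounts, shared);
        }
    }

    public static void main(String[] args) {
        // A plain tree, like the echoStruct data: nothing needs an href.
        Node tree = new Node("struct");
        tree.children.add(new Node("a"));
        tree.children.add(new Node("b"));
        System.out.println(multiReferenced(tree).size()); // 0

        // A cycle: the root is reached again through its child.
        Node head = new Node("head");
        Node tail = new Node("tail");
        head.children.add(tail);
        tail.children.add(head);
        System.out.println(multiReferenced(head).size()); // 1
    }
}
```

The trade-off is the one already mentioned: the traversal touches every node before serialization starts, so it is slower than a type-based guess but never emits hrefs for data that is provably a tree.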

      I'd appreciate other people's perspectives on this topic...

      - Sam Ruby
    • Sam Ruby
      Message 41 of 41 , Dec 3, 2001
        Noah Mendelsohn wrote:
        >
        > I believe that both of the examples you give are covered by the schema
        > data types spec (latest version at) [1], which is referenced normatively
        > by SOAP 1.1. Schemas is very clear that there is a distinction between
        > the lexical and the value space for a type. Although {01, and 1} are
        > different lexical representations, they represent the same value.

        That takes care of the two easy examples.

        > I agree that SOAP could be clearer that what is intended in the encoding
        > is the values. As a member of the schema WG, I can say that I've been
        > disappointed that we didn't do a better job of specifying the exact
        > mappings. Although it's clearly implied that lexical 321 has the decimal
        > value 321, I don't think we anywhere say that it's not 123 or 213 or even
        > 589.

        Perhaps you are being flip. Let's take a look at a harder example. What
        would the "natural" mapping of "2.00" into Java be?

        Per http://jcp.org/aboutJava/communityprocess/first/jsr101/ , the mapping
        would be to BigDecimal.

        Per http://www.w3.org/TR/xmlschema-2/#decimal , the canonical
        representation in general prohibits trailing zeros. Specifically, the
        canonical representation for "2.00" is "2.0".

        Per
        http://java.sun.com/j2se/1.3/docs/api/java/math/BigDecimal.html#equals(java.lang.Object) ,
        scale is significant. Specifically, "2.00" is not equal to "2.0".
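
The mismatch is easy to reproduce. The following is a small sketch; DecimalScale and schemaCanonical are hypothetical names, and the helper is only my reading of the schema canonical form (no trailing zeros, but at least one digit after the decimal point), not code from any toolkit.

```java
import java.math.BigDecimal;

// Hypothetical illustration only.
class DecimalScale {
    // Approximates the XML Schema canonical form for xsd:decimal:
    // trailing zeros dropped, at least one fractional digit kept.
    static String schemaCanonical(BigDecimal value) {
        BigDecimal stripped = value.stripTrailingZeros();
        if (stripped.scale() < 1) {
            stripped = stripped.setScale(1); // widening the scale never rounds
        }
        return stripped.toPlainString();
    }

    public static void main(String[] args) {
        BigDecimal sent = new BigDecimal("2.00");
        BigDecimal canonical = new BigDecimal("2.0");

        System.out.println(sent.equals(canonical));    // false: scale 2 vs scale 1
        System.out.println(sent.compareTo(canonical)); // 0: numerically equal
        System.out.println(schemaCanonical(sent));     // 2.0
    }
}
```

So a receiver that canonicalizes before echoing returns a BigDecimal that compareTo treats as the same number but equals rejects, and any Java test harness built on equals will flag the round trip as a failure.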

        = = = = =

        My takeaway from this discussion is that by clearly specifying that
        trailing zeros are to be ignored, the clear and natural mapping that a Java
        programmer would assume for decimals is explicitly prohibited. IMHO, that
        would be a pity.

        - Sam Ruby