Re: RE: [json] __jsonclass__ hinting

  • Atif Aziz
    Message 1 of 5 , Jul 25, 2006
      > My concern is that, rather than leveraging JSON as a rich syntax for
      > expressing type metadata, you are inventing a new syntax based on
      > slashes, a slippery slope that will lead to a mini language of its
      > own.

      The slash syntax is only used for naming the type, but JSON is certainly
      being leveraged to describe the structure of a type, as shown in the
      Person example:

      { "name" : "string",
      "sirname" : "string",
      "birthday" : "string/time",
      "children" : "array/object/person"
      }

      The slash-like approach is really there to qualify the simple and
      primitive types like String with further precision. How else can you say
      that a string really contains a date and time? Have you checked out the
      JSON-formatted response from one of Yahoo's services[1]? They encode
      time using the Unix time format, which yields a number, but then go on
      to put it inside a String rather than a Number. Short of twisting
      Yahoo's hand to give you a static type hint in the response,
      "string/time" seems to me the simplest and most straightforward way to
      describe exactly that. Anything else resulted in a scheme that seemed
      like overkill for the problem at hand.
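A minimal sketch of how a consumer might act on a "string/time" description, assuming the string carries Unix time in whole seconds (as in the Yahoo response mentioned above). The function name `convert` and the fallback behaviour for unknown qualifiers are assumptions made for illustration, not part of any proposal in this thread.

```python
from datetime import datetime, timezone


def convert(value, type_name):
    """Interpret a JSON value according to a slash-qualified type name."""
    base, _, qualifier = type_name.partition("/")
    if base == "string" and not isinstance(value, str):
        raise TypeError(f"expected a JSON string, got {type(value).__name__}")
    if base == "string" and qualifier == "time":
        # Assumption: the string holds Unix time in whole seconds.
        return datetime.fromtimestamp(int(value), tz=timezone.utc)
    # Unknown or missing qualifier: fall back to the plain base type.
    return value


stamp = convert("1153785600", "string/time")
plain = convert("1153785600", "string")
```

A consumer that does not know the "time" qualifier still gets a usable string, which is the point of qualifying a base type rather than replacing it.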

      > (Look at what happened with MIME types. They started simple, now
      > you have things like:

      I haven't really seen this complaint anywhere else and MIME types still
      seem to be reasonably in control compared to what the spec really does
      permit. I do understand your concern and so I don't wish to allow any
      kind of additional attributes to express limits and other facets of a
      type. So this example would never happen, or rather, is out of scope:

      > Following this approach will presumably lead to things like:
      >
      > "array/string/w3time;nullable=true;required=true;maxentries=5"
      >
      > to express "a required array of up to 5 dates, or null".
      >
      > or
      > "number/int;min=0;max=100;"

      I say out of scope because these particular examples are more about
      validation rules, claims and promises (mainly from the producer) than
      simply communicating type structure. Nullable, required and maximum
      entries do not change the fundamental type, its precision or the
      semantics. If you want validation, then this could be worked in like so:

      { "name" : "string",
      "sirname" : "string",
      "birthday" : "string/time",
      "children" : {
      "type" : "array/object/person",
      "maxLength" : 10,
      "nullable" : false
      }
      }

      I've come up with the above on the spur of the moment, so please don't
      take the setup I've presented as literally my recommended way to go. What I
      am trying to demonstrate is that we may very well be more in agreement
      than you initially thought, especially with respect to where the slash
      syntax may have its place.
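As a rough sketch of how validation facets could sit next to the type name without becoming part of it, the code below accepts a property descriptor in either form (a bare type string, or an object with "type" plus facets). The names `normalize` and `check`, and the default of treating a missing "nullable" as true, are assumptions made for this example.

```python
def normalize(descriptor):
    """Return (type_name, facets) for either descriptor form."""
    if isinstance(descriptor, str):
        return descriptor, {}
    facets = dict(descriptor)
    return facets.pop("type"), facets


def check(value, descriptor):
    """Apply the validation facets; the type name itself is left untouched."""
    type_name, facets = normalize(descriptor)
    if value is None:
        # Assumption: values are nullable unless the facet says otherwise.
        return facets.get("nullable", True)
    if type_name.startswith("array"):
        if not isinstance(value, list):
            return False
        max_length = facets.get("maxLength")
        return max_length is None or len(value) <= max_length
    return True


children = {"type": "array/object/person", "maxLength": 10, "nullable": False}
```

With this split, `check([{}] * 3, children)` passes while a list of eleven entries or a null does not, and the type name stays the same in both descriptor forms.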

      As for the number/int example, I'm hoping that standardizing a few type
      names would serve as a starting point and would also define the
      limits in some way. For example, "byte", "int16", "int32", "int64" and
      so on would cover a lot of the cases, and the type names are unique
      and indicative of their bit lengths (and therefore implied limits).
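To illustrate how limits can be implied by the type name alone, here is a small sketch that derives the range from the bit length. The table `INT_TYPES` is hypothetical, and treating "byte" as unsigned is an assumption (it matches some languages but not all).

```python
# Hypothetical mapping from type name to (bit length, signed?).
INT_TYPES = {
    "byte": (8, False),   # assumption: unsigned, 0..255
    "int16": (16, True),
    "int32": (32, True),
    "int64": (64, True),
}


def limits(type_name):
    """Return the (low, high) range implied by the type name."""
    bits, signed = INT_TYPES[type_name]
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1


def fits(value, type_name):
    """True if value is an integer within the implied range."""
    low, high = limits(type_name)
    is_int = isinstance(value, int) and not isinstance(value, bool)
    return is_int and low <= value <= high
```

No "min" or "max" attribute is needed in the type string: "number/int16" already says everything `limits("int16")` returns.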

      > www.jitsu.org/jitsu/guide/data.html). We found we also needed enums;
      > restrictions; hooks for hanging documentation comments, and a
      > mechanism for adding additional application-specific attribute
      > metadata (e.g. to add a tag saying how something is stored on the
      > backend, or indicate what C# type to use for a JSON-RPC type etc).

      We can talk about the syntax and structural details as soon as we're
      working on the same problem, and that is providing a machine-readable
      way of describing the expected structure of some JSON data in advance of
      (rather than during) the reception of the data.

      > I agree you need some kind of syntax is to express arrays -
      > programming languages already have a compact convention for this which
      > everyone knows: "Person[]".

      Yes, it's certainly more terse and it's my preference too. However, it
      doesn't help in the case of subtyping other values. I wanted to come up
      with a generic syntax first, which always leaves room for mentioning
      shorthand versions. The generic syntax can be read and interpreted left
      to right, making parsing simple. The use of "[]" is certainly on my list
      of things to consider.
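A sketch of the left-to-right reading, with "[]" handled as a shorthand on top of the generic form. The function name `parse_type`, the lower-casing of custom names, and the rule that a bare custom name such as "Person" means "object/person" are all assumptions made for this example.

```python
BASE_TYPES = {"object", "array", "string", "number", "boolean", "null"}


def parse_type(name):
    """Split a type name into its parts, outermost first."""
    # Shorthand: each trailing "[]" wraps the element type in one array level.
    depth = 0
    while name.endswith("[]"):
        name = name[:-2]
        depth += 1
    # Generic form: the slash-separated parts are already in reading order.
    parts = name.split("/")
    if parts[0] not in BASE_TYPES:
        # Assumption: a bare custom name such as "Person" names an object type.
        parts = ["object", name.lower()]
    return ["array"] * depth + parts
```

Under these assumptions "Person[]" and "array/object/person" parse to the same list, so the shorthand costs nothing beyond a suffix check in front of a plain split.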

      > So my reaction would be more in a direction like:

      Your reaction is not too far from mine. :) I guess throwing some ideas
      out in their unfinished form invites more trouble than help, but I was
      hoping that the Person example I gave would allow people to see where it
      could go.

      > Everyone working with JSON in typed languages is adding some kind
      > of type hinting, the need for this is very apparent.

      I didn't say it will *never* take off. The question is about the problem
      you're trying to solve. My opinion is that there's a more urgent problem
      waiting and run-time type-hinting as you may want to implement it could
      certainly have a dependency on it. How does it help to know that
      "__jstype" is a person when I don't have a standard way to describe what
      person data could look like?

      > JSON doesn't have a way to do this - JSON is a bit like Xml but with
      > only six element names (the ones you listed). There is no way to add
      > new element names. All the objects in JSON are anonymous, you just
      > have:
      >
      > {
      > "firstName":"Joe",
      > "age":22,
      > }

      That's true. Given just this piece of JSON data, you can't say anything
      about it *unless* you consider *where* it came from. If the source
      end-point has a way to describe its structure and the structure of its
      results then you've got what you wanted. Otherwise, yes, you have to let
      the caller on the consumer side tell you how to interpret and map that
      data. You'll have to assume they got the type out of some documentation.

      > I don't see how adding type hints changes security issues. Any server
      > should still check that the data conforms to its internal schema. No
      > change there.

      If it is checking against the wrong schema based on the type hint then
      there's trouble looming, no?

      Running out of time right now, so just two quick comments...

      > You make a claim that "parties that don't understand the extension can
      > still work with the basic types they know about." I see this as a
      > false claim.

      Care to expand? Can't let you get away with just that. ;)

      > There as two cases: either you care about types (in which case you
      > need to reason about types in as straightforward a way as possible) or
      > you don't care about types (in which case the six types in raw JSON
      > messages are all already well-enough defined). Can you explain where
      > this middle category of "caring a little about types" comes in?

      There isn't any such category. I'm in the first category. I hope some
      of the explanations here help reflect that.

      [1]
      http://api.search.yahoo.com/NewsSearchService/V1/newsSearch?appid=YahooDemo&query=madonna&results=3&language=en&output=json


      --- In json@yahoogroups.com, "meyer_jon" <jonmeyer@...> wrote:
      >
      > --- In json@yahoogroups.com, "Atif Aziz" atif.aziz@ wrote:
      > >
      > > What do you think of the the ideas presented in message #473:
      > > http://groups.yahoo.com/group/json/message/473
      >
      > My concern is that, rather than leveraging JSON as a rich syntax for
      > expressing type metadata, you are inventing a new syntax based on
      > slashes, a slippery slope that will lead to a mini language of its
      > own. (Look at what happened with MIME types. They started simple, now
      > you have things like:
      >
      > "application/xhtml+xml; charset=utf-8"
      >
      > )
      >
      > Following this approach will presumably lead to things like:
      >
      > "array/string/w3time;nullable=true;required=true;maxentries=5"
      >
      > to express "a required array of up to 5 dates, or null".
      >
      > or
      > "number/int;min=0;max=100;"
      >
      > etc.
      >
      > You make a claim that "parties that don't understand the extension can
      > still work with the basic types they know about." I see this as a
      > false claim.
      >
      > There as two cases: either you care about types (in which case you
      > need to reason about types in as straightforward a way as possible) or
      > you don't care about types (in which case the six types in raw JSON
      > messages are all already well-enough defined). Can you explain where
      > this middle category of "caring a little about types" comes in?
      >
      > You want to address proxies for typed languages. We've just done
      > similar work for Jitsu using Xml (see
      > www.jitsu.org/jitsu/guide/data.html). We found we also needed enums;
      > restrictions; hooks for hanging documentation comments, and a
      > mechanism for adding additional application-specific attribute
      > metadata (e.g. to add a tag saying how something is stored on the
      > backend, or indicate what C# type to use for a JSON-RPC type etc).
      >
      > I agree you need some kind of syntax is to express arrays -
      > programming languages already have a compact convention for this which
      > everyone knows: "Person[]".
      >
      > So my reaction would be more in a direction like:
      >
      > {
      > "name":"Person",
      > "docs":"My person class",
      > "properties":[
      > {
      > "name":"firstName",
      > "type":"string",
      > "docs":"The persons first name",
      > },
      > {
      > "name":"birthdate",
      > "type":"string",
      > "encoding":"iso8601",
      > "docs":"The persons birthdate, encoded as a an ISO string",
      > }
      > {
      > "name":"friends",
      > "type":"Person[]",
      > }
      > ]
      > }
      >
      > etc.
      >
      > > What hinting solves is the problem of polymorphism, a bit how XML
      > > deals with it through xsi:type in the instance document.
      >
      > Yes. But consider that in Xml, you typically do:
      >
      > <Person>
      > <firstName>Joe</firstName>
      > <age>22</age>
      > </Person>
      >
      > In other words, by design Xml is implicitly a far more redundant
      > format than JSON.
      >
      > JSON doesn't have a way to do this - JSON is a bit like Xml but with
      > only six element names (the ones you listed). There is no way to add
      > new element names. All the objects in JSON are anonymous, you just
      > have:
      >
      > {
      > "firstName":"Joe",
      > "age":22,
      > }
      >
      > As soon as you have any non-trivial data structure (even just a
      > Hashtable of objects), you need a __jstype mechanism. This need is
      > much greater in JSON than xsi:type is in Xml.
      >
      > I don't agree with your arguments for why a type hint would never take
      > off: Everyone working with JSON in typed languages is adding some kind
      > of type hinting, the need for this is very apparent.
      >
      > Some specific responses to your comments:
      >
      > > Security. It may not be wise to blindly accept the type
      > > that the producer might want you to think a particular JSON data is.
      >
      > I don't see how adding type hints changes security issues. Any server
      > should still check that the data conforms to its internal schema. No
      > change there.
      >
      > > Decoupling. The producer or consumer may see the logical
      > > to physical type mapping different on each end of the wire.
      > > For example, the producer might use a single person class but
      > > an enum field to distinguish the actual types. The consumer,
      > > on the other hand, may wish to use discrete classes.
      >
      > Different implementations can of course choose different strategies
      > for mapping types. Adding hints doesn't change that, it actually makes
      > this task easier.
      >
      > > The problem I see with type-hinting at run-time is
      > > that it's not just limited to reserving a property name.
      > > It also entails describing how to use that facility and
      > > this is a difficult exercise that produces limiting results. ...
      > > It's also not a new problem because ORM tools face the same
      > > issue with SQL result sets.
      >
      > Every object oriented language out there (including JavaScript)
      > supports named classes which have a set of properties (instances/slots
      > whatever you call them).
      >
      > Today, JSON specifies how to encode properties, but doesn't specify
      > how to encode what classes those properties belong to. This is just a
      > simple shortcoming, its not about how to use those classes, what those
      > classes mean, the service contact between client and server,
      > polymorphism or anything else. Unless we can agree on a solution, JSON
      > is a handicapped serialization format, because everyone will invent
      > their own workarounds for this issue.
      >




    • meyer_jon
      Message 2 of 5 , Jul 25, 2006
        I guess I need to wait for the full document - in your example:

        > { "name" : "string",
        > "sirname" : "string",
        > "birthday" : "string/time"
        > "children" : {
        > type : "array/object/person",
        > maxLength : 10,
        > nullable : false
        > }

        I don't yet see how you define "time" or "person". How do you support
        enums? What about multidimensional arrays? Subclassing?

        My impression is that in formulating the spec you are mostly thinking
        about RPC being used to send normalized data tables e.g. from a
        database - a scenario which fits well with using an external schema
        definition. The Yahoo service you mention is one such example. As you
        say, you want to:

        > providing a machine-readable
        > way of describing the expected structure of some JSON data
        > in advance to (rather than during) the reception of the data.

        You also point out that

        > Given just this piece of JSON data, you can't say anything
        > about it *unless* you consider *where* it came from. If the source
        > end-point has a way to describe its structure and the structure
        > of its results then you've got what you wanted.

        Your assumption here is that the end-point returns only one type of
        regularized data. But our app has RPC endpoints like:

        object GetPageData();

        where, according to how the user has configured their page, different
        objects are returned. Think of netvibes (www.netvibes.com). Presumably
        they have a call like:

        object[] GetColumnContents()

        which returns different types of objects according to what modules the
        user has dragged onto a page.

        In this scenario, RPC is being used not to send database tables but
        rather dynamic graphs of objects. For us, knowing "where it came from"
        and "in advance" cannot tell us the structure of the data.

        This seems like a very reasonable use of JSON-RPC, but it requires
        support for sending dynamically typed data, where you can do:

        var x = server.GetPageData();
        if (x instanceof Person) { ... }

        This is where runtime __jstype comes in.
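A minimal sketch of the receiving side, assuming the hint travels as a "__jstype" property naming a class that both ends already define. The `REGISTRY` table, the `revive` hook and the sample payload are hypothetical; `json.loads` with `object_hook` is the standard-library mechanism used to plug the hook in.

```python
import json
from dataclasses import dataclass


@dataclass
class Person:
    firstName: str
    age: int


# Hypothetical registry: only types listed here are ever instantiated.
REGISTRY = {"Person": Person}


def revive(obj):
    """object_hook: turn hinted JSON objects into class instances."""
    cls = REGISTRY.get(obj.get("__jstype"))
    if cls is None:
        return obj  # no hint, or an unknown one: keep the plain dict
    fields = {k: v for k, v in obj.items() if k != "__jstype"}
    return cls(**fields)


payload = '{"__jstype": "Person", "firstName": "Joe", "age": 22}'
x = json.loads(payload, object_hook=revive)
is_person = isinstance(x, Person)
```

Objects without a hint pass through unchanged, so an endpoint returning a mix of hinted and unhinted data still decodes.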

        You could argue that this is a niche use of RPC. But my impression is
        that, as Ajax matures, you will see much more of this kind of thing -
        and the standards process is slow so it's important to be forward looking.

        Regarding:

        > How does it help to know that
        > "__jstype" is a person when I don't have a standard way to describe what
        > person data could look like?
        >

        In our case we've already defined a Person class on both the client
        and the server. So our app already knows what a Person looks like. We
        just want to be able to pass Person instances back and forth using RPC.

        >
        > > I don't see how adding type hints changes security issues.
        > If it is checking against the wrong schema based on the
        > type hint then there's trouble looming, no?

        Secure servers are written defensively. They never assume that
        the messaging layer produces valid messages. Any data from a client
        is checked against an internal schema maintained on the server.
        Changing __jstypes in a message might confuse the Serializer, but it
        shouldn't get past the internal security checks.
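One way to picture this defensive check, as a sketch only: the hint merely selects among schemas the server itself holds, and the data is validated against that schema regardless of what the client claims. `SCHEMAS`, `accept` and the per-endpoint `allowed_types` set are hypothetical names introduced for this example.

```python
# Hypothetical server-side schemas; the client never supplies these.
SCHEMAS = {"Person": {"firstName": str, "age": int}}


def accept(message, allowed_types):
    """Validate a decoded message against the server's own schema."""
    hint = message.get("__jstype")
    # A hint can only pick a schema the endpoint already allows.
    if hint not in allowed_types or hint not in SCHEMAS:
        return False
    schema = SCHEMAS[hint]
    data = {k: v for k, v in message.items() if k != "__jstype"}
    if set(data) != set(schema):
        return False
    # Exact type match, so a JSON true is not accepted where a number is due.
    return all(type(data[k]) is expected for k, expected in schema.items())


good = {"__jstype": "Person", "firstName": "Joe", "age": 22}
forged = {"__jstype": "Admin", "firstName": "Joe", "age": 22}
```

A forged hint such as "Admin" is rejected before any schema lookup, and a correct hint with malformed fields still fails the field check.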

        > > You make a claim that "parties that don't understand the extension can
        > > still work with the basic types they know about." I see this as a
        > > false claim.
        >
        > Care to expand? Can't let you get away with just that. ;)

        My point is that if you only care about the 6 basic types, you don't
        need a type schema at all, because JSON is already happily
        unambiguous when it comes to those 6 basic types. If you receive a
        message, there's no way you can mistake a string for a bool or a
        number. That's one of the great things about JSON.
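A short illustration of that point using Python's standard `json` module: each of the basic JSON types decodes to a distinct native type without any schema. The sample document is made up for the example.

```python
import json

doc = json.loads('{"s": "22", "n": 22, "b": true, "z": null, "a": [], "o": {}}')

# Record which native type each JSON value decoded to.
kinds = {key: type(value).__name__ for key, value in doc.items()}
```

The string "22" and the number 22 arrive as `str` and `int` respectively, so the decoder never has to guess between them.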
      • Shawn Silverman
        Message 3 of 5 , Jul 27, 2006
          > new element names. All the objects in JSON are anonymous, you just have:
          >
          > {
          > "firstName":"Joe",
          > "age":22,
          > }
          >
          > As soon as you have any non-trivial data structure (even just a
          > Hashtable of objects), you need a __jstype mechanism. This need is
          > much greater in JSON than xsi:type is in Xml.
          >
          > I don't agree with your arguments for why a type hint would never take
          > off: Everyone working with JSON in typed languages is adding some kind
          > of type hinting, the need for this is very apparent.

          Just some thoughts.. I was experimenting with different ways to name things and came up
          with these:

          1) Name the anonymous members using some sort of extended syntax.

          <Person>{
          "name": "Fred",
          "age":22
          }

          2) Alternatively, allow one of these inside an object:

          {
          <Person>,
          "name": "Fred"
          }

          3) JSON already has a way to do this (sort of):

          {
          "Person":{ "name":"Fred", "age":22 }
          }

          4) This could be extended:

          {
          "Person":{ "name":"Fred", "age":22 },
          "isInstance":true
          }

          Or we could have "isClass", etc.

          5) This could be used to define types instead:

          <Person>{
          "property":"name",
          "property":"age"
          }

          6) This could be shrunk:

          <Person>["name", "age"]

          7) Or:

          <Person>[{"name":"string"},{"age":number}]

          etc...

          I think of JSON more as the "lexing" layer with a few extra types, and then the meaning is
          tacked on by some sort of external program or schema.
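Of the options above, 3) is the only one that is already valid JSON, so the naming can be recovered by an external program. A rough sketch of that idea follows; `unwrap` and the `known_types` set are hypothetical, and the rule "a single key that matches a known type name is a wrapper" is an assumption of this example.

```python
def unwrap(obj, known_types):
    """Return (type_name, fields) for a one-key wrapper, else (None, obj)."""
    if isinstance(obj, dict) and len(obj) == 1:
        # Single-element unpacking of the only (key, value) pair.
        (name, fields), = obj.items()
        if name in known_types and isinstance(fields, dict):
            return name, fields
    return None, obj


wrapped = {"Person": {"name": "Fred", "age": 22}}
type_name, fields = unwrap(wrapped, {"Person"})
```

The meaning lives entirely in `known_types`, outside the JSON text, which matches the view of JSON as the lexing layer.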

          Just some random thoughts on the subject...

          -Shawn