Re: groove: meta

  • decent@rocks.net
    Message 1 of 2 , Mar 20, 2001
      Hi. Ray Ozzie here. Hope that you find this to be useful.

      ==========

      How to interoperate with non-COM tools? Can a developer get raw
      socket access, SOAP support, or roll their own protocol? Is there
      any way to plug into the framework without buying in completely?

      <a>

      Internally, we use a very thin subset of COM (much akin to the
      capabilities of xpcom), but enough that when running under Windows we
      can indeed robustly interoperate with the vast array of components
      written over the past years. As you'd see if you download the GDK
      from devzone.groove.net, Groove is, in essence, an incredibly rich
      component framework that defines a new collaborative application
      abstraction. Because of our use of COM (and in particular the
      scripting host stuff) you can build Groove applications in any of
      dozens of COM-compliant languages such as VB. That said, the vast
      majority of Groove applications (and 100% of all of the ones that
      we've built) are built in JavaScript, typically using SOAP (or other
      products' APIs) for systems integration.

      When a tool is implemented in Groove, it is running in an
      un-sandbox'ed environment, for better and worse: it is digitally
      signed, dynamically downloaded componentry. Yes, you can get raw
      socket access, roll your own, etc. Party on, provided that the
      ultimate user allows the components to be downloaded to his or her
      device.

      Regarding your last question - the answer is "it depends". It's
      incredibly powerful and pliable in many dimensions ... but there is
      an app development paradigm for building tools in the Groove
      environment that defines the nature of the beast. For example, we
      require you to build your app in a strict smalltalk-esque MVC model.
      The data model level components must run free-threaded (which makes
      coding more interesting), while the higher-level UI stuff is
      serialized to make life easier for the scripting-level programmer.
      The entire product is asynchronous and event-driven, so you really
      have to be familiar with that model of writing code or else you'll do
      nasty and unsociable things to the rest of the system. And so on.

      </a>

      ==========

      Any plans for non-Windows ports?

      <a>

      We have lots of plans and dreams, but we've had little bandwidth to
      pursue them. We have indeed made (and continue to make) a
      substantial investment in wine enhancements (via our talented
      subcontractor, Macadamian) to keep it running on Linux, and are
      steadily working to build componentry for that environment that
      parallels some of the rich controls that we utilize in Windows, e.g.
      using Mozilla as an alternative to MSIE. I'd love to do an OS X
      port, but we just don't have the resources available right now.

      </a>

      ==========

      Regarding these points:
      * Information can be gathered relating to failure(s)
      * Nonfunctional components can be repaired or replaced dynamically
      * Remote diagnostics can be performed
      * Automatic process cleanup and recovery

      Seems to imply sandboxing or process control of some kind. True?

      <a>

      We are looking forward to building a protected execution environment
      via CLR (and maybe JVM at some point) in order to make it easier to
      distribute unsigned componentry. At this point, however, it's still
      all digitally-signed componentry and all that that implies.

      Related to failures, we have a very sophisticated subsystem (Customer
      Service Manager, a.k.a. CSM) that gathers exception information and
      sends it back up to the mother ship via SOAP. The software has the
      capability of updating itself upon a variety of conditions, but
      at the current time we've got it throttled back to the level where
      the user must push a button to get the updates initiated. Any
      component in the system - of which there are thousands - can be
      updated (except, of course, for a tiny loader responsible for
      restarting if some low-level stuff needs to be updated).

      </a>

      ==========

      Can anyone run a relay node? Are the specs for talking to one public?

      <a>

      Today we are running the only relays. We will indeed ultimately be
      making the specs public, but the API isn't yet completely stable
      because we're still learning how to scale this thing. Case in point:
      we've just switched from static load balancing to an intense
      clustered implementation, requiring some changes.

      Architecturally, the system is designed to work like email: each
      recipient has a designated [logical] relay server, to which senders
      enqueue packets. There's no central directory of relays, and it'll
      scale to the size of the net. It would be unlikely to do so, of
      course, if we were the only ones running relays.
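
As a rough illustration of the email-like relay model described above (this is not Groove's relay protocol, whose specs are not public; the `Relay`, `Contact`, and `send` names are hypothetical), a store-and-forward sketch in which each recipient carries its own designated relay and there is no central directory:

```python
from collections import defaultdict, deque
from dataclasses import dataclass


class Relay:
    """Hypothetical relay: holds packets until the recipient comes online."""

    def __init__(self):
        self.queues = defaultdict(deque)  # one FIFO queue per recipient name

    def enqueue(self, recipient_name, packet):
        self.queues[recipient_name].append(packet)

    def fetch(self, recipient_name):
        # Hand over everything queued so far, in arrival order, and clear it.
        return list(self.queues.pop(recipient_name, deque()))


@dataclass
class Contact:
    """What a sender knows about a recipient: a name and its designated relay."""
    name: str
    relay: Relay


def send(contact, packet):
    # No directory lookup: the relay travels with the contact record,
    # much as a mail domain travels with an email address.
    contact.relay.enqueue(contact.name, packet)


relay_a, relay_b = Relay(), Relay()
ray = Contact("ray", relay_a)
clay = Contact("clay", relay_b)

send(ray, "<packet n='1'/>")
send(ray, "<packet n='2'/>")
send(clay, "<packet n='3'/>")

print(relay_a.fetch("ray"))   # ray reconnects and drains his queue
print(relay_b.fetch("clay"))
```

Because senders only ever talk to the relay named in the recipient's contact record, adding more relays run by other parties would not require any shared registry in this sketch.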

      </a>

      ==========

      Can I write a clone? Granted that this would be a mammoth task, what
      aspects of the Groove architecture are not proprietary?

      <a>

      Honestly, I don't know how to answer that question. The simplest
      answer is: due to the inherent (and necessary) complexity, it's not
      likely to be practically doable, even if all of the source code were
      public - which it is not.

      Yes, you'd find it to be a mammoth task; we've got a bunch of pretty
      damned good systems people who have been working on this for over
      three years, and it's about two million lines of code, largely c++.

      Just to give you an idea, the major subsystems that you'd have to
      whip up would include things like:

      - a persistent xml object storage subsystem with a "virtual memory"-
      like API, with four underlying "service provider" implementations:
      binary object store, native OS file system, ZIP, and MHTML.
      - an xml object routing subsystem with semantics much akin to a
      message queueing system, working pure peer or via relays
      - an adaptive communications subsystem that works sociably in the
      background, dynamically adjusts to local resource issues (e.g.
      available sockets), dynamically detects and adapts to interface and
      address changes, uses multiple adapters automatically and
      concurrently even when some are "inside" and some are "outside" the
      firewall, does automatic/dynamic proxy configuration, adapts to any
      of a set of protocols automatically as needed to reach certain
      destinations, etc etc etc
      - a reliable transport system responsible for xml packet stream
      ordering and robustness
      - a LAN device presence subsystem for local device discovery (for
      pure peer LAN environments)
      - a WAN device presence subsystem for internet-wide device discovery,
      with an efficient publish-and-subscribe services interface
      - a security subsystem responsible for authenticating users in a peer
      fashion, encrypting on-wire packets, encrypting persistent object
      store data, etc
      - a distributed transaction synchronization subsystem (e.g.
      the "special sauce") that robustly creates the illusion of global
      consistency
      - a strict peer "awareness" system that securely enables peers (and
      only peers) to get selected awareness info, with no centralized
      servers
      - an application framework oriented around building "tools"
      in "shared spaces" with "members", and all that that implies

      ...and the list goes on, and on, and on, and on. I haven't even
      gotten up to the surface of the app yet, where there are hundreds of
      UI components, etc...

      If you look at Groove deeply, you'll understand that the product
      could never have been built with such a clean conceptual model unless
      we took a holistic architectural view. It took us many man-years to
      figure out the "right" factoring in order to accomplish what we set
      out to do at the UI. (The reason that we were in stealth mode for so
      long was that we had no idea when we'd finally get it "right". It
      took many, many subsystem rewrites and major architectural revisions
      before the data flow was right, the security model worked, the config
      dialogs could be eliminated, etc.)

      But now that we have a factoring that works, the cool thing is that
      we can now begin to *decompose* the app into chunks that may be
      reusable by a broad variety of other apps. For example, the first
      and most obvious thing to decompose is the xml message relaying
      system. Another may be the xml object transport system, or the xml
      object store. Factoring things *out* has two benefits: it lets other
      people leverage both the design and the implementation, and it lets
      us potentially substitute other similar layers as standards emerge.

      </a>

      ==========

      There is a big split in the 'filesharing/document manipulation/synch'
      world right now. On one side is "The PC is the server now, so
      leave the file on disk and move the pointers to other users/machines."

      The other is "The hard drive is the only way to guarantee low latency
      and high availability, so synch the files themselves, rather than
      moving pointers."

      Systems that move pointers are admirably lightweight but don't get
      high availability in the case of network disconnection, planned or
      unplanned.

      Systems that move files have great availability, but suffer from
      version control issues, and from the fact that for perfect synch the
      nodes together need to hold

      (D(1) + D(2) + D(3) + ... + D(N)) * N bytes of data

      where D(1) - D(N) are the amounts of data on 1-N synchronized
      machines. This maps to

      D(A) * N^2

      where D(A) is the amount of data on the average node. Ruh roh.
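
To make that arithmetic concrete, here is a small sketch assuming every one of N nodes keeps a full copy of every node's data. The sizes are made-up numbers chosen only for illustration.

```python
def total_storage(sizes):
    """Bytes held across the whole group when every node mirrors every other node."""
    per_node = sum(sizes)          # each node holds D(1) + ... + D(N)
    return per_node * len(sizes)   # and there are N such nodes


sizes = [100, 250, 400, 50]        # hypothetical D(1)..D(4), in megabytes
n = len(sizes)
average = sum(sizes) // n          # D(A), the average node's data

total = total_storage(sizes)
print(total, average * n ** 2)     # both expressions give the same total
```

With these numbers each node holds 800 MB and the group as a whole holds 3200 MB, which matches D(A) * N^2 = 200 * 16.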

      Groove seems to have done a lot of work on the version control
      issue. How are you dealing with the exploding cache size issue? Are
      there plans for some data to reside centrally? Or to be held on local
      drives until synch is explicitly requested (as opposed to implicit
      requests, which is how Groove seems to work now)?

      <a>

      It's important to understand that Groove doesn't move or synchronize
      data. Unlike products that replicate or synchronize *data*, Groove
      is instead a distributed *transaction* system that, in essence,
      distributes *method calls* as opposed to data. In our architecture,
      the high-level components (model or view) create, in essence, XML
      procedure call descriptions (much as you'd do in XML-RPC or SOAP).
      These calls are treated as atomic transactions that are submitted
      locally and distributed securely to other nodes in the same shared
      space, where they're executed in parallel. (Of course, the tough job
      is ensuring global transaction ordering, particularly in a
      disconnectable environment. But I digress...)

      All of that said, it is up to the user-level application to determine
      precisely what to do with this distributed transaction
      infrastructure. For example, one application might use it to
      distribute sketchpad stroke directives, such as <draw-line
      from="156,234" to="123,54"/>, another might <chat-append text="yo,
      whazzzup?"/> or something.
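
A minimal sketch of "distributing method calls rather than data", assuming a toy sketchpad model. The `Sketchpad`, `make_draw_line`, and `distribute` names are hypothetical and are not part of the Groove API. Global ordering, security, and disconnected operation are deliberately left out.

```python
import xml.etree.ElementTree as ET


class Sketchpad:
    """Hypothetical model component: remembers the lines drawn so far."""

    def __init__(self):
        self.lines = []

    def apply(self, xml_text):
        # Parse one transaction and execute it as a local method call.
        elem = ET.fromstring(xml_text)
        if elem.tag != "draw-line":
            raise ValueError(f"unknown transaction: {elem.tag}")
        start = tuple(int(v) for v in elem.get("from").split(","))
        end = tuple(int(v) for v in elem.get("to").split(","))
        self.lines.append((start, end))


def make_draw_line(start, end):
    # Describe the call as XML instead of shipping the resulting drawing.
    elem = ET.Element("draw-line", {
        "from": f"{start[0]},{start[1]}",
        "to": f"{end[0]},{end[1]}",
    })
    return ET.tostring(elem, encoding="unicode")


def distribute(transaction, replicas):
    # Every member of the shared space executes the same call locally.
    for replica in replicas:
        replica.apply(transaction)


replicas = [Sketchpad(), Sketchpad(), Sketchpad()]
tx = make_draw_line((156, 234), (123, 54))
distribute(tx, replicas)
print(tx)
print(replicas[0].lines == replicas[1].lines == replicas[2].lines)
```

Each replica ends up in the same state because it ran the same call, not because any state was copied between them.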

      In Groove, the developer might (as we did) build a simple file
      sharing tool that moves data by value. Ours uses the standard
      Windows explorer control at the UI; when the user drags a file into
      the tool, we generate a transaction such as <add-file name="foo.doc"
      attributes="blah"><base64>yougettheidea</base64></add-file>.

      Another developer might (as we've prototyped) build a simple file
      sharing tool that moves data by reference. For example, it could
      again use the standard Windows explorer control at the UI, but it
      might generate a transaction such as <add-file name="foo.doc"
      attributes="blah"><present-at-endpoints>ray-laptop,jim-vaio,
      clay-thinkpad</present-at-endpoints></add-file>. In other words, the UI
      (when hovering, for example) could display who has the file, where it
      is, etc., and yet another transaction could be issued to fetch the
      file. (We support endpoint-directed and role-directed transactions
      also.)
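
The two flavors of add-file transaction can be sketched side by side. This is only an illustration built from the element names in the examples above; the helper functions are hypothetical, and a real tool would carry more attributes.

```python
import base64
import xml.etree.ElementTree as ET


def add_file_by_value(name, data):
    # The file's bytes travel inside the transaction itself.
    tx = ET.Element("add-file", {"name": name})
    payload = ET.SubElement(tx, "base64")
    payload.text = base64.b64encode(data).decode("ascii")
    return ET.tostring(tx, encoding="unicode")


def add_file_by_reference(name, endpoints):
    # Only a pointer travels: the list of devices that hold the file.
    tx = ET.Element("add-file", {"name": name})
    holders = ET.SubElement(tx, "present-at-endpoints")
    holders.text = ",".join(endpoints)
    return ET.tostring(tx, encoding="unicode")


def read_add_file(xml_text):
    """Return (name, data, endpoints); data is None for a by-reference transaction."""
    tx = ET.fromstring(xml_text)
    payload = tx.find("base64")
    if payload is not None:
        return tx.get("name"), base64.b64decode(payload.text or ""), []
    holders = tx.find("present-at-endpoints")
    endpoints = holders.text.split(",") if holders is not None and holders.text else []
    return tx.get("name"), None, endpoints


contents = b"x" * 10_000  # stand-in for the bytes of foo.doc
by_value = add_file_by_value("foo.doc", contents)
by_ref = add_file_by_reference("foo.doc", ["ray-laptop", "jim-vaio", "clay-thinkpad"])

print(len(by_value), len(by_ref))  # the by-reference transaction stays small
print(read_add_file(by_ref))
```

The by-value form costs bandwidth and disk on every member's device but works offline; the by-reference form stays tiny but needs a follow-up fetch from one of the listed endpoints.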

      So ... my weaselly way of getting out of your question is to say "have
      it your way". Time will tell us which tools are best to move data by
      value, which ones are best to move by reference, and what kind of UI
      to slap onto the by-reference tools so that you can get the data that
      you need when you need it.

      (Oh - as you probably expect, we also have a "bot server"
      architecture in which you can invite a server to be a proxy member.
      It can do things such as systems integration, or could indeed also do
      things such as being a proxy file repository. A mere matter of app-
      level code.)

      </a>

      ==========


      I spotted on Dan Gillmor's "Hailstorm" writeup
      (http://weblog.mercurycenter.com/ejournal/),

      <quote>Each separate demonstrator (eBay, Groove, American Express,
      among others) had created a tab inside the Microsoft messaging
      client</quote>

      That surprised me. What was the Groove tab doing? Where do you see
      the overlap / integration between Groove and Hailstorm (or with .NET
      in more general terms)?

      <a>

      Groove was designed with a philosophy that Groove Networks, Inc., has
      no business being a repository for your data unless you explicitly
      choose to share it with us. This includes even such basic things as
      your identity, your contacts, etc. (We're probably one of the only
      systems around that does what you'd typically refer to as "user
      awareness" without our servers having to know your identity or your
      friends' identities.) It's pretty cool.

      That said, even though Groove doesn't require you to list yourself in
      some mega-directory in order to use the product, people sometimes
      LIKE to be listed in directories, give out their contacts, etc., so
      that others can find them to groove with them. It's for this reason
      that we put in features such as "invite via email", so that you can
      leverage the contact list that you already have in Outlook/etc.

      What we did in our Hailstorm demo was conceptually very very simple:
      We figured that people might like to 1) identify themselves to Groove
      by using their Passport identity, for single-logon purposes and so
      that other Passport users can recognize them, 2) store their contacts
      inside the centralized Hailstorm contact repository, so that the same
      contacts are accessible to Groove, Messenger, the Windows XP shell,
      and any other app that wishes to use them, and 3) we integrate a
      little applet into the Messenger UI so that you can see at a glance
      what shared spaces people are actively working within, etc., in a
      manner consistent with the display of the Messenger buddy list, AND
      so that in two clicks you can invite someone to a Groove shared space
      without ever leaving Messenger. Etc.

      The main user benefit of Hailstorm as we're using it is to put the
      user in control of their own data in an app-independent and device-
      independent fashion. I think that for people who use Messenger
      and/or Windows XP, it'll provide a very convenient way to launch into
      Groove and to stay aware of what's going on in Groove. Let there be
      no doubt that Hailstorm is about the most centralized model that you
      can find, whereas Groove is probably pretty close to the other end of
      the spectrum. This integration simply shows that you can indeed get
      user benefit by "appropriate" integration of the center and the edge.

      </a>

      ==========

      Said Ray in email:
      > The product is structured as both a solution platform and
      > an end-user application

      I spent a while wondering about what a solution platform is, and why
      a solution platform would need to exist at all.

      I think that a solution platform means a very high-level API, with
      templates for network architecture on top of the normal GUI builder.
      Groove provides a soup-to-nuts environment, encapsulating the
      environment at a higher level than Java or even Windows.

      First, there is a lot of creative work at a low level. Persistence
      based on local XML storage(*), with transaction support in a non-SQL
      environment, with on-disk encryption. The relay hub, with multi-node
      coordination and store-and-forward support for transient nodes.
      Simple symmetrical transmission protocol; a proprietary protocol I
      believe. All this stuff together encapsulates the conceptual
      environment.

      Second, there are all the trimmings. A skinnable UI. A bunch of
      tightly integrated apps - the IM app, shared spaces app, graphics
      app, etc. Ensemble browsing. The relay hub is self configuring.
      Components can be repaired or replaced dynamically and, I assume,
      remotely. All this stuff together encapsulates the working
      environment.

      Groove simplifies development in a heterogeneous world by making a
      homogeneous world. There is less need to think about missing
      components, mismatched DLLs, firewall configuration, or any other
      local oddities, hence multiple nodes can be coordinated more easily.

      Is that a reasonable reading, Ray?

      <a>

      A good description, but I take slight issue with the last paragraph.
      Yes, we do indeed make development simple for our own environment.
      The kind of "eversync" synchronization that we provide is incredibly
      difficult to achieve robustly, and it was our goal to create an
      environment in which you could build some very useful stuff very
      easily that was automatically synchronized across many users and all
      of their devices.

      But we live in a heterogeneous world (computing and otherwise), and
      to view it otherwise would be to have blinders on. From day one, we
      designed Groove to fit into a heterogeneous world by embracing
      others' components, by integrating first XML-RPC and then SOAP, by
      embracing XML bottom-to-top, and so on. We've spent a good deal of
      time and effort recruiting and training systems integrators and
      consultants. To the best of my knowledge, virtually every Groove
      developer that we've been in contact with is doing some form of
      systems integration. They don't just do a simple tool; rather, they
      build tools in Groove that do something special in conjunction with
      their other systems, either via bots or via other methods of
      interconnection such as direct client-side SOAP calls.

      Based upon what I've seen already, I know that we'll be successful in
      building a system that will be of real and substantive value to our
      customers (who today are enterprise customers). But I also hope that
      ultimately both the architecture and implementation of this radically-
      distributed system will cause systems designers and developers to
      step back, gasp, and realize that there is truly life after the 70's
      OS model, the 80's GUI model, and the 90's Web model. And it's a
      blast...

      </a>

      ==========

      (Not to be quoted/forwarded outside of this mailing list without
      explicit permission, please. Thanks.)

      -- Ray