Re: groove: meta
- Hi. Ray Ozzie here. Hope that you find this to be useful.
How to interoperate with non-COM tools? Can a developer get raw
socket access, SOAP support, or roll their own protocol? Is there
any way to plug into the framework without buying in completely?
Internally, we use a very thin subset of COM (much akin to the
capabilities of xpcom), but enough that when running under Windows we
can indeed robustly interoperate with the vast array of components
written over the past years. As you'd see if you download the GDK
from devzone.groove.net, Groove is, in essence, an incredibly rich
component framework that defines a new collaborative application
abstraction. Because of our use of COM (and in particular the
scripting host stuff) you can build Groove applications in any of
dozens of COM-compliant languages such as VB. That said, the vast
majority of Groove applications (and 100% of all of the ones that
products' API's) for systems integration.
When a tool is implemented in Groove, it is running in an un-
sandbox'ed environment - for better and worse. Digitally-signed
dynamically downloaded componentry. Yes, you can get raw socket
access, roll your own, etc. Party on, provided that the ultimate
user allows the components to be downloaded to his or her device.
Regarding your last question - the answer is "it depends". It's
incredibly powerful and pliable in many dimensions ... but there is
an app development paradigm for building tools in the Groove
environment that defines the nature of the beast. For example, we
require you to build your app in a strict Smalltalk-esque MVC model.
The data model level components must run free-threaded (which makes
coding more interesting), while the higher-level UI stuff is
serialized to make life easier for the scripting-level programmer.
The entire product is asynchronous and event-driven, so you really
have to be familiar with that model of writing code or else you'll do
nasty and unsociable things to the rest of the system. And so on.
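
A minimal sketch of that split, in Python rather than Groove's actual COM interfaces (which are not shown here): a model that may be called from any thread guards its own state, and instead of touching the view it posts events to a queue that a single UI thread drains one at a time. The names CounterModel, ui_events and ui_loop are hypothetical and exist only for illustration.

```python
import queue
import threading


class CounterModel:
    """Model component: may be called from any thread, so it guards its state."""

    def __init__(self, notify):
        self._lock = threading.Lock()
        self._value = 0
        self._notify = notify  # callback used to post events toward the view

    def increment(self):
        with self._lock:
            self._value += 1
            value = self._value
        # Post an event instead of calling into the view directly.
        self._notify(value)


ui_events = queue.Queue()
rendered = []


def ui_loop():
    """View side: one thread handles events one at a time (serialized)."""
    while True:
        event = ui_events.get()
        if event is None:  # sentinel: shut the loop down
            break
        rendered.append(event)


model = CounterModel(ui_events.put)
ui_thread = threading.Thread(target=ui_loop)
ui_thread.start()

# Eight concurrent callers hit the model; the view never sees concurrency.
workers = [threading.Thread(target=model.increment) for _ in range(8)]
for w in workers:
    w.start()
for w in workers:
    w.join()

ui_events.put(None)
ui_thread.join()
```

The view code stays simple because it only ever runs on one thread; the cost is that the model author has to think about locking.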
Any plans for non-Windows ports?
We have lots of plans and dreams, but we've had little bandwidth to
pursue them. We have indeed made (and continue to make) a
substantial investment in Wine enhancements (via our talented
subcontractor, Macadamian) to keep it running on Linux, and are
steadily working to build componentry for that environment that
parallels some of the rich controls that we utilize in Windows, e.g.
using Mozilla as an alternative to MSIE. I'd love to do an OS X
port, but we just don't have the resources available right now.
Regarding these points:
* Information can be gathered relating to failure(s)
* Nonfunctional components can be repaired or replaced dynamically
* Remote diagnostics can be performed
* Automatic process cleanup and recovery
Seems to imply sandboxing or process control of some kind. True?
We are looking forward to building a protected execution environment
via CLR (and maybe JVM at some point) in order to make it easier to
distribute unsigned componentry. At this point, however, it's still
all digitally-signed componentry and all that that implies.
Related to failures, we have a very sophisticated subsystem (Customer
Service Manager, a.k.a. CSM) that gathers exception information and
sends it back up to the mother ship via SOAP. The software has the
capability of updating itself under a variety of conditions, but
at the current time we've got it throttled back to the level where
the user must push a button to get the updates initiated. Any
component in the system - of which there are thousands - can be
updated (except, of course, for a tiny loader responsible for
restarting if some low-level stuff needs to be updated).
Can anyone run a relay node? Are the specs for talking to one public?
Today we are running the only relays. We will indeed ultimately be
making the specs public, but the API isn't yet completely stable
because we're still learning how to scale this thing. Case in point:
we've just switched from static load balancing to an intense
clustered implementation, requiring some changes.
Architecturally, the system is designed to work like email: each
recipient has a designated [logical] relay server, to which senders
enqueue packets. There's no central directory of relays, and it'll
scale to the size of the net. It would be unlikely to do so, of
course, if we were the only ones running relays.
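
As a rough illustration of the email-like model (not the real relay protocol, whose specs are not public), the sketch below gives each recipient a designated relay and has senders enqueue packets there for later pickup. The Relay class, the send helper and the relay_of mapping are all hypothetical; in particular, how a sender learns a recipient's relay is an assumption here, modeled as a mapping the sender already holds rather than a central directory.

```python
from collections import defaultdict, deque


class Relay:
    """Hypothetical relay: holds per-recipient queues until the recipient fetches."""

    def __init__(self, name):
        self.name = name
        self.queues = defaultdict(deque)

    def enqueue(self, recipient, packet):
        self.queues[recipient].append(packet)

    def drain(self, recipient):
        """Return and clear everything queued for a recipient, oldest first."""
        pending = list(self.queues[recipient])
        self.queues[recipient].clear()
        return pending


def send(relay_of, recipient, packet):
    """Sender side: enqueue at the recipient's designated relay.

    relay_of is knowledge the sender already has about each recipient
    (an assumption of this sketch); no global directory is consulted.
    """
    relay_of[recipient].enqueue(recipient, packet)


relay_a = Relay("relay-a")
relay_b = Relay("relay-b")
relay_of = {"alice": relay_a, "bob": relay_b}

# Bob is offline; packets wait at his relay until he reconnects.
send(relay_of, "bob", "packet-1")
send(relay_of, "bob", "packet-2")
send(relay_of, "alice", "packet-3")

bob_inbox = relay_b.drain("bob")
```

Because each relay only knows about its own recipients, adding relays adds capacity without any shared state between them.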
Can I write a clone? Granted that this would be a mammoth task, what
aspects of the Groove architecture are not proprietary?
Honestly, I don't know how to answer that question. The simplest
answer is: due to the inherent (and necessary) complexity, it's not
likely to be practically doable, even if all of the source code were
public - which it is not.
Yes, you'd find it to be a mammoth task; we've got a bunch of pretty
damned good systems people who have been working on this for over
three years, and it's about two million lines of code, largely C++.
Just to give you an idea, the major subsystems that you'd have to
whip up would include things like:
- a persistent xml object storage subsystem with a "virtual memory"-
like API, with four underlying "service provider" implementations:
binary object store, native OS file system, ZIP, and MHTML.
- an xml object routing subsystem with semantics much akin to a
message queueing system, working pure peer or via relays
- an adaptive communications subsystem that works sociably in the
background, dynamically adjusts to local resource issues (e.g.
available sockets), dynamically detects and adapts to interface and
address changes, uses multiple adapters automatically and
concurrently even when some are "inside" and some are "outside" the
firewall, does automatic/dynamic proxy configuration, adapts to any
of a set of protocols automatically as needed to reach certain
destinations, etc etc etc
- a reliable transport system responsible for xml packet stream
ordering and robustness
- a LAN device presence subsystem for local device discovery (for
pure peer LAN environments)
- a WAN device presence subsystem for internet-wide device discovery,
with an efficient publish-and-subscribe services interface
- a security subsystem responsible for authenticating users in a peer
fashion, encrypting on-wire packets, encrypting persistent object
store data, etc
- a distributed transaction synchronization subsystem (e.g.
the "special sauce") that robustly creates the illusion of global
- a strict peer "awareness" system that securely enables peers (and
only peers) to get selected awareness info, with no centralized
- an application framework oriented around building "tools"
in "shared spaces" with "members", and all that that implies
...and the list goes on, and on, and on, and on. I haven't even
gotten up to the surface of the app yet, where there are hundreds of
UI components, etc...
If you look at Groove deeply, you'll understand that the product
could never have been built with such a clean conceptual model unless
we took a holistic architectural view. It took us many man-years to
figure out the "right" factoring in order to accomplish what we set
out to do at the UI. (The reason that we were in stealth mode for so
long was that we had no idea when we'd finally get it "right". It
took many, many subsystem rewrites and major architectural revisions
before the data flow was right, the security model worked, the config
dialogs could be eliminated, etc.)
But now that we have a factoring that works, the cool thing is that
we can now begin to *decompose* the app into chunks that may be
reusable by a broad variety of other apps. For example, the first
and most obvious thing to decompose is the xml message relaying
system. Another may be the xml object transport system, or the xml
object store. Factoring things *out* has two benefits: it lets other
people leverage both the design and the implementation, and it lets
us potentially substitute other similar layers as standards emerge.
There is a big split in the 'filesharing/document manipulation/synch'
world right now. On one side is "The PC is the server now, so
leave the file on disk and move the pointers to other users/machines."
The other is "The hard drive is the only way to guarantee low latency
and high availability, so synch the files themselves, rather than
Systems that move pointers are admirably lightweight but don't get
high availability in the case of network disconnection, planned or
Systems that move files have great availability, but suffer from
version control issues, and from the fact that for perfect synch each
node needs to hold
D(1) + D(2) + D(3) + ... + D(N) bytes of data, or
(D(1) + D(2) + D(3) + ... + D(N)) * N bytes across all N nodes,
where D(1) - D(N) are the amounts of data on 1-N synchronized
machines. This maps to
D(A) * N^2
where D(A) is the amount of data on the average node. Ruh roh.
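
A quick check of that arithmetic, reading the formula as the sum of every node's data multiplied by N (the total held across the whole system when every node keeps a full copy). The sizes are made-up numbers chosen only to keep the sums easy to follow.

```python
def total_bytes(per_node_bytes):
    """Total storage across all nodes when every node holds every node's data."""
    n = len(per_node_bytes)
    return sum(per_node_bytes) * n


# Hypothetical amounts of original data on four synchronized machines.
sizes = [100, 200, 300, 400]
n = len(sizes)
average = sum(sizes) / n      # D(A)
total = total_bytes(sizes)    # (D(1) + ... + D(N)) * N

# The two forms agree: total == D(A) * N^2
matches = total == average * n ** 2
```

Doubling the number of nodes at the same average size quadruples the total, which is the growth the question is worried about.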
Groove seems to have done a lot of work on the version control
issue. How are you dealing with the exploding cache size issue? Are
there plans for some data to reside centrally? Or to be held on local
drives until synch is explicitly requested (as opposed to implicit
requests, which is how Groove seems to work now)?
It's important to understand that Groove doesn't move or synchronize
data. Unlike products that replicate or synchronize *data*, Groove
is instead a distributed *transaction* system that, in essence,
distributes *method calls* as opposed to data. In our architecture,
the high-level components (model or view) create, in essence, XML
procedure call descriptions (much as you'd do in XML-RPC or SOAP).
These calls are treated as atomic transactions that are submitted
locally and distributed securely to other nodes in the same shared
space, where they're executed in parallel. (Of course, the tough job
is ensuring global transaction ordering, particularly in a
disconnectable environment. But I digress...)
All of that said, it is up to the user-level application to determine
precisely what to do with this distributed transaction
infrastructure. For example, one application might use it to
distribute sketchpad stroke directives, such as <draw-line
from="156,234" to="123,54"/>, another might <chat-append text="yo,
whazzzup?"/> or something.
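
To make the "distribute method calls, not data" idea concrete, here is a small sketch in which each node keeps its own copy of the model and changes it only by executing the same XML call descriptions. It reuses the two example elements above; the SketchpadReplica class and submit helper are hypothetical, and the sketch simply delivers every transaction to every node in the same order, which sidesteps the hard part (global ordering when nodes disconnect).

```python
import xml.etree.ElementTree as ET


class SketchpadReplica:
    """Hypothetical per-node model: state changes only by executing transactions."""

    def __init__(self):
        self.lines = []
        self.chat = []

    def execute(self, transaction_xml):
        call = ET.fromstring(transaction_xml)
        if call.tag == "draw-line":
            start = tuple(int(v) for v in call.get("from").split(","))
            end = tuple(int(v) for v in call.get("to").split(","))
            self.lines.append((start, end))
        elif call.tag == "chat-append":
            self.chat.append(call.get("text"))
        else:
            raise ValueError(f"unknown transaction: {call.tag}")


def submit(replicas, transaction_xml):
    """Deliver the same call description to every replica in the shared space."""
    for replica in replicas:
        replica.execute(transaction_xml)


nodes = [SketchpadReplica(), SketchpadReplica(), SketchpadReplica()]
submit(nodes, '<draw-line from="156,234" to="123,54"/>')
submit(nodes, '<chat-append text="yo, whazzzup?"/>')
```

Every node ends up with identical state even though no node ever shipped its state to another; only the small call descriptions crossed the wire.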
In Groove, the developer might (as we did) build a simple file
sharing tool that moves data by value. Ours uses the standard
Windows explorer control at the UI; when the user drags a file into
the tool, we generate a transaction such as <add-file name="foo.doc"
Another developer might (as we've prototyped) build a simple file
sharing tool that moves data by reference. For example, it could
again use the standard Windows explorer control at the UI, but it
might generate a transaction such as <add-file name="foo.doc"
thinkpad</present-at-endpoints></add-file>. In other words, the UI
(when hovering, for example) could display who has the file, where it
is, etc., and yet another transaction could be issued to fetch the
file. (We support endpoint-directed and role-directed transactions
So ... my weaselly way of getting out of your question is to say "have
it your way". Time will tell us which tools are best to move data by
value, which ones are best to move by reference, and what kind of UI
to slap onto the by-reference tools so that you can get the data that
you need when you need it.
(Oh - as you probably expect, we also have a "bot server"
architecture in which you can invite a server to be a proxy member.
It can do things such as systems integration, or could indeed also do
things such as being a proxy file repository. A mere matter of app-
I spotted on Dan Gillmor's "Hailstorm" writeup
<quote>Each separate demonstrator (eBay, Groove, American Express,
among others) had created a tab inside the Microsoft messaging
That surprised me. What was the Groove tab doing? Where do you see
the overlap / integration between Groove and Hailstorm (or with .NET
in more general terms)?
Groove was designed with a philosophy that Groove Networks, Inc., has
no business being a repository for your data unless you explicitly
choose to share it with us. This includes even such basic things as
your identity, your contacts, etc. (We're probably one of the only
systems around that does what you'd typically refer to as "user
awareness" without our servers having to know your identity or your
friends' identities.) It's pretty cool.
That said, even though Groove doesn't require you to list yourself in
some mega-directory in order to use the product, people sometimes
LIKE to be listed in directories, give out their contacts, etc., so
that others can find them to groove with them. It's for this reason
that we put in features such as "invite via email", so that you can
leverage the contact list that you already have in Outlook/etc.
What we did in our Hailstorm demo was conceptually very very simple:
We figured that people might like to 1) identify themselves to Groove
by using their Passport identity, for single-logon purposes and so
that other Passport users can recognize them, 2) store their contacts
inside the centralized Hailstorm contact repository, so that the same
contacts are accessible to Groove, Messenger, the Windows XP shell,
and any other app that wishes to use them, and 3) we integrate a
little applet into the Messenger UI so that you can see at a glance
what shared spaces people are actively working within, etc., in a
manner consistent with the display of the Messenger buddy list, AND
so that in two clicks you can invite someone to a Groove shared space
without ever leaving Messenger. Etc.
The main user benefit of Hailstorm as we're using it is to put the
user in control of their own data in an app-independent and device-
independent fashion. I think that for people who use Messenger
and/or Windows XP, it'll provide a very convenient way to launch into
Groove and to stay aware of what's going on in Groove. Let there be
no doubt that Hailstorm is about the most centralized model that you
can find, whereas Groove is probably pretty close to the other end of
the spectrum. This integration simply shows that you can indeed get
user benefit by "appropriate" integration of the center and the edge.
Said Ray in email:
> The product is structured as both a solution platform and
> an end-user application
I spent a while wondering about what a solution platform is, and why
a solution platform would need to exist at all.
I think that a solution platform means a very high-level API, with
templates for network architecture on top of the normal GUI builder.
Groove provides a soup-to-nuts environment, encapsulating the
environment at a higher level than Java or even Windows.
First, there is a lot of creative work at a low level. Persistence
based on local XML storage(*), with transaction support in a non-SQL
environment, with on-disk encryption. The relay hub, with multi-node
coordination and store-and-forward support for transient nodes.
Simple symmetrical transmission protocol; a proprietary protocol I
believe. All this stuff together encapsulates the conceptual
Second, there are all the trimmings. A skinnable UI. A bunch of
tightly integrated apps - the IM app, shared spaces app, graphics
app, etc. Ensemble browsing. The relay hub is self configuring.
Components can be repaired or replaced dynamically and, I assume,
remotely. All this stuff together encapsulates the working
Groove simplifies development in a heterogeneous world by making a
homogeneous world. There is less need to think about missing
components, mismatched DLLs, firewall configuration, or any other
local oddities, hence multiple nodes can be coordinated more easily.
Is that a reasonable reading, Ray?
A good description, but I take slight issue with the last paragraph.
Yes, we do indeed make development simple for our own environment.
The kind of "eversync" synchronization that we provide is incredibly
difficult to achieve robustly, and it was our goal to create an
environment in which you could build some very useful stuff very
easily that was automatically synchronized across many users and all
of their devices.
But we live in a heterogeneous world (computing and otherwise), and
to view it otherwise would be to have blinders on. From day one, we
designed Groove to fit into a heterogeneous world by embracing
others' components, by integrating first XML-RPC and then SOAP, by
embracing XML bottom-to-top, and so on. We've spent a good deal of
time and effort recruiting and training systems integrators and
consultants. To the best of my knowledge, virtually every Groove
developer that we've been in contact with is doing some form of
systems integration. They don't just do a simple tool; rather, they
build tools in Groove that do something special in conjunction with
their other systems, either via bots or via other methods of
interconnection such as direct client-side SOAP calls.
Based upon what I've seen already, I know that we'll be successful in
building a system that will be of real and substantive value to our
customers (who today are enterprise customers). But I also hope that
ultimately both the architecture and implementation of this radically-
distributed system will cause systems designers and developers to
step back, gasp, and realize that there is truly life after the 70's
OS model, the 80's GUI model, and the 90's Web model. And it's a
(Not to be quoted/forwarded outside of this mailing list without
explicit permission, please. Thanks.)