Gene Kan & Mike Clary on Sun's Infrasearch Buy
- Very interesting reading over at O'Reilly's OpenP2P.com in this interview of
Gene Kan & Mike Clary on Sun's acquisition of Infrasearch. Sheds (first?)
real light on what Infrasearch has been up to and what JXTA will be about.
Also has gratuitous bagging on the "Intel working group".
- Lindsey Smith pointer to Gene Kan who writes:
> The basic idea is that Infrasearch is able to effectively
> turn all of the computers on a network into a collective
> brain, if you will, in disseminating the information that is
> available on each of those computers. And that's
> something that's really unique when compared to the
> World Wide Web. On the Web, the hosts of
> information are in fact treated as second-class citizens
> when it comes to answering requests based upon the
> information that is located on each Web host.
There is nothing inherent in HTTP, WebDAV, SOAP, or XML that is
essentially client-server. It just so happens that the most widely
deployed software architecture of the Web is client-server, but that
doesn't mean that it can't be peer to peer also. If the host
of the information is a Web server running on a "client", it's
not a second class citizen--it's a first class Web server. Even
more desirable is if you make that Web server not just a server,
but a writer/service. You use existing Web protocols to provide
a two-way, writable, first-class citizen.
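As a rough illustration of a peer that is both readable and writable over plain HTTP, here is a minimal sketch using only Python's standard library. The names `PeerHandler` and `start_peer` and the in-memory `store` are assumptions made up for this example; a real peer would sit on top of actual files and would want authentication before accepting writes.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class PeerHandler(BaseHTTPRequestHandler):
    """Hypothetical peer: serves resources on GET and accepts them on PUT."""

    store = {}  # path -> bytes; in-memory stand-in for the peer's shared files

    def do_GET(self):
        body = self.store.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_PUT(self):
        # Writable side: store whatever the other peer sends at this path
        length = int(self.headers.get("Content-Length", 0))
        created = self.path not in self.store
        self.store[self.path] = self.rfile.read(length)
        self.send_response(201 if created else 204)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def start_peer(port=0):
    """Run the peer's server on a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer(("127.0.0.1", port), PeerHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_peer()` returns the server object; its `server_address` gives the port another peer would PUT to or GET from.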
> And by
> that I mean that the information that is residing on each
> host must first be interpreted by a crawler and so on
> before any kind of questions can be answered about
> that information.
There's a difference between Web crawling and Web (content) indexing.
This again is describing a traditional, centralized Web search engine. There's
no reason why you can't decentralize the indexing. In fact, one of the
best approaches is to take the best p2p search techniques and add
in decentralized indexing--making every peer a search engine indexer.
Our p2p search consists of pushing metadata to a central repository, where that
metadata includes indexed keywords and search criteria in addition to the
Napsteresque filename metadata. It combines Web search with p2p search.
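To make the "filename metadata plus indexed keywords" idea concrete, here is a small sketch of a central repository that peers push to. This is an illustration under assumptions, not our actual implementation: the `MetadataIndex` class, its method names, and the simple tokenizing of filenames are all invented for the example.

```python
from collections import defaultdict


class MetadataIndex:
    """Hypothetical central repository: peers push metadata, searches hit one index."""

    def __init__(self):
        self.postings = defaultdict(set)   # term -> {(peer, path)}
        self.terms_by_resource = {}        # (peer, path) -> terms from the last push

    def push(self, peer, path, keywords):
        key = (peer, path)
        # Drop whatever the previous push indexed for this resource
        for term in self.terms_by_resource.get(key, ()):
            self.postings[term].discard(key)
        # Filename tokens (the Napster-style part) plus content keywords
        filename = path.rsplit("/", 1)[-1].lower()
        terms = set(filename.replace(".", " ").replace("_", " ").split())
        terms.update(k.lower() for k in keywords)
        for term in terms:
            self.postings[term].add(key)
        self.terms_by_resource[key] = terms

    def search(self, query):
        """Return the (peer, path) pairs that match every term in the query."""
        sets = [self.postings.get(t, set()) for t in query.lower().split()]
        return set.intersection(*sets) if sets else set()
```

The keyword extraction itself happens on each peer; the repository only stores what it is handed, which is what keeps the indexing work decentralized.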
> That doesn't work in a peer-to-peer world, for at least
> two reasons. The first is that peer networks are
> extremely transient
You tie in the Web server with a dynamic naming service such that
any "client" or peer can use a URI to determine whether or not
that Web server is running and (even better) the name of that Web
server is fixed regardless of whether it's at work behind a firewall, at
home on a DSL connection, or on the road using a dialup. So, put a check
next to the transient column.
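One way to picture such a dynamic naming service is a registry that maps a fixed peer name to wherever that peer last announced itself. The sketch below is only an assumption about how this could be structured: `NameRegistry`, the TTL-based liveness check, and the example addresses are hypothetical.

```python
import time


class NameRegistry:
    """Hypothetical naming service: fixed peer name -> current Web server URI."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self.ttl = ttl        # seconds after which a silent peer counts as offline
        self.clock = clock    # injectable so the behaviour can be tested
        self.entries = {}     # name -> (host, port, last_seen)

    def announce(self, name, host, port):
        """Called by a peer whenever it comes online or its address changes."""
        self.entries[name] = (host, port, self.clock())

    def resolve(self, name):
        """Return the peer's current base URI, or None if it looks offline."""
        entry = self.entries.get(name)
        if entry is None:
            return None
        host, port, seen = entry
        if self.clock() - seen > self.ttl:
            return None
        return f"http://{host}:{port}/"
```

A peer named `peer-a` that announces from work and later from a dialup line keeps the same name throughout; only the URI that `resolve` returns changes.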
> and the information available on
> those networks is constantly changing not only
> because the computers are appearing and disappearing
> all of the time, but because the information itself is
> changing at a much more rapid rate.
Push the metadata up a) every time it reconnects to the network
(either automatically or by choice), and/or b) every time it
changes through any file/resource operation. Put a check next
to the rapidly changing and disappearing column.
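The two push triggers can be sketched on the peer side roughly as follows. `MetadataPublisher` and the `push` callable it is given are assumptions for illustration; in practice `push` would be an HTTP request to the repository.

```python
class MetadataPublisher:
    """Hypothetical peer-side helper that decides when metadata gets pushed."""

    def __init__(self, peer, push):
        self.peer = peer
        self.push = push        # callable(peer, path, metadata), e.g. an HTTP POST
        self.resources = {}     # path -> latest metadata known locally
        self.online = False

    def connect(self):
        """(a) On every reconnect, push everything so the repository catches up."""
        self.online = True
        for path, meta in self.resources.items():
            self.push(self.peer, path, meta)

    def disconnect(self):
        self.online = False

    def changed(self, path, meta):
        """(b) On every file/resource operation, push just that resource."""
        self.resources[path] = meta
        if self.online:
            self.push(self.peer, path, meta)
```

Changes made while offline are not lost; they are held locally and go up with the next `connect()`.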
> And the second
> thing is that on a peer network it's important to treat
> every host as a first-class information provider, because
> the key idea behind peer computing is that each node in
> the network has the possibility to make a very
> important contribution to the network as a whole.
What's more first class than a Web server? You can plug
almost anything into it, cache, proxy, authentication, registration,
security, e-commerce, collaboration apps, databases, etc., etc. When
you get past wanting to just provide information to wanting to
be able to provide writable information, security, e-commerce,
filtering, caching, database integration, collaboration apps, etc, etc.
you are going to need some serious technologies. 30 million Web
servers and 400 million Web clients can't be *that* wrong, can they?
The implementations are not only scalable to big
centralized things--they scale really well to super-small things. There are
at least two dozen web-server-on-a-match-tip projects out there.
So, have you guys read Bill Joy's 6 Webs argument? Maybe he
should add the P2P Web to the list as the 7th.