Maybe we don’t have to choose between persistence and spontaneity, between synchronous and asynchronous.
Consider threaded blogs vs. live chat. The former has a permanent record and structure, and people can collaborate asynchronously, but it lacks spontaneous give and take. The latter is spontaneous, but lacks permanence (unless someone saves a transcript, manages a collection of transcripts, and shares them with others). But do these have to be different mechanisms? Maybe we can have everything with a single interface, so people don’t have to choose?
For example, what if threaded discussion were live? Suppose you could see each message part as it is added, or even as each character is typed. People could still examine the threads later and add to them in the usual way. Maybe the UI could let you focus on some part of the thread, breaking off into a separate subconversation even as the main conversation continued. This is just like having a side conversation at a party.
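To make that concrete, here is a toy sketch in Python (every class and name is my own invention for illustration, not anything from Croquet): a conversation is a tree of messages, every keystroke is an event, and watching a branch is the software equivalent of drifting into that side conversation at the party.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

@dataclass
class Message:
    """One node in the conversation tree; its text grows as the author types."""
    author: str
    text: str = ""
    parent: Optional["Message"] = None
    replies: List["Message"] = field(default_factory=list)

class LiveThread:
    """A threaded discussion whose edits are pushed to watchers as they happen."""

    def __init__(self) -> None:
        self.root = Message(author="", text="(root)")
        self._watchers: Dict[int, List[Callable[[Message, str], None]]] = {}

    def watch(self, node: Message, callback: Callable[[Message, str], None]) -> None:
        """Watch a node: you hear about everything typed in that subtree."""
        self._watchers.setdefault(id(node), []).append(callback)

    def reply(self, parent: Message, author: str) -> Message:
        msg = Message(author=author, parent=parent)
        parent.replies.append(msg)
        self._notify(msg, "started a reply")
        return msg

    def type(self, msg: Message, chars: str) -> None:
        """Every keystroke is an event; latecomers still see the final text."""
        msg.text += chars
        self._notify(msg, f"typed {chars!r}")

    def _notify(self, msg: Message, what: str) -> None:
        # Walk up the tree so watchers of any enclosing branch are told too.
        node: Optional[Message] = msg
        while node is not None:
            for cb in self._watchers.get(id(node), []):
                cb(msg, what)
            node = node.parent

# Two people watching different parts of the same live conversation:
thread = LiveThread()
topic = thread.reply(thread.root, "howard")
thread.watch(topic, lambda m, w: print(f"[whole topic] {m.author} {w}"))
thread.type(topic, "What if threads were live?")
aside = thread.reply(topic, "alice")                    # a side conversation
thread.watch(aside, lambda m, w: print(f"[aside only] {m.author} {w}"))
thread.type(aside, "Let's break off and explore this.")
```

The point is just that "live" and "threaded" fall out of the same structure: latecomers read the tree, while people who are present hear the events.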
Web technologies can’t really support this. HTTP is based on making a request and getting an answer. Chat is a separate application that works differently, though current chat implementations can’t really connect large numbers of people into a single chat the way Web-server-based threaded discussion applications can. But there’s no reason to limit ourselves to that. Constructing HTTP Web applications is not childishly easy in any case. Trying to force HTTP to accomplish this mix of threaded and live conversation would really be a kludge, and just plain silly. I’d like to shoot for an environment in which creating applications is childishly easy, and in which it is actually easier to create a good application than a silly broken one.
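Here is the shape of the mismatch as a toy sketch, with invented names rather than real networking code: under request/response the reader has to keep asking whether anything happened, while a live medium registers interest once and hears each message as it is posted.

```python
from typing import Callable, List

class RequestResponseServer:
    """HTTP-style: the server only speaks when spoken to."""
    def __init__(self) -> None:
        self.messages: List[str] = []

    def get_messages_since(self, index: int) -> List[str]:
        return self.messages[index:]

class PushServer:
    """Chat-style: each message is delivered the moment it exists."""
    def __init__(self) -> None:
        self.subscribers: List[Callable[[str], None]] = []

    def post(self, text: str) -> None:
        for deliver in self.subscribers:
            deliver(text)

# Request/response: the reader polls, trading lag against wasted requests.
polled = RequestResponseServer()
polled.messages.append("hello?")
seen = 0
for _ in range(3):                       # imagine this loop running forever
    fresh = polled.get_messages_since(seen)
    seen += len(fresh)
    for text in fresh:
        print("poll got:", text)

# Push: the reader registers once and hears each message as it happens.
live = PushServer()
live.subscribers.append(lambda text: print("push got:", text))
live.post("hello!")
```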
Before we look at how this would be done, let’s add one more wrinkle. Why limit the conversation to text? Text can be nicer than voice because it can be absorbed without disturbing those nearby. It can also be nicer than voice because it can be searched and scanned easily by both people and computers. I believe voice’s limitations can and will be overcome, by earphones and by summarization based on speech recognition. And maybe these issues aren’t important in the first place; say, to a non-reading child, or to an adult in a car or on a walk in the woods. I think that just as we can have both synchronous spontaneity and asynchronous permanence, we can have both text and speech interfaces.
So, how would this work? We would need (sketched in code after the list):
1. a way for anyone to create their own space to meet, with access that they control (or leave open)
2. a way to refer to that space for invitations
3. a way to represent a person
4. a way to deliver invitations to people: immediately to folks online, but eventually to those offline as well
5. a way to represent ideas, created easily and spontaneously as text or voice or video, or whatever media or other object you like
6. a way to attach one idea to another
7. a framework for building new kinds of ideas (including songs, presentations, virtual whiteboards…), or new ways of creating them (including, perhaps, as text from speech, or the other way around), or new ways of looking at them (including visual and UI design, and skinning)
8. a way to discover new kinds of ideas, new ways of creating them, or new ways of looking at them, and to use those new things as you wish
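Read as a data model, the list might look something like the sketch below. This is my own illustration, not Croquet’s actual architecture; every class and function name is invented, and the numbers in the comments refer to the items above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class Person:                                   # 3: a way to represent a person
    name: str
    online: bool = False
    inbox: List["Invitation"] = field(default_factory=list)

@dataclass
class Idea:                                     # 5: an idea in any medium
    author: Person
    kind: str                                   # "text", "voice", "video", ...
    content: object
    attachments: List["Idea"] = field(default_factory=list)

    def attach(self, other: "Idea") -> None:    # 6: attach one idea to another
        self.attachments.append(other)

@dataclass
class Space:                                    # 1: a meeting space you control
    owner: Person
    members: List[Person] = field(default_factory=list)
    open_access: bool = False
    ideas: List[Idea] = field(default_factory=list)

    def address(self) -> str:                   # 2: something to put in invitations
        return f"space://{self.owner.name}/{id(self):x}"

@dataclass
class Invitation:                               # 4: delivered now, or queued for later
    space: Space
    sender: Person

def invite(space: Space, sender: Person, guest: Person) -> None:
    note = Invitation(space, sender)
    if guest.online:
        print(f"{guest.name} is online: presenting {space.address()} immediately")
    guest.inbox.append(note)                    # offline folks see it eventually

# 7 & 8: a registry where new kinds of ideas, and new ways of viewing them,
# can be published and then discovered by anyone who wants to use them.
idea_kinds: Dict[str, Callable[[Idea], str]] = {}

def register_kind(name: str, viewer: Callable[[Idea], str]) -> None:
    idea_kinds[name] = viewer

def view(idea: Idea) -> str:
    return idea_kinds.get(idea.kind, lambda i: repr(i.content))(idea)

register_kind("text", lambda i: str(i.content))

# A tiny walkthrough of the pieces working together:
howard = Person("Howard", online=True)
gary = Person("Gary", online=True)
party = Space(owner=howard, members=[howard])
invite(party, howard, gary)
question = Idea(howard, "text", "Is that it?")
party.ideas.append(question)
print(view(question))
```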
Is that it? Have I captured the fundamental ideas of social discourse? What do you think?
If this is done in a way that anyone can use (e.g., a child), then anyone can partake in any social discourse (with permission, if the discourse is so protected). If people add ways to interact with this from truly mobile computing devices, then any social discourse can be accessed at any time from any place.
In Croquet, we’re still working on access control and wide-area collaboration. Heck, we’re barely getting started! Modulo these, we have already at least prototyped all but 3, though not in the most general or extendible way, and we haven’t put it together. Hmm, maybe it’s time to build a proof of concept demo.
Reactions?
There’s a whole bunch of stuff that deals with very small subsets of the above: telephones, line-open/push-to-talk “phones” (aka walkie-talkies), party lines/conference calls/CBs, video conferencing, social software, threaded discussion software, chat, video chat, VoIP software, file sharing, conference software, distance learning, collaboration software, email, newsgroups… I’m not sure where it ends! It might be nice to have software that bridges between these and the dream I’m describing here. For example, the chat interface built into Mac OS seamlessly talks to folks on AIM, Jabber, MSN, ICQ, and Yahoo. It’s all the same concepts, even if the implementation is different. The Mac software just takes your designation of what your friend is using and handles the yucky stuff without forcing you to start up some separate software. At this point, I’m not too concerned with adoption of Croquet. (It’s a science project, not a big VC-backed thing.) So, for example, it wouldn’t be high on my list to build a bridge between this and the various chat services. Email, on the other hand, might be worth it, and might be used for the offline invitations…
On the plane to this conference, I was trying to think of a name for this. There’s a presentation tomorrow by HP on something called Media Messenger. I wonder if that’s about this? My boss tells me that what I’ve been discussing goes back to an old idea of Alan Kay’s called metamedia. Can’t wait for the conference to start!
Two thoughts: I’ve never been a hardcore protocol guy, and, as you and Gary can attest, I’m really stunningly technologically retarded given what I do for a living. But even I have come up against the hard limits of HTTP. Something is going to have to give before the N+2 level of web application greatness can arrive.
(N+1, I really believe, is going to be RIAs, with Laszlo leading the way.)
Secondly, I think you’re going to have to come up with some nifty real-time navigation heuristics, or else the conversation will devolve into what my 17-year-old daughter does: 5 or 7 chat windows going at once, in each of which the conversation is inane because everybody is scrambling to get in a quick, quick message before the current one scrolls off the screen. In this regard I remember recently seeing a cursor-driven word-completion program (maybe this was a post of yours here on wetmachine? I forget) that used statistical inference to present choices for next letters according to the likelihood that they came next. The effect was not unlike sledding down a hill.
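For the curious, the statistical trick being described boils down to something like this (a tiny Python sketch, not the actual program; the function names are mine): count which letters tend to follow which, then offer the likeliest continuations first.

```python
from collections import Counter, defaultdict

def build_model(corpus: str) -> dict:
    """Count, for each letter, which letters tend to follow it."""
    following = defaultdict(Counter)
    for current, nxt in zip(corpus, corpus[1:]):
        following[current][nxt] += 1
    return following

def rank_next_letters(model: dict, current: str, top: int = 5) -> list:
    """Offer the most likely continuations first, like sledding downhill."""
    counts = model.get(current, Counter())
    total = sum(counts.values()) or 1
    return [(letter, count / total) for letter, count in counts.most_common(top)]

model = build_model("the quick brown fox jumps over the lazy dog " * 3)
print(rank_next_letters(model, "t"))   # 'h' should dominate after 't' here
```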
You’ll need to find some kind of similar sledding navigation, or all you’ll end up with is a debased discourse; that’s my guess.
Stumbled upon this blog today
http://www.fiveacross.com/b…
And thought I found some guy channeling Howard Stearns.
A good writer and thoughtful observer of the types of things Howard discusses in the Inventing the Future series. The guy does seem to have an ego, but that’s all right. Blog writers who don’t have strong opinions can be pretty boring. And besides, he’s a CEO, so whaddayou expect?
Remind me to write about user interfaces as the new field for AI application (and funding), and what I saw to that end in Kyoto. But one tidbit for now: there’s a huge amount of context available in an immersive virtual space that was created specifically for a task by the people doing that task.