When something new comes along, we tend to describe what it is. If it’s something important, it takes a while to figure out why it’s important – what it is that is really different. The description of what something is tends to be somewhat dry and technical, and it misses the point. For example, a telegraph is an encoder and a decoder in an electric circuit. But couriers and semaphores involve coders and decoders, and other stuff has had electric circuits. What was important about the telegraph was that it provided instantaneous long-distance communication. This is also what was important about its successors like the telephone and radio, even though the descriptions of what each is are quite different from that of the telegraph. It’s not as simple as describing what a new invention does for people. Quite often we don’t know how it will be used.

Since I first heard about Croquet, I’ve been trying to figure out what is really important about the immersive 3D that everyone first notices about it. I think I now have an idea. It turns out that the “immersive” part is key.

Other stuff has 3D. Games often have 3D displays of the action, even if they don’t put you in the center of it. Same for many kinds of simulations. But why would you want your operating system to be like that? Some folks like the idea of using game metaphors for managing your computer. You kill processes, perhaps by shooting them. So what? I don’t even enjoy playing video games. Well, OK, a more intuitive, perhaps even more fun interface is a darn good thing, and that’s what makes the telephone better than the telegraph. The telephone has had a bigger impact than the telegraph because usability and scalability (partially a function of usability) are important. But what really fundamentally mattered in both was the breakthrough of instantaneous communication, not incremental usability benefits.

Before Jasmine was available to download, I tried out some 3D desktop environments. Instead of a flat desktop, these toys let you keep your desktop icons in a 3D space. You can move around in the space and position those icons anywhere within it. So you can fit a lot more crap on your screen because you can layer the stuff in 3D. You have a closet instead of a desktop. Big deal. The extra inconvenience in reaching for stuff isn’t worth the added space.

In developing the Brie UI framework for Croquet, I’ve been forced to think about how Croquet is different from the overlapping window interface in Windows or Mac OS X. I hadn’t realized how much of the design of these desktops is built around frames that hold the application components, and how these frames are divided by application. (These frames are called “windows”, but you don’t look through them and you don’t open them to let stuff through. What an odd name!) Maybe you can manipulate stuff within a window. Very occasionally, in sophisticated applications in sophisticated windowing systems, you might be able to move something between one window and another, as long as both windows are of the same application. But there’s no way you can move some application component from one application to another. Oh sure, sometimes you can sort of fake it by dragging an icon from the desktop to an application, or even between applications, but it’s a hollow manipulation. You’re not really moving application objects from one to the other. At best, one application serializes some aspect of the object, the bits are copied, and the other application deserializes the shadow.
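
To make the “shadow” point concrete, here is a minimal sketch in Python. It is not Croquet or Brie code. The `Shape` class and the two imagined applications are hypothetical, made up purely for illustration. It contrasts a serialize-and-copy transfer with two parties simply holding the same live object.

```python
import json


class Shape:
    """Hypothetical application object with some state and behavior."""

    def __init__(self, name, x):
        self.name = name
        self.x = x

    def move(self, dx):
        self.x += dx


# "Application A" owns the live object.
original = Shape("cube", 0)

# Drag-and-drop between windows, roughly: A serializes some aspect of the
# object, the bits are copied, and B deserializes its own shadow.
payload = json.dumps({"name": original.name, "x": original.x})
shadow = Shape(**json.loads(payload))

# A shared object space, roughly: B just holds the very same object.
shared = original

# A now changes its object.
original.move(5)

print(shadow.x)  # 0: the shadow never hears about the change
print(shared.x)  # 5: the shared object is the changed object
```

The shadow is a snapshot that drifts away from the thing it was copied from, while the shared reference is the thing itself. That, I think, is the difference between faking it and really bringing objects together.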

The design of some operating systems prevents – or at least makes very difficult – the direct transfer of objects between applications. But, for example, the vaunted Lisp machine had a single memory space for all applications, yet it still had a framed window interface (before the Macintosh!), segregated by application. And it didn’t generally allow you to intermix components between applications through the UI. You could do so directly in Lisp, but there was no 3D UI that let you walk around surrounded by objects from multiple applications. There were applications with 3D windows, but there was no 3D between the applications.

The effect of the flat window UI is that you can’t immerse yourself in the application. You can only look in on it as though through a window. (So maybe the name is more subtly appropriate than I had realized.) There aren’t many applications that are significantly hampered by this in themselves. Maybe simulations would be better if you could walk around amongst the actors and observe things from different points of view. But by and large the applications we use on computers are those for which the lack of immersion is not a deal-breaker.

Immersion is not fundamentally important for what happens within an application, but for what happens between applications. You don’t just immerse your virtual self in one application and walk around among its objects. You immerse yourself in several applications and walk around among all the objects simultaneously. These objects are not separated by windows and they don’t have title bars. You can’t even necessarily tell which objects belong to which application. And so you start to bring different things together. A presentation is a virtual stroll through a combination of artifacts that were produced by entirely different applications. A learning environment can combine any media in any way. Even objects that interact with other objects (such as in a simulation) can be asked to do so with objects that came from different applications.

TeaTime creator David Reed describes Croquet as enabling cooperation at all levels. I had at first thought of this in terms of the network cooperation between peers and the temporal cooperation between Croquet worlds, and then I jumped to the touchy-feely cooperation of human beings working in a collaborative environment. I am now beginning to realize that the immersive environment enables cooperation between applications. And remember that these applications are distributed over the Internet as active worlds waiting to be explored. This is as profound as the cooperation – the intercourse, the commerce – enabled by immersing cars and trucks and all manner of people in a road and highway system. By contrast, current Internet object communication protocols are more like a standard railway gauge. That kind of cooperation is good, particularly if you can position yourself to benefit sufficiently from an engineered solution restricted to the places you can run track. Immersive and undifferentiated cooperation is decentralized and free, and is mathematically far more scalable.