What Is It About Immersive 3D?

When something new comes along, we tend to describe what it is. If it’s something important, it takes a while to figure out why it’s important – what it is that is really different. The description of what something is tends to be somewhat dry and technical and it misses the point. For example, a telegraph is an encoder and a decoder in an electric circuit. But couriers and semaphores involve coders and decoders, and other stuff has had electric circuits. What was important about the telegraph was that it provided instantaneous long-distance communication. This is also what was important about its successors like the telephone and radio, even though the descriptions of what each is are quite different than that of the telegraph. It’s not as simple as describing what a new invention does for people. Quite often we don’t know how it will be used.

Since I first heard about Croquet, I’ve been trying to figure out what is really important about the immersive 3D that everyone first notices about it. I think I now have an idea. It turns out that the “immersive” part is key.

Other stuff has 3D. Games often have 3D displays of the action, even if they don’t put you in the center of it. Same for many kinds of simulations. But why would you want your operating system to be like that? Some folks like the idea of using game metaphors for managing your computer. You kill processes, perhaps by shooting them. So what? I don’t even enjoy playing video games. Well ok, a more intuitive, perhaps even more fun interface is a darn good thing, and that’s what makes the telephone better than the telegraph. The telephone has had a bigger impact than the telegraph because usability and scalability (partially a function of usability) are important. But what really fundamentally mattered in both was the breakthrough of instantaneous communication, not incremental usability benefits.

Before Jasmine was available to download, I tried out some 3D desktop environments. Instead of a flat desktop, these toys let you keep your desktop icons in a 3D space. You can move around in the space and position those icons anywhere within it. So you can fit a lot more crap on your screen because you can layer the stuff in 3D. You have a closet instead of a desktop. Big deal. The extra inconvenience in reaching for stuff isn’t worth the added space.

In developing the Brie UI framework for Croquet, I’ve been forced to think about how Croquet is different from the overlapping window interface in Windows or Mac OSX. I hadn’t realized that a lot of the design of these desktops is built around frames that hold the application components, and how these frames are divided by application. (These frames are called “windows”, but you don’t look through them and you don’t open them to let stuff through. What an odd name!) Maybe you can manipulate stuff within a window. Very occasionally, in sophisticated applications in sophisticated windowing systems, you might be able to move something between one window and another, as long as both windows are of the same application. But there’s no way you can move some application component from one application to another. Oh sure, sometimes you can sort of fake it by dragging an icon from the desktop to an application, or even between applications, but it’s a hollow manipulation. You’re not really combining application objects from one to the other. At best, one application serializes some aspect of the object, the bits are copied, and the other application deserializes the shadow.
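To make the “shadow” point concrete, here is a minimal sketch of what that kind of drag amounts to. The `Shape` class and the `drag_between_apps` function are hypothetical names made up for illustration, and JSON stands in for whatever wire format a real pair of applications might use. Nothing here is Croquet or Brie code.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Shape:
    """Hypothetical application object (an assumption, not a real API)."""
    name: str
    points: list = field(default_factory=list)


def drag_between_apps(obj: Shape) -> Shape:
    """Mimic a cross-application drag: serialize, copy the bits, deserialize."""
    payload = json.dumps(asdict(obj))      # source application serializes the object
    return Shape(**json.loads(payload))    # target application rebuilds its own copy


original = Shape("triangle", [[0, 0], [1, 0], [0, 1]])
shadow = drag_between_apps(original)

# The shadow starts out equal in value, but it is a different object.
same_value = shadow == original
same_object = shadow is original

# A later change in the source application never reaches the shadow.
original.points.append([1, 1])
still_in_sync = shadow == original
```

The two applications end up holding look-alikes rather than one shared object, which is why the manipulation feels hollow.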

The design of some operating systems prevents – or at least makes very difficult – the direct transfer of objects between applications. But, for example, the vaunted Lisp machine had a single memory space for all applications, yet it still had a framed Window interface (before the Macintosh!), segregated by application. And it didn’t generally allow you to intermix components between applications through the UI. You could do so directly in Lisp, but there was no 3D UI that let you walk around surrounded by objects from multiple applications. There were applications with 3D windows, but there was no 3D between the applications.

The effect of the flat window UI is that you can’t immerse yourself in the application. You can only look in on it as though through a window. (So maybe the name is more subtly appropriate than I had realized.) There aren’t many applications that are significantly hampered by this in themselves. Maybe simulations would be better if you could walk around amongst the actors and observe things from different points of view. But by and large the applications we use on computers are those for which the lack of immersion is not a deal-breaker.

Immersion is not fundamentally important for what happens within an application, but for what happens between applications. You don’t just immerse your virtual self in one application and walk around among its objects. You immerse yourself in several applications and walk around among all the objects simultaneously. These objects are not separated by windows and they don’t have title bars. You can’t even necessarily tell what objects belong to what application. And so you start to bring different things together. A presentation is a virtual stroll through a combination of artifacts that were produced by entirely different applications. A learning environment can combine any media in any way. Even objects that interact with other objects (such as in a simulation) can be asked to do so with objects that came from different applications.

TeaTime creator David Reed describes Croquet as enabling cooperation at all levels. I had at first thought of this in terms of the network cooperation between peers, the temporal cooperation between Croquet worlds, and then I jumped to the touchy-feely cooperation of human beings working in a collaborative environment. I am now beginning to realize that the immersive environment enables cooperation between applications. And remember that these applications are distributed over the Internet as active worlds waiting to be explored. This is as profound as the cooperation – the intercourse, the commerce – enabled by immersing cars and trucks and all manner of people in a road and highway system. By contrast, current Internet object communication protocols are more like a standard railway gauge. This cooperation is good, particularly if you can position yourself to sufficiently benefit from an engineered solution restricted to the places you can run track. Immersive and undifferentiated cooperation is uncentralized and free and is mathematically far more scalable.

About Stearns

Howard Stearns works at High Fidelity, Inc., creating the metaverse. Mr. Stearns has a quarter century of experience in systems engineering, applications consulting, and management of advanced software technologies. He was the technical lead of University of Wisconsin's Croquet project, an ambitious project convened by computing pioneer Alan Kay to transform collaboration through 3D graphics and real-time, persistent shared spaces. The CAD integration products Mr. Stearns created for expert system pioneer ICAD set the market standard through IPO and acquisition by Oracle. The embedded systems he wrote helped transform the industrial diamond market. In the early 2000s, Mr. Stearns was named Technology Strategist for Curl, the only startup founded by WWW pioneer Tim Berners-Lee. An expert on programming languages and operating systems, Mr. Stearns created the Eclipse commercial Common Lisp programming implementation. Mr. Stearns has two degrees from M.I.T., and has directed family businesses in early childhood education and publishing.


  1. As to describing technologies vs. describing their importance: I remember a funny scene from Tracy Kidder’s “Soul of a New Machine” (about the Data General MV8000). After a gruelling long work stint, some of the fellows go out for beers at a local dive, and Kidder asks them to get philosophical about “what is a computer.” They finally agree that a computer is “a device that can successfully execute the Eclipse machine set diagnostic.”

    I still think that’s laugh-out-loud funny, and a pure example of geek humor. It’s the kind of thing that non-geeks don’t even realize is a joke.

    (My first job in the computer biz was at Data General in 1980, where I worked very briefly on the first documentation set for the MV8000. . .)

    As to cooperating applications, it reminds me of the stuff that Geoff Arnold was talking about on that evening that he and I went to dinner with Hofstadter and Dennett, as described in an earlier post, the link for which I’m too lazy to post (“Mindful of Philosophy”).

  2. Note to self: figure out where Nucleus has settings for turning on or off the computer’s discretion for turning punctuation into smiley faces.

    Turn it off.

  3. Yeah, yeah, yeah. OK, I turned off the smilies.


    Don’t you want Wetmachine to be a happy place?

  4. > Immersive and undifferentiated
    > cooperation is uncentralized and free
    > and is mathematically far more scalable.

    But our minds & memory don’t scale that way. Human memory is the “final frontier”. That’s why a critical understanding of the cognitive limits of human memory is essential for the Croquet interface and architecture of the Croquet environment. Because it’s integrated all the way down, legacy interfaces will simply not be enough.

  5. That’s part of the reason I suggest three superimposed planes that fill the entire screen interface, & alpha blended as needed:


    1. A grid plane allows for pigeon holes to place what we wish to remember and is presented more linearly, both by subject matter and chronologically. It’s more like a text world. This plane is also for describing the “rules” of the world, its “limits” of what it’s capable of, and the “intentions” for the objects that inhabit it.

    2. The second is a 2D graphics plane for more art & graphics, eToys, etc. Both this plane and the grid plane provide something of the “heads-up” console effect. Objects in the 2D plane also have various representations in the text/grid plane. The 2D plane is more like a personal interface than a document to be shared. The grid properties and/or the 3D plane describe what is shared.

    3. The third plane is the 3D plane, which is for presence and immersive simulation experiences.

    The environment runs with all three planes interlinked by references, representations, dynamic interactivity, alpha-blended views that show just what’s important at that moment, etc.

  6. By “grid properties” I mean “grid cell properties”.
