Integration with document-oriented applications

How do we integrate Croquet with the Web? How do we integrate with legacy applications in general?

We interact with computers now in a document model developed by Alan Kay’s Xerox PARC team a long time ago. (Xerox: The Document Company.) It is as if we have our head bent over our desktop, looking at a piece of paper. We slide other pieces of paper in and out below the face of our bowed head. In Croquet, Kay’s team today lets us lift our head up off the desk and look up at the world around us, including our coworkers. But just as the 3D world has paper within it, shouldn’t the Croquet world have document-based software within it? Yes!

One way we have been thinking to do this is to have flat panels within our immersive 3D Croquet. Each panel displays someone’s desktop and all the document-oriented applications available on it. Each simultaneous visitor to a Croquet space can see and interact with each panel (with permission). They can each click within it and type at it exactly as if it were their own computer screen. We’ve already built this in two ways.

In one version, the “desktop” is that provided by the Squeak desktop on which Croquet is built. This is relatively easy because it’s all the same system: the “external” system has close communication to Croquet, and is all right there on each of the machines in the Croquet collaboration. Because each member of a Croquet collaboration has Squeak, all the computation can be replicated on each machine and so we don’t have to transfer big changing bitmaps over the network. Squeak has a lot of tools for displaying text, Flash, MPEG movies, HTML pages, and other media.

In another version of a desktop, we have built an X Window server on the network, and some Croquet machine (or all of them) acts as its client display. The network X display data is sent to each Croquet machine, which puts it up on a flat panel within the 3D Croquet environment. (I’m skipping over the question of whether each Croquet machine acts as an independent X client of the server, or whether there is one Croquet super machine that acts as the X client but multiplexes all the data to and from the other Croquet clients.) Our plan is to change this to use VNC instead of X, so that the desktop host can be any machine running as a VNC server, which includes Windows, Unix, Linux, & Macintosh.
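To make the "one super machine" option a little more concrete, here is a minimal sketch in Python of the fan-out and fan-in it implies. Everything in it is hypothetical and for illustration only: `PanelMultiplexer`, `RectUpdate`, and `InputEvent` are invented names, not Croquet, X, or VNC APIs, and real screen updates would carry pixel data rather than just rectangle coordinates.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


# Hypothetical types for illustration; not part of Croquet, X, or VNC.
@dataclass(frozen=True)
class RectUpdate:
    """A changed rectangle of the remote desktop (pixel data omitted)."""
    x: int
    y: int
    width: int
    height: int


@dataclass(frozen=True)
class InputEvent:
    """A key press or pointer click from one participant."""
    user: str
    kind: str      # e.g. "key" or "click"
    payload: str


class PanelMultiplexer:
    """Holds the single connection to the remote desktop on behalf of everyone."""

    def __init__(self, send_to_desktop: Callable[[InputEvent], None]) -> None:
        self._send_to_desktop = send_to_desktop
        self._panels: Dict[str, List[RectUpdate]] = {}

    def join(self, user: str) -> None:
        """Register a participant's flat panel."""
        self._panels[user] = []

    def on_desktop_update(self, update: RectUpdate) -> None:
        # Fan out: every participant's panel receives the same update.
        for received in self._panels.values():
            received.append(update)

    def on_user_input(self, event: InputEvent) -> None:
        # Fan in: input from any joined participant goes to the one desktop.
        if event.user not in self._panels:
            raise KeyError(f"{event.user} has not joined this panel")
        self._send_to_desktop(event)

    def updates_for(self, user: str) -> List[RectUpdate]:
        """Return a copy of the updates delivered to one participant's panel."""
        return list(self._panels[user])
```

The alternative, in which each Croquet machine connects to the desktop host independently, would drop the multiplexer and leave the host to handle one connection per participant.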

The idea in each case is that whatever can be run on the desktop is available to all users of a given Croquet collaboration. They don’t need to have the software locally, but they can use it as though it were local. This is something like PC-anywhere, except that you get to do other stuff at the same time as accessing the foreign machine, and that a whole bunch of people get to access the foreign applications at the same time. Even though Word isn’t designed to be used collaboratively, all the people in the Croquet space in which Word is being displayed can click on the page and type. Everyone sees the real-time result of what users do. (No more e-mailing versions back and forth between people.) The same cooperative taking-turns rules apply as in the real world. It’s just a convenience that each machine has its own keyboard so users don’t have to pass one back and forth. That would be a pain if they’re physically co-located, and much harder if they’re on different continents. So if there is an MP3 player on some machine somewhere, and it has access to some song, then all the Croquet machines that have access to the first machine can hear the music simultaneously without anyone directly having either the song or the player.

I imagine that licensing and intellectual property folks will have some opinions on this.

Actually, I think both X and VNC deal with video bits, but not sound bits, which is odd. Well, we’ll figure it out.

Anyway, what Croquet gives you is the collaborative context.

If I want to create a multimedia experience on the Web, I have to use some program to arrange the different bits of pictures and video and text onto a series of HTML pages. I have to provide a way for people to get from one page (a grouping) to another. I will typically go through various cycles of visual and navigational design. I have to arrange for some server somewhere and copy the pages to it. If there are to be a lot of users, there are load issues to sort out. And if I want to involve other people in the process… yuck. Finally, each visitor to the site will see the experience alone, sharing it with no one.

By contrast, any Croquet user, without any training, can grab these desktop media-displaying panels and arrange them however they want in 3D. Colleagues can help – or hinder – in realtime, and talk to each other as they do so. When they’re done, they just let people into the space. No need to copy stuff around. The Croquet infrastructure will provide the scalable technical stuff. When people view the experience, they can see and interact with each other just like the people who created it. If given permission by the creators, they can annotate it, or stick, say, a transparent panel over the top of something and draw over it. In general, they can continue the collaborative creative process.

This isn’t a perfect integration of Croquet and the Web. The problems stem from the fact that Croquet is treating the foreign desktop as a black box. When annotating something done this way, for example, all you can really annotate is the window that holds the foreign desktop. You can’t reach inside and annotate the documents that are on that desktop. To be able to do that, we need to represent the documents and paragraphs and such directly in Croquet. OK. A quick & dirty way to do this is to create a separate flat panel for each resource we want to annotate. For example, if your machine has a Word and Excel file of interest, and you know of a Web page of interest, you can create three separate flat panels that access your machine’s desktop. In one you bring up the Word document, in another you bring up the spreadsheet, and in the third you open a browser and visit the Web page. (VNC servers vary. Depending on the capabilities of the VNC server on the foreign machine, you might be able to do this with one physical foreign machine, or you might need three.) Maybe using the Croquet security model, you revoke the ability of anyone else to make any changes to these panels other than, say, scrolling.
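As a rough illustration of "revoke everything except scrolling," the sketch below filters input by kind before it reaches the foreign desktop. It is not the Croquet security model, which this post does not describe in detail. `GuardedPanel`, `ALL_KINDS`, and the event kinds are hypothetical names, and the sketch assumes the panel can tell a scroll apart from a click or key press before forwarding it.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical event kinds; a real panel would have a richer vocabulary.
ALL_KINDS: FrozenSet[str] = frozenset({"scroll", "click", "key"})


@dataclass
class GuardedPanel:
    """A flat panel that forwards only the input kinds each user is allowed."""
    owner: str
    # What everyone else may do unless given a specific grant.
    default_allowed: FrozenSet[str] = frozenset({"scroll"})
    grants: Dict[str, FrozenSet[str]] = field(default_factory=dict)
    # Stand-in for the connection to the foreign desktop.
    forwarded: List[Tuple[str, str, str]] = field(default_factory=list)

    def allowed_for(self, user: str) -> FrozenSet[str]:
        """The owner may do anything; others get a grant or the default."""
        if user == self.owner:
            return ALL_KINDS
        return self.grants.get(user, self.default_allowed)

    def handle(self, user: str, kind: str, payload: str) -> bool:
        """Forward the event if permitted; return whether it was forwarded."""
        if kind not in self.allowed_for(user):
            return False
        self.forwarded.append((user, kind, payload))
        return True
```

For example, with `panel = GuardedPanel(owner="howard")`, a call to `panel.handle("carl", "key", "y")` is dropped while `panel.handle("carl", "scroll", "down")` goes through. Note that this only controls what passes through the panel; the application on the foreign desktop still sees one shared stream of input.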

OK, that’s pretty good, but suppose I want to work with the individual paragraphs within a document. I’m creating a component model in which people can create new Croquet content by simply grabbing the pieces they like (with permission) from other Croquet content in other spaces. The “pieces” they grab might be pictures, sound, text, textures, shapes, and even behaviors. Well, I wouldn’t expect to be able to grab behaviors from today’s Web, but I might like to grab some text or pictures. Yes, I could access these through a browser in a desktop flat panel display, and ask the foreign browser to save them to a file on the desktop machine. Then in Croquet I could import that file (using ftp or some such if the file is remote). But that’s three different models and related abstractions, which is inconvenient at best, and probably beyond the patience of most people who are just trying to get something done, not futz around with the damn computer. This totally goes against the direct manipulation feel that I’m going after with the component interface. It also forces people to deal with file names and other abstractions that I’ve argued against. Maybe the VNC or other foreign desktop mechanism can access a rich media cut and paste buffer on the host, and maybe Croquet can access the buffer from the flat panel display. Maybe. That still forces the user to manage more invisible abstract manipulation than I would like.

Another possibility is that we could create a modern rich media browser within Croquet. For example, we could start with Firefox or similar, but arrange for the rendering of elements to be as individual Croquet components. These could then be directly dragged around or annotated as desired. But I imagine that Croquet/Componentizing Firefox would be a lot of work. I’m hoping that I’m just worrying about a problem that doesn’t really need to be solved. Maybe the VNC and file system thing is good enough?

[Here are two examples I like in which the “engineer’s” solution was not necessary. 1. Programmers deal with strings of characters and in the past have assumed that searches need to distinguish case. But no one could come up with a good way to specify and control case sensitivity for Web searches that made sense to non-programmers. Google found that it was unnecessary. Searches are case insensitive and it doesn’t matter. 2. People who worked on distributed information systems in the ’80s thought that links needed to be bidirectional. You needed to see who linked to you, but maintaining such links in a distributed system was a huge engineering problem. In creating the WWW, Berners-Lee just worried about links from the page you controlled and didn’t care about links to you. Much simpler, and it’s fine.]

About Stearns

Howard Stearns works at High Fidelity, Inc., creating the metaverse. Mr. Stearns has a quarter century experience in systems engineering, applications consulting, and management of advanced software technologies. He was the technical lead of University of Wisconsin's Croquet project, an ambitious project convened by computing pioneer Alan Kay to transform collaboration through 3D graphics and real-time, persistent shared spaces. The CAD integration products Mr. Stearns created for expert system pioneer ICAD set the market standard through IPO and acquisition by Oracle. The embedded systems he wrote helped transform the industrial diamond market. In the early 2000s, Mr. Stearns was named Technology Strategist for Curl, the only startup founded by WWW pioneer Tim Berners-Lee. An expert on programming languages and operating systems, Mr. Stearns created the Eclipse commercial Common Lisp programming implementation. Mr. Stearns has two degrees from M.I.T., and has directed family businesses in early childhood education and publishing.


  1. In my neck of the woods a recent event of note was the announced acquisition of Macromedia by Adobe, which relates directly to your discussion here.

    I found this blog about Macromedia/Adobe that nicely mirrors yours:

  2. Carl tried to add the following comment and had technical difficulty:
    If I am creating a presentation using VNC and different sources (machines) for each part of the presentation, isn’t this going to prevent me from using the machine for anything else while it is serving the document in question? Or does VNC allow one to serve just one application instead of the entire desktop?

    Also, using Croquet, how do I set permissions to, as you mention, permit scrolling only? Isn’t that functionality under the control of the application being served by VNC?
    We can’t fix Windows for you. However, running VNC on a real operating system (like Unix) allows you to have multiple “sessions” that each have different desktops. This is what we do now in the Croquet Collaborative.

    I think of VNC as punching a hole in the fabric of Croquet to reach legacy stuff. In this case, the legacy application has its old security, whatever that might be, and VNC is multiplexing everyone’s input onto it. Effectively, everyone is sharing the same security for the remote desktop. But there is a way in which Croquet might be able to fix broken legacy stuff by magic. We’re working on ways to have users interact with stuff through an “interactor” that might be different (with different capabilities) for different users. This would allow us to control which users see things through the interactor, and which inputs are allowed to pass through. Since the VNC display and inputs pass through Croquet, that would allow us to put controls on it that VNC and the application itself do not. However, this interactor-based security is still just in the discussion stage.
