My last post referenced a movie of a “talk show” in Second Life, prompting John to ask about the relationship of avatar richness to the experience. I think there’s a simple trick that’s worth making explicit.
It’s true that these figures don’t seem very real. They act like zombies and watching them seems to induce a pathetic little trance-like behavior in me. The best high-speed single-user video games are much richer, and big-budget movies are much richer still. At the other end, there are now Flash-based “virtual worlds” with quite stilted graphics and behaviors. On this scale, Croquet and Qwaq Forums are roughly in the same region as Second Life, but within their range today’s SL is frankly richer than Forums.
Video games have long been able to produce an immersive effect with much cruder graphics and behaviors. I think a lot of it is very simply your interactive point of view. For example, studies have used mobile cameras that fed live video into goggles. When the camera was attached to the back of a person so that they saw their own head from behind, the participants reported an “out of body” sensation. So I think the whole “trick” is to have a visual field that contains the focus of your attention, and to put that visual field under the direct-manipulation control of the user.
The feeling of immersion is from the user-control of the visual field of interest, not from the richness of the visuals themselves.
As a result, you can get a feeling of immersion from the simple 2D graphics of a Nintendo GameBoy. By contrast, there’s a hell of a lot of visual stuff going on in a WebEx or Adobe Breeze meeting, but they still suck. There’s absolutely no feeling of immersion. I think the problem is that your visual interest is all over the screen, and none of it is paired with your control.
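To make that pairing concrete, here’s a minimal sketch in Python with pygame (my own illustration, not anything from the products discussed): a few flat circles stand in for whatever holds your visual interest, and dragging the mouse pans the visual field directly. The scene is deliberately GameBoy-crude, and the coordinates are arbitrary placeholders.

```python
# A toy sketch of the "trick": crude graphics, but the visual field
# is under the user's direct-manipulation control. Assumes pygame is
# installed; the landmark positions are arbitrary placeholders.
import pygame

pygame.init()
screen = pygame.display.set_mode((640, 480))
clock = pygame.time.Clock()

# Fixed "landmarks" in world coordinates -- stand-ins for avatars, slides, etc.
landmarks = [(100, 100), (300, 250), (550, 400), (800, 150)]
camera = [0, 0]  # the offset of the user's visual field into the world

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
        # Direct manipulation: dragging with the left button pans the view,
        # so visual interest and user control stay paired.
        elif event.type == pygame.MOUSEMOTION and event.buttons[0]:
            camera[0] -= event.rel[0]
            camera[1] -= event.rel[1]

    screen.fill((20, 20, 30))
    for (wx, wy) in landmarks:
        # World-to-screen transform: everything shifts with the camera.
        pygame.draw.circle(screen, (200, 180, 80),
                           (wx - camera[0], wy - camera[1]), 12)
    pygame.display.flip()
    clock.tick(60)

pygame.quit()
```

Even a toy like this feels different to drive than to watch, which is exactly the distinction I mean.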
In the case of the talking heads video, you’re watching a movie that you have no visual control over. The virtual camera pans around, but not by you. By contrast, I expect that the Second Life folks who were “in” the virtual audience did have a strong sense of immersion, even though they were looking at the same avatars that you see in the movie.
I think a visually fantastic film can be an engaging experience, even if it is on a small TV screen. You can be “immersed” in a radio story. But I think the sense of body immersion only comes when a great cinematographer moves the camera so that the bulk of your visual interest follows exactly as it would if you were present. When the camera doesn’t move the way my eyes want to, I find myself thinking about the movie as a movie, and the effect is lost. Movie theaters and big-screen home theaters make it possible for the cinematographer’s effect to tickle your brain in the right way. (I imagine that there’s also an audio component to this brain trick. Maybe for sighted people, the visual stimulation lights up a larger proportion of the brain?)
The difference between a meeting and a broadcast is that you interact with people in a meeting. Your attention had better shift to the people you are interacting with. There’s no opportunity to employ the visual trick with a telephone. iChat, NetMeeting, Skype and such all have visually rich live video feeds of users and can be quite engaging, but as long as they have segregated pictures with no fluid control, I believe it may be biologically impossible for them to induce any sense of actually being there. By contrast, you can put the same stupid PowerPoint in a virtual world, and with relatively simple graphics and audio you can have a very effective body-immersed meeting.
Hate to say it, but using PowerPoint is almost a guarantee that the content meets one of two conditions: worthless or already covered. If the idea can’t be defined in a paragraph, then a 30-slide PP won’t help. If it can be, then pass me the spreadsheet so I can see the financial projections. And I want to see the assumptions that went into the financials.
But hey thet’s just me.
Indeed, PowerPoint isn’t a great tool for drill-down, and examination-in-depth is important. I do not mean to say that sticking a PowerPoint in a virtual world is going to teach that pig to sing.
But a 7 or 15 slide show can be a useful guide for running a face-to-face meeting, regardless of whether those slides are created with red ochre and a cave wall, Nikon and Ektachrome, some outliner tool, or the evil PowerPoint. Simply putting that slide show on the Web makes it available, but that doesn’t make a meeting. Putting the slide show in WebEx or Breeze is, I would argue, a poor shadow of the experience of being in the physical room with the slide show and the people, and a meeting organized around such technology is at a disadvantage before it begins. I think there are tricks that can make it much more like being there, or even better than being there: representing people as people, having a high degree of direct-manipulation interactivity (for drill-down, for asynchronous exploration, for exploring annotations and linked concepts…), having high-fidelity sound, visuals, and gestures, and so on. I think it’s useful to separate out what one gains from each so that we can learn what it takes to employ them effectively, and to avoid chasing after irrelevant and maybe even counterproductive features. I think a visually-induced sense of body immersion is one of these tricks.
Howard,
I think your points are interesting. You’re probably right.
When I worked at Laszlo Systems, I did weekly status meetings and bug scrubs by call-in meeting, and the occasional code review using Breeze.
For the bug scrubs, everybody would be using their own local browser to peek into the same bug database, so we could read bug descriptions, etc. We used IRC (Internet Relay Chat) to pass links to each other. At first we used the telephone too, but as I recall we stopped doing that because we wanted to be open to anybody in the world who felt like contributing – it is an open source project, after all. My point is, the main focus of the meeting was in the bug database. That was the logical center.
For code reviews, Breeze + phone worked well enough. The person leading the review would highlight a portion of the code and discuss it. Other people would chime in, but you really didn’t need to switch your attention to them, since the code was what you cared about.
Now, for weekly status meetings, etc., we used a conference call, and that pretty much sucked. It’s boring; everybody gets distracted because everybody’s giving more than half their attention to some other thing during the meeting (e.g., reading Wetmachine…). For situations like that, Qwaq would be better, I’m sure.
Right. Lot’s of things can work well enough, just as you can enjoy a movie even if you don’t get a sense of actually being there. [In fact, it may be important – say, in a documentary – to have a “feeling” of objectivity so that you “think” there are no cheap tricks. But let’s save for another day the discussion of what it means for a human journalist to be objective.]
What tools and tricks you use should be your choice, for the situation at hand. But the cool thing about software technology – or at least, the idealized software technology of the future that I imagine – is that all the tools and tricks are equally available at the same “cost”. In other words, whether you choose to give your participants an immersive experience should be purely a user choice, not based on availability. This is different than for physical goods, and is based on the idea that it is (or should be) just as easy to make “good” (or capable) software as “bad” (or limited). Of course, we’re not there yet.
The other thing that ought to be possible (and is possible today) is to make use of various tools within a single context. Breeze and WebEx let you listen and text-chat and see a slide and maybe examine a spreadsheet within one tool. Croquet takes that a step further in two dimensions: 1) it lets you do arbitrary other things in context that weren’t necessarily explicitly incorporated by the programmers of the tool (although this is not yet universally true for all arbitrary things within Croquet), and 2) it locates these other things within a 3D setting that makes the relationships between them obvious and controllable, and leads to observable direct manipulation.
Anyway, with regard to having your focus on the highlighted code: I don’t believe you. Maybe you were in town or on-island and could not observe others. But do watch a conference room full of people gathered around a speakerphone, with supplementary material passed out on paper. I promise you that not everyone will be on the same page. I promise you that when someone on the other end of the phone speaks, at least half of the people will actually take their eyes off the page in front of them AND STARE AT THE SPEAKERPHONE. Even more odd, when someone in the room is talking specifically to someone on the other end of the phone, the speaker is likely to actually look at the speakerphone while they speak. We can’t help it. We’re wired that way. (Or maybe we’re just taught to do that. My younger children have the habit of not looking at me when talking to me, particularly when they’re getting in trouble. It also happens when they’re using the “future possible” tense.)
By the way, with respect to:
* identifying tricks separately so that you can assess their worth,
* locating things in-context spatially, and
* the idea that doing so aurally is part of a sense of being there…
Apple just filed for a patent on doing so for the iPhone: http://www.appleinsider.com/…_proposes_acoustic_separation_for_iphone_conference_calls.html