Howard Stearns works at High Fidelity, Inc., creating the metaverse. Mr. Stearns has a quarter century of experience in systems engineering, applications consulting, and management of advanced software technologies. He was the technical lead of University of Wisconsin's Croquet project, an ambitious project convened by computing pioneer Alan Kay to transform collaboration through 3D graphics and real-time, persistent shared spaces. The CAD integration products Mr. Stearns created for expert system pioneer ICAD set the market standard through IPO and acquisition by Oracle. The embedded systems he wrote helped transform the industrial diamond market. In the early 2000s, Mr. Stearns was named Technology Strategist for Curl, the only startup founded by WWW pioneer Tim Berners-Lee. An expert on programming languages and operating systems, Mr. Stearns created the Eclipse commercial Common Lisp programming implementation. Mr. Stearns has two degrees from M.I.T., and has directed family businesses in early childhood education and publishing.

Discovery Part 2: How

 “The Universe is made of stories, not of atoms.” -Muriel Rukeyser

Last time, I discussed why we offer suggested locations to teleport to, and the 5 W’s of the user interaction for suggestions. This time I’ll discuss how we do that, with some level of technical specificity.

Nouns and Stories

Each suggestion is a compact “story” of the form: User => Action => Object. Facebook calls these “User Stories” (perhaps after product management language), and linguists call them “SVO” (subject-verb-object) sentences. For example, Howard => snapshot => Playa, which we display as “Howard took a snapshot in Playa”. In this case Playa is a Place in-world. The Story also records the specific position/orientation where the picture was taken, and the picture itself. Each story has a unique page generated from this information, giving the picture, description, a link to the location in-world, and the usual buttons to share it on external social media. Because of the metadata on the page, the story will be displayed in external media with the picture and text, and clicking on the story within an external feed brings the reader to this page.

snapshot-howard-playa

The User is, of course, the creator of the story. The user has a page, too, which shows some of their user stories, and any picture and description they have chosen to share publicly. If allowed, there’s a link to go directly to that user, wherever they are.

user-card-howard

For our current snapshot and concurrency user stories, the Object is the public Place by which the user entered. More generally, it could be any User, Place, or (e.g., marketplace) Thing. These also get their own pages.

place-card-playa

The “feed” is then simply an in-world list of such Stories.
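As a rough illustration of the User => Action => Object shape, here is a minimal Python sketch. The class name, field names, and display template are assumptions made for this example, not our actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Story:
    """One User => Action => Object record (hypothetical field names)."""
    user: str
    action: str
    obj: str  # a Place here; more generally a User, Place, or Thing
    position: Optional[Tuple[float, float, float]] = None
    orientation: Optional[Tuple[float, float, float, float]] = None
    image_url: Optional[str] = None

    def describe(self) -> str:
        # Assumed display templates, keyed by action.
        templates = {"snapshot": "{user} took a snapshot in {obj}"}
        template = templates.get(self.action, "{user} {action} {obj}")
        return template.format(user=self.user, action=self.action, obj=self.obj)


# The feed is then just a list of such records.
feed = [Story(user="Howard", action="snapshot", obj="Playa")]
descriptions = [story.describe() for story in feed]
```

Here `descriptions` holds the display text for each Story, starting with “Howard took a snapshot in Playa”.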

Control

Just as anyone can put a computer on a network at an IP address or register a name with ICANN, a High Fidelity user may create places at an IP address, use a free temporary place name, or register a non-temporary name. Places are shown as suggestions to a user only when they are explicitly named places with no entry restrictions that match the user’s protocol version. (When we eventually have individual user feeds, we could consider a place to be shareable to a particular user if that logged-in user can enter, rather than only places with no restrictions.)

Snapshots are shown only when explicitly shared by a logged-in user, in a shareable place.

snapshot-review

Scale

At metaverse scale, there could be trillions of people, places, things, and stories about them. That’s tough to implement, and it produces a firehose of info that is tough for users to make use of. But right now there aren’t that many, and we don’t want to fracture our initial pioneer community into isolated feeds-of-one. So we are designing for scale, but building iteratively, initially using our existing database services and infrastructure. Let’s look first at the design for scale:

First, all the atoms of this universe – people, places, things, and stories – are each small bags of properties that are always looked up by a unique internal identifier. (The system needs to know that identifier, but users don’t.) We will be able to store them as “JSON documents” in a “big file system” or Distributed Hash Table. This means they can be individually read or written quickly and without “locking” other documents, even when there are trillions of such small “documents” spread over many machines (in a data center or even distributed on participating user machines). We don’t traverse or search through these documents. Instead, every reason we have for looking one up is covered by something else that directly has that document’s identifier.

(There are a few small exceptions to the idea that we don’t have to lock any document other than the one being looked up. For example, if we want to record that one user is “following” another, that has to be done quite carefully to ensure that a chain of people can all decide to follow the next at the same time.)

There are also lists of document identifiers that can be very long. For example, a global feed of all Stories would have to find each Story one or more at a time, in some order. (Think of an “infinitely scrollable” list of Stories.) One efficient way to do that is to have the requesting client grab a more manageable “page” of perhaps 100 identifiers, and then look up the documents for however many of those fit on the current display. As the user scrolls, more are looked up. When the user exhausts that set of identifiers, the next set is fetched. Thus such “long paged lists” can be implemented as a JSON document that contains an ordered array of a number of other document identifiers, plus the identifier for the next “page”. Again, each fetch just requires one more document retrieval, looked up directly by identifier. The global feed object is just a document that points to the identifier of the “page” that is currently first. Individual feeds, pre-filtered interest lists, and other features can be implemented as similar long paged lists.
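To make the “long paged list” idea concrete, here is a small Python sketch. A plain dictionary stands in for the Distributed Hash Table, and the identifier format, key names (`ids`, `next`, `first_page`), and helper functions are all assumptions for illustration only.

```python
import json

# Stand-in for a DHT or "big file system": identifier -> JSON text.
store = {}


def put(doc_id, doc):
    store[doc_id] = json.dumps(doc)


def get(doc_id):
    return json.loads(store[doc_id])


def build_feed(feed_id, story_ids, page_size=100):
    """Write pages back to front so each page can name its successor."""
    chunks = [story_ids[i:i + page_size]
              for i in range(0, len(story_ids), page_size)]
    next_page = None
    for index in reversed(range(len(chunks))):
        page_id = f"{feed_id}/page-{index}"
        put(page_id, {"ids": chunks[index], "next": next_page})
        next_page = page_id
    # The feed document only points at whichever page is currently first.
    put(feed_id, {"first_page": next_page})


def iter_story_ids(feed_id):
    """Yield identifiers in order, fetching one page document at a time."""
    page_id = get(feed_id)["first_page"]
    while page_id is not None:
        page = get(page_id)
        yield from page["ids"]
        page_id = page["next"]


build_feed("global", [f"story-{n}" for n in range(250)])
first_screen = [sid for _, sid in zip(range(10), iter_story_ids("global"))]
```

Because `iter_story_ids` is a generator, a client that only displays the first ten Stories only ever fetches the feed document and the first page.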

However, at current scale, we don’t need any of that yet. For the support of other aspects of High Fidelity, we currently have a conventional single-machine Rails Web server, connected to a conventional Postgres relational database. The Users, Places, and Stories are each represented as a data table, indexed by identifier.  The feed is a sorted query of Stories.

We expect to be able to go for quite some time with this setup, using conventional scaling techniques of bigger machines, distributed databases, and so forth. For example, we could go to individual feeds as soon as there are enough users for a global feed to be overwhelming, and enough of your online friends have High Fidelity that a personal feed is interesting. This can be done within the current architecture, and would allow a larger volume of Stories to be simultaneously added, retrieved, scored, and sorted quickly. Note, though, that we would really like all users to be offered suggestions, even when they choose to remain anonymous by not logging in, or don’t yet have enough experience to choose who or what to follow. Thus a global feed will still have to work.

Scoring

We don’t simply list each Story with the most recent ones first. If there’s a lot of activity someplace, we want to clearly show that up front without a lot of scrolling, or a lot of reading of place or user names. For example, a cluster of snapshots in the feed can often make it quite clear what kind of activity is happening, but we want the ordering mechanism to work across mixes of Stories that haven’t even been conceived of yet.

Our ordering doesn’t have to be perfect – there is no “Right Answer”. Our only duty here is to be interesting. We keep the list current by giving each Story a score, which decays over time. The feed shows Stories with the highest scores first. Because the scores decay over time, the feed will generally have newer items first, unless the story score started off quite high, or something bumped the score higher since creation. For example, if someone shares a Story in Facebook, we could bump up the score of the Story — although we don’t do that yet.

Although we don’t display ordered lists of Users or Places, we do keep scores for them. These scores are used in computing the scores of Stories.  For example, a snapshot has a higher initial score if it is taken in a high scoring Place, or by a high scoring User. This gives stories an effect like Google’s page ranking, in which pages with lots of links to them are listed before more obscure pages.

To keep it simple, each item only gets one score. While you and I might eventually have distinct feeds that list different sets of items, an item that appears in your list and my list still just has one score rather than a score-for-you and different score-for-me. (Again, we want this to work for billions of users on trillions of stories.)

To compute a time-decayed score, we store a score number and the timestamp at which it was last updated. When we read an individual score (e.g., from a Place or User in order to determine the initial score of a snapshot taken in that Place by that User), we update the score and timestamp. This fits our scaling goals because only a small finite number of scores are updated at a time. For example, when the score of a Place changes, we do not go back and update the scores of all the thousands or millions of Stories associated with that Place. The tricky part is in sorting the Stories by score, because sorting is very expensive on big sets of items. Eventually, when we maintain our “long paged lists” as described above, we will re-sort only the top few pages when a new Story is created. (It doesn’t really matter if a Story appears on several pages, and we can have the client filter out the small numbers of duplicates as a user scrolls to new pages of stories.) For now, though, in our Rails implementation, a new snapshot causes us to update the time-decayed score for each snapshot in order, starting from what was highest scoring. Once a story’s score falls below a certain threshold, we stop updating. Therefore, we’re only ever updating the scores of a few days’ worth of activity.
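The update-on-read idea can be sketched in a few lines of Python. This is not our Rails code: the class, the single three-day half-life, and the threshold value are assumptions chosen only to show the mechanism.

```python
from dataclasses import dataclass

# Assumed single half-life, for illustration only.
HALF_LIFE_SECONDS = 3 * 24 * 3600


@dataclass
class Scored:
    score: float
    updated_at: float  # seconds since some epoch

    def read(self, now: float) -> float:
        """Decay the stored score up to `now`, store it back, and return it."""
        elapsed = max(0.0, now - self.updated_at)
        self.score *= 0.5 ** (elapsed / HALF_LIFE_SECONDS)
        self.updated_at = now
        return self.score


def refresh_feed(stories, now, threshold=0.1):
    """Refresh from the highest stored score down, stopping at the threshold.

    Stories below the threshold keep their stale stored values untouched.
    """
    refreshed = []
    for story in sorted(stories, key=lambda s: s.score, reverse=True):
        if story.read(now) < threshold:
            break
        refreshed.append(story)
    refreshed.sort(key=lambda s: s.score, reverse=True)
    return refreshed
```

Only the item being read is written, so changing a Place score never touches the Stories that were created there.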

Here are our actual scoring rules at the time I write this. There’s every chance that the rules will be different by the time you read this, and like most crowd-curation sites on the Web, we don’t particularly plan to update the details publicly. But I do want to present this as a specific example of the kinds of things that affect the ordering.

  • We only show Stories in a score-ordered list. (The Feed.) However, we do score Users and Places, because their scores are used for snapshots. We do this based on the opt-outable “activity” reporting:
    • Moving in the last 10 seconds bumps the User’s score by 0.02.
    • Entering a place bumps the Place’s score by 0.2.
  • Snapshot Stories have an initial score that is the decayed average of the User and Place – but a minimum of 1.
  • Concurrency Stories get reset whenever anyone enters or leaves, to a value of nUsersRemaining/2 + 1.
  • All scores have a half-life of 3 days on the part of the score up to 2, and 2 hours for the portion over 2. Thus a flurry of activity might spike a user or place score for a few hours, and then settle into the “normal high” of 2.  This “anti-windup” behavior allows things to settle into normal pretty quickly, while still recognizing flash mob activity.
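One way to read these rules in code is sketched below. The function and constant names are assumptions, and splitting the score at 2 once per decay step is a simplification rather than a statement of how the server computes it.

```python
LOW_CAP = 2.0                 # the "normal high" from the rules above
LOW_HALF_LIFE_HOURS = 72.0    # 3 days, for the part of the score up to 2
HIGH_HALF_LIFE_HOURS = 2.0    # 2 hours, for the portion over 2
MOVE_BUMP = 0.02              # per 10 seconds of movement (User)
ENTER_BUMP = 0.2              # per entry (Place)


def decayed(score, hours):
    """Decay the part up to LOW_CAP slowly and the excess quickly."""
    low = min(score, LOW_CAP)
    high = max(score - LOW_CAP, 0.0)
    return (low * 0.5 ** (hours / LOW_HALF_LIFE_HOURS)
            + high * 0.5 ** (hours / HIGH_HALF_LIFE_HOURS))


def snapshot_initial_score(user_score, place_score):
    """Average of the (already decayed) User and Place scores, at least 1."""
    return max(1.0, (user_score + place_score) / 2.0)


def concurrency_score(n_users_remaining):
    """Reset value whenever anyone enters or leaves."""
    return n_users_remaining / 2.0 + 1.0


# A place at the normal high of 2 that 25 people then enter:
spiked = 2.0 + 25 * ENTER_BUMP
two_hours_later = decayed(spiked, 2)
```

Under these assumptions, `spiked` is 7 and `two_hours_later` comes out a little under 4.5, since the slow-decaying part has also lost a small amount.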

 

For example, under these rules, you need to move for about 3 minutes 20 seconds per day to keep your score nominally high (2.0). More activity will help the snapshots you create during the activity, but only for a while, and snapshots the next day will only have a nominally high effect.

As another example of current rules, an event with 25 people will bump a place score by 5:

  • If it started at 2, it will be back down to 4.5 in two hours, 2.5 in six hours, and back to 2 in 10 hours.
  • If it started at 0, it will be at 3.5 in two hours, and then roughly as above.

Search

We currently search on the client, filtering the 100 highest-scoring results that we receive from the server. Each typed word must appear exactly (except for case) within a word of the description or other metadata (such as the word ‘concurrency’ for a concurrency story). There is no autocorrect or autocomplete, and no pluralization or stemming. So, typing “stacks concurrency” will show only the concurrency story for the place named stacks. “howard.stearns snapshot” will show only snapshots taken by me.
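The matching rule is simple enough to sketch in Python. The function names and the `text` key (standing for a suggestion's description plus other metadata) are assumptions for this example; the real client is not written this way.

```python
def matches(query, searchable_text):
    """True if every typed word appears, ignoring case, inside some word."""
    words = searchable_text.lower().split()
    return all(any(term in word for word in words)
               for term in query.lower().split())


def filter_suggestions(query, suggestions, limit=100):
    """Filter only the top `limit` results, as received from the server."""
    return [s for s in suggestions[:limit] if matches(query, s["text"])]


suggestions = [
    {"text": "concurrency 5 people are hanging out in stacks"},
    {"text": "snapshot howard.stearns took a snapshot in stacks"},
    {"text": "concurrency 2 people are hanging out in playa"},
]
stacks_crowd = filter_suggestions("stacks concurrency", suggestions)
my_snapshots = filter_suggestions("howard.stearns snapshot", suggestions)
```

With this sample data, `stacks_crowd` contains only the first entry and `my_snapshots` only the second. Typing a partial word such as “stack” still matches, because the rule is containment within a word rather than whole-word equality.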

When the volume of data gets large enough, I expect we’ll add server-side searching, with tags.

Conclusion

We feel that by using the “wisdom of crowds” to score and order suggestions of what to do, we can:

  • Make it easy to find things you are interested in
  • Make it easy to share things you like
  • Allow you to affirm others’ activities and have yours affirmed
  • Connect people quickly
  • Create atomic assets that can be shared on various mediums

In doing so, we hope to create a better experience by bringing users to the great people and content that are already there, and encourage more great content development.

Discovery Part 1: The Issue

“What is there to do here?”

banner

High Fidelity is a VR platform.

It’s pretty clear how to market a video game. It’s a little bit harder to connect users to a new VR chat room, conferencing app, or Learning Management System. We’re not making any of these, but rather a platform on which such apps can be made by third party individuals and companies. Once someone has our browser-like “Interface” software, people can connect to any app or experience on our platform — if they know where to go.

The tech press is full of stories about The Chicken and Egg problem: adoption requires interesting content, but content development follows adoption. Verticals such as gaming make that problem a little more focused, but games still require massive up-front investment in technology, content, and marketing. We are instead betting on user-generated content in the early market and, as with the Internet in general, we expect user-generated content to be the big story in mainstream adoption as well. This makes it that much more important to hook users up with interesting people, places, and things in-world. The early Web used human-curated directories for news, financial info, games, and so forth, but we’re not quite sure what categories are going to have the most interesting initial experiences. And neither are our users!

We want an easy way for users to find interesting people, places, and things to explore, which doesn’t require High Fidelity Inc. to pick and decide what’s hot. We also want an easy way for creators to let others know about their content, without having to go through us nor a third party to market it.

Crowd Curation

One powerful model that has emerged for recognizing interest in user-generated content is crowd curation: a strong signal is produced by real user activity, and used to drive suggestions.

The signal can be explicit endorsement (likes, pins, tweets, and links) or implicit actions (achievements, or funnel actions such as visiting or building). The signals are weighted in favor of the users you value most: friends, strangers with lots of “karma”, or sites with lots of links to them.

There are various ways in which this information is then fed back to users. Facebook and Twitter provide a feed of interesting activity. Google offers suggestions as you type, and more suggestions after you press return. Amazon tells you at checkout that people who bought X also bought Y. However, the underlying crowd curation concept is roughly the same, and it has accelerated early growth (Twitter, Zynga), and ultimately provided enormous value to large communities and their users (Google, Facebook, eBay).

Of course, these systems don’t crawl High Fidelity virtual worlds, so we need to make our own crowd curation system, or find a way to expose aspects of our virtual world to the Web, or both. But more importantly, what do we want to share?

For Real

So, what should we suggest to our users? Ultimately, we want to suggest anything that will give a great experience: people to meet or catch up with, places to experience, and things from the marketplace to use wherever you are. But in these early days, your friends are not likely to have gear or to be online at any given moment. Places are under construction and without reputation. The marketplace is just forming.

Initially then, we’re starting with just two kinds of suggestions:

 

  1. Taking an in-world snapshot is something that any user can do from any place, and it puts participants onto the road to being content creators. The picture can be taken with the click of a button and requires no typing, which can be hard to do in an HMD. We automatically add the artist username and the place name as description. It often gives a pretty good idea of what’s happening, and we’ve arranged for clicking on the picture to transport you to the very place it was taken, facing the same way, so that you can experience it, too. Finally, it creates a nice visual artifact that you can take home and share outside of High Fidelity.

snapshot-card

 

  2. Even without necessarily knowing another High Fidelity user, it’s definitely more fun to go to places where people are. Even though we’re just in beta, there are always a few places that have people gathered, but they’re not always the same places. It’s hard for a person to know where to look. So we’re making suggestions out of public places, ordered by the number of people currently in them. No need for anyone to do or make anything on this one, as we pick up concurrency numbers automatically from those domains that share this info. (Anyone who makes a domain can control access to it.)

concurrency-card

 

These suggestions appear when you press the “Go To” button, which also brings up an address bar where you can directly enter a known person or place, or search for suggestions (just like a Web browser’s address bar). I can imagine someday offering information about related content in various situations, or a real time messaging and ticker widget for those who want to keep tabs on the latest happenings, but primarily we just want to allow people to “pull” suggestions when they are specifically looking for something to do.

In short, when a person presses the “Go To” button, they get a scrollable list of suggestions that give a visual sense of what has been happening recently, which offers people the chance to visit.

feed

Suggestions are available to both anonymous and logged in users: we don’t want to require a login to use High Fidelity. However, we would like to offer personalized feeds in the future based on your (optional) login. We also don’t share anything that you have not explicitly shared, and such sharing links to your (self-selected, non-real-world) username.

Sharing and searching are not restricted to our system. Every suggestion has a public Web page that can be shared on Facebook or Twitter, or (soon) searched on Google and other search engines. Clicking the picture or link on that page in a browser brings you to that same place in-world if you have Interface installed, just as if you had clicked on the suggestion within Interface. We feel this will make it easier for content creators to promote their places, snapshots, and eventually, marketplace items. We hope to create a “virtuous circle”, in which search and sharing brings people in through external networks that are much bigger than ours, introduces them to more content, and makes it easy for them to further make and share.

Does It Matter?

In the two weeks before we introduced an early form of this, a bit more than a third of our beta users were within 10 meters of another user in-world on any given day (excluding planned gatherings). Then we introduced a prototype of the concurrency suggestions (“N people are hanging out in some place name”), and over the next two weeks, nearly half of each day’s users were near another at some point in their day. Since then, we’ve done other things to increase average concurrency, and we’re now near 100%.

I don’t have good historical data on snapshots, and the new data is quite volatile. Our private alpha “random image thread” averaged a healthy five entries a day for more than two years, including entries with no pictures and entries with multiple pictures. Now, on days when something interesting is happening, we get 20 or 30 explicitly shared pictures, with most days generating three to eight.

Next: How we do that

Dude – Who brought the ‘script’s to the party?

party

This week, some of our early adopters got together for a party in virtual reality. One amazing thing is how High Fidelity mixes artist-created animations with sensor-driven head-and-hand motion. This is done automatically, without the user having to switch between modes. Notice the fellow in the skipper’s cap walking. His body, and particularly his legs, are being driven by an artist-created animation, which in turn is being automatically selected and paced to match either his real body’s physical translation, or game-controller-like inputs. Meanwhile, his head and arms are being driven by the relative position and rotation of his HMD and the controller sticks in his hands(*).

So, dancing is allowed.

But the system is also open to realtime customization by the participants. Some of the folks at the party spontaneously created drinks and hats and such during the party and brought them in-world for everyone to use. A speaker in the virtual room was held by a tiny fairy that lip-sync’d to the music. One person brought a script that altered gravity, allowing people to dance on the ceiling.

dancing


*: Alas, if the user is physically seated at a desk, they tend to hold their hand controllers out in front of them. You can see that with the purple-haired avatar.

Feeding Content

Our latest High Fidelity Beta release builds on June’s proof of concept, which suggested three visitable places above the address bar. Now we’re extending that with a snapshot feed. This should assist people in finding new and exciting content, and seeing what’s going on across public domains.
Just The Basics:

I. There is now a snapshot button in the toolbar: It works in HMD, and removes all HUD UI elements from the fixed aspect-ratio picture. If you are logged in to a shareable place, you also get an option to share the snapshot to a public feed. (Try doing View->Mirror and taking a selfie!)

snapshot-review

II. The “Go To” address bar now offers a scrollable set of suggestions that can be places or snapshots: The two buttons to the right of the address bar switch between the two sets, and typing filters them. Clicking on a place takes you to that named place, but clicking on a snapshot opens another window with more info. You can then visit the place that snapshot was taken by clicking on the picture, explore the other snapshots taken by that person or in that place, or share the picture to Facebook if you choose. If your friends follow your share to the picture on the Web, they can click on the picture to jump to the same place – if they have Interface installed.

feed

(None of this has anything to do with our old Alpha Forums picture feed, which isn’t public or scalable, nor are there changes to the old control-s behavior.)
Where We’re Headed:

There’s a lot more we can do with this, but we wanted to release what we have now and find out what’s important to you.

  1. We’re also thinking about other activity and media you might like to share and see in the feed, such as joining a group or downloading from marketplace.
  2. How might we use the “wisdom of crowds” to score and order the suggestions, based on real activity that people find useful?
  3. The community is quite small right now, and often your real world or social media friends do not have HMDs yet. So for now there’s just one shared public feed of snapshots. As we grow, we’ll be looking at scaling our infrastructure, and with it, more personalized sharing options.

As we move forward:

  • We don’t want to require a login to use High Fidelity or to enjoy the suggestions made by the feed. We do require a login to share, and we’d like to offer personalized feeds in the future based on your (optional) login.
  • We don’t want to require connecting your High Fidelity account to any social media, but we do want to allow you to do so.
  • We don’t want to share anything without you telling us that it is ok to do so.

Ben’s Social VR Adventure

being-of-light

There are lots of old and new VR sites with prognostications. This guest blog from a VCpreneur has four “requirements” for social VR, and it sparked some discussion at our office.

It reads to me very much like this investor fellow might be talking with some startup that features his four points, and he just wants to sort out whether the concepts are sticky. I’m imagining a company that uses Facebook for login, uploads your 2D and 3D albums and videos into your 3D space (which is zero-install via WebVR), lets you and your friends like/share from in-world, and folks that click on those shared things in Facebook get brought into your 3D space at the corresponding item. (I pitched such a thing five years ago as “Beyond My Wall”, using the Unity browser plugin instead of WebVR.)

One of the blogger’s “requirements” was that participants use their real-world identity, and this was what interested our gang, both for and against. I think this is a red herring. Although I use my real name online a lot, my gut is that an awful lot of VR is going to be about sex, and folks won’t want to use their real name. But overall, I don’t think it’s killer one way or the other. I’m guessing that he’s trying to turn the limitation of Facebook-login into a feature called real-world identity, and I think it’s a stretch. There are clearly a lot of things that went into making Facebook win over MySpace, and I’m not persuaded that real-world identity is the magic ingredient. Indeed, Twitter and his other example, YouTube, are both counterexamples. I think real-world identity can be a part of many potentially successful killer apps, but I don’t see it being required for all such killer apps. (I think verified identity, whether real-world or not, will be a great service for High Fidelity to offer, but it won’t be required for everything.)

I do think he’s on the right track, though, with his feature set including pre-established friend links and content-sharing. But I’m not sure the guy has really understood why those two things matter or what they are part of. They feed the virality of a lot of social media winners, but the magic is in the viral math, not specifically in the features. For example, “pre-established friends” is helpful, but not necessary for Twitter, YouTube, or eBay. I think that each one of a Facebook liked-story, Twitter hashtag, eBay auction, and YouTube video/discussion page forms a micro-community of interest with a specific “address” that can itself be shared out-of-band. Each user can and does “belong” to many such micro-communities. I believe that’s the real driver for virtuous-circle virality. High Fidelity is going to be great for this, because anyone can easily create any number of separate, persistent, domains of interest, and each one can have the computation shared by the people who are interested in it. (Our upcoming Beta makes it easy to organize one domain per user’s computer, which I think is a good initial model.) Nothing else I’ve seen (not even Beyond My Wall) can scale like that. This is so very different from social VR startups that offer even a large number of hosted chat rooms.

Of course, none of this is to say what is interesting about any of these domains. That is a separate – and more important – question that remains. The blog (and this response) are about what qualities are necessary, given something interesting, for it to catch on.

Separately, the comment from DiGiCT was interesting, that the huge numbers of Chinese HMDs sold are just a series of upgrades to very bad/cheap/unsatisfying units. I wonder if that’s true.

How To Use High Fidelity

no-cow-tipping

tl;dr

1. Install the “Sandbox” app. (We currently do not provide pre-built installers for Linux, but techies can build from open source for all platforms.) The “Sandbox” is the High Fidelity logo that appears in your system tray. (It is in the corner near where the wifi logo is on your screen, and may be behind the arrow if you have a lot of system tray apps. Sandbox starts automatically on Windows.)

2. Click on the Sandbox app and choose “Go Home” to visit your own personal domain. You can also visit another domain by clicking on a “hifi:” link that someone has given you on a Web page or email.

This is your user interface. While you use this, your actions are seen by others through your representation as an avatar.

3. Use the arrow keys to move, or the joysticks/touchpads on a hand controller. (You may need to turn on some controllers using the Avatar -> Input Devices menu.) Change between desktop display and Head Mounted Display (HMD) using the Display menu. (HMDs and controllers need to be plugged in and have their manufacturer’s software installed before starting the user interface.)

4. Most of the system behavior is defined by scripts that you can get from friends or from the Examples, or that are attached to the objects you find in the domains that you visit. Some initial behavior that you get “out of the box” includes:

  • The Go To button, giving you a directory of places.
  • The Examples button, giving you different avatars to choose from, or things to add to domains that allow it (like your own Home domain).
  • The Edit button, which lets you edit any object in any domain that lets you.
  • Pressing the “return” key gives you an address bar to specify another place or user to visit.
  • The microphone is on! Use a headset.

If you’ve got 10 people sharing wifi, or have a Comcast connection that turns into molasses on a Friday night, things might be a bit slow when your network sucks. This is especially true when visiting a new domain for the first time. Also, some domains are faster than others. If things don’t feel right, try again later, or see “Stats”, below.

The Bird Is the Word!

Believe it or not, there’s some great engineering and sportsmanship behind this:


bird-is-the-word

We’re trying to create low-latency, high-fidelity, large-scale multi-user experiences with high-frequency sensors and displays. It’s at the edge of what’s possible. The high-end Head Mounted Displays that are coming out are pretty good, but even they have been dicey so far. The hand controllers have truly sucked, and we’ve been basing everything on the faith that they will get better. But even our optimism had waned on the optical recognition of the LeapMotion hand sensor. We made it work as well as we could, and then left it to bit-rot.

But yesterday LeapMotion came out with a new version of their API library, compatible with the existing hardware. Brad is the engineer shown above, and no one knows more than him how hard it is to make this stuff work. He was so sure that the new software would not “just work” that he offered a bottle of Macallan 18-year-old scotch to anyone who could make it do so. Like the hardworking bee that doesn’t know it can’t possibly fly, our community leader, Chris, is not an engineer and just hooked it up.

chris-baring-magic-leap

In just minutes he made this video to show Brad, who works from a different office.

True to his word, Brad immediately went online and ordered the scotch, to be sent to Chris. Brad then dug out his old Leap hardware from the drawer next to his CueCat and made the more articulate version above.

We sent it to a few folks, including Stanford’s VR lab, which promptly tweeted it, with the caption, “One small step for Mankind? Today we saw 1st networked avatar with fingers”.

So now we have avatars with full-body IK driven by 18-degree-of-freedom sensors, plus optical tracking of each finger, facial features and eye gaze, all networked in real time to scores of simultaneous users, with physics. In truth, we still have a lot of work to do before anyone can just plug this stuff in and have it work, but it’s pretty clear now that this is going to happen!

One more dig: Apple has long said that they can only make things work at the leading edge by making the hardware and software together, non-interoperable to anyone else. Oculus has said that networked physics is too hard, and open platforms are too hard. Apple and Oculus make really great stuff that we love. We make only open source software, and we work with all the hardware, on Windows, Mac, and Linux.

Makers’ Mash-Up

As the nascent VR industry gears up for The Year of VR, the press and pundits are wrestling with how things will break out. There are several Head Mounted Display manufacturers that will release their first products early this year, and they are initially positioning them as extensions of the established games market. The idea is that manufacturers need new content for people to play on their boxes, and game studios need new gizmos in which to establish markets for their content. The Oculus will initially ship with a traditional game controller. The Vive will provide hand sensor wands that allow finer manipulation. They’re both thinking in terms of studio-produced games.

The studio/manufacturer model is well understood and huge — bigger than the motion picture industry. The pundits are applying that framework as they wonder about the chicken-and-egg problem of content and market both requiring each other to come first. Most discussion takes for granted a belief that the hardware product market enables and requires a studio to invest in lengthy development of story, art, and behavior, followed by release and sale to individuals.

But I wonder how quickly we will move beyond the studio/manufacturer model.

I’m imagining a makers’ mash-up in which people spontaneously create their own games all the time…

  • a place where people could wield their Minecraft hammers in one hand, and their Fruit Ninja swords in the other.
  • a place that would allow people to teleport from sandbox to sandbox, and bring items and behaviors from one to another.
  • a place where people make memories by interacting with the amazing people they meet.

I think there’s good reason to believe this will happen as soon as the technology will enable it.

Second Life is an existence proof that this can work. Launched more than a dozen years ago, its roughly 1M monthly users have generated several billion dollars of user-created virtual goods. I think SL’s growth is maxed out on its ancient architecture, but how long will it take any one of the VR hardware/game economies to reach that scale?

Ronald Coase’s Nobel-prize-winning work on the economics of the firm says, loosely, that companies form and grow when growing reduces their transaction costs. If people can freely combine costume, set, props, music, and behaviors, and are happy with the result, the economic driver for the studio system disappears.

I think the mash-up market will explode when people can easily and inexpensively create items that they can offer for free or for reputation. We’ve seen this with the early Internet, Web, and mobile content, and offline from Freecycle to Burning Man.

High Fidelity’s technical foundation is pretty close to making this happen at a self-sustaining scale. There are many design choices that tend to promote or restrict this, and I’ve described some in the “Interdimensional conflicts” section at the end of “Where We Fit”. Some of the key architectural aspects for a makers’ mash-up are multi-user, fine-manipulation, automatic full body animation, scriptable objects that can interact with a set of common physics for all objects, teleporting to new places with the same avatar and objects, and scalability that can respond to unplanned loads.

Where do we fit?

computer-dimensions

There are many ways by which to categorize work in Virtual Reality: by feature set, market, etc. Here are some of the dimensions by which I view the fun in VR, and where High Fidelity fits in.  To a first-order approximation, these axes are independent of each other. It gets more interesting to see how they are related, but, first the basics.

Scope of Worlds: How do you count VRs?

As I look at existing and developing products for virtual worlds, I see different kinds of cyberspaces. They can be designed around one conceptual place with consistent behavior, as exemplified by social environments like Second Life or large-scale MMORPGs like World of Warcraft. Or they can replicate a single small identical meeting space or game environment, or some finite set of company-defined spaces. Like my previous work with Croquet, High Fidelity is a “metaverse” — a network in which anyone can put up a server and define their own space with their own behavior, much like a Web site. People can go from one space to another (like links on the Web), taking their identity with them. Each domain can handle miles of virtual territory and be served from one or more computers depending on load. Domains could be “side by side” to form larger contiguous spaces, although no one has done so yet.

Scope of Market: How do you sell VRs?

Many see early consumer VR as a sustaining technology for games, with the usual game categories of FPS, platformers, builders, etc. The general idea is that gaming is a large market that companies know how to reach. Successful games create a feeling of immersion for their devotees, and VR will deepen that experience.

An emerging area is in providing experiences, such as a concert, trip, or you-are-there story.

Others see immersion and direct-manipulation UI as providing unique opportunities for teaching, learning, and training, or for meetings and events. (The latter might be for playing or for working.)

Some make tools and platforms with which developers can create such products.

High Fidelity makes an open source platform, and will host services for goods, discovery, and identity.

Scope of Technology: How do you make and populate VRs?

Software:

By now it looks like most development environments provide game-like 3D graphics, some form of scripting (writing behavior as programs), and built-in physics (objects fall and bounce and block other objects). Some (including High Fidelity and even some self-contained packaged games) let users add and edit objects and behaviors while within the world itself. Often these can then be saved and shared with other users.

A major technical differentiator is whether or not the environments are for one person or many, and how they can interact. For example, some allow small numbers of people on the Internet to see each other in the same “space” and talk by voice, but without physical interaction between their avatars. Others allow one user to, e.g., throw an object to a single other user, but only on a local network in the same physical building. High Fidelity already allows scores of users to interact over the Internet, with voice and physics.

Hardware:

Even low-end Head Mounted Displays can tell which way your head is turned, and the apps rotate your point of view to match. Often these use your existing phone, and the incremental cost beyond that is below $100. However, they don’t update fast enough to keep the image aligned with your head as it moves, resulting in nausea for most folks. The screen is divided in two to give each eye a separate viewpoint, giving a 3-dimensional result, but at only half the resolution of your phone. High-end HMDs have more resolution, a refresh rate of at least 75Hz, and optical tracking of your head position (in addition to rotation), as you move around a small physical space. Currently, high-end HMDs connect to a computer with clunky wires (because wireless isn’t fast enough), and are expected to cost several hundred dollars. High Fidelity works with conventional 2D computer displays, 3D computer displays, and head-mounted displays, but we’re really focusing the experience on high-end HMDs. We’re open source, and work with the major HMD protocols.
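To put rough numbers on the refresh rate and split-screen points, here is a small back-of-the-envelope sketch. The function names and the 1920x1080 phone screen are assumptions made up for illustration; only the 75Hz figure and the "divided in two" layout come from the text above.

```python
def frame_budget_ms(refresh_hz: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / refresh_hz


def per_eye_resolution(width_px: int, height_px: int) -> tuple[int, int]:
    """Split a landscape screen down the middle: one half per eye."""
    return width_px // 2, height_px


# 75Hz is the minimum refresh rate mentioned for high-end HMDs.
print(round(frame_budget_ms(75), 1))   # 13.3

# Hypothetical phone screen, used only as an example.
print(per_eye_resolution(1920, 1080))  # (960, 1080)
```

So at 75Hz, everything (tracking, physics, rendering both eyes) has to fit in about 13 milliseconds per frame, and each eye sees only half of the phone's horizontal pixels.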

A mouse provides just 2 degrees of free motion: x and y, plus the buttons on the mouse and computer. Game controllers usually have two x-y controllers and several buttons. The state of the art for VR is high frequency sensors that track either hand controllers in 3-position + 3-rotation Degrees Of Freedom (for each hand), or each individual finger. Currently, the finger sensors are not very precise. High Fidelity works with all of these, but see the next section.
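The degree-of-freedom counts are easy to tally. The sketch below is only bookkeeping, with made-up names, and it assumes a high-end HMD contributes position plus rotation (6 DoF), as described above.

```python
# Degrees of freedom per input rig, counting axes only (buttons ignored).
TRACKED_6DOF = 6  # 3 position + 3 rotation

RIGS = {
    "mouse": {"pointer": 2},                              # x and y
    "game controller": {"left stick": 2, "right stick": 2},
    "hands + HMD": {
        "left hand": TRACKED_6DOF,
        "right hand": TRACKED_6DOF,
        "head": TRACKED_6DOF,                             # high-end HMD
    },
}


def total_dof(rig: dict[str, int]) -> int:
    """Sum the independent axes a rig reports."""
    return sum(rig.values())


for name, rig in RIGS.items():
    print(name, total_dof(rig))
```

This prints 2 for the mouse, 4 for the game controller, and 18 for two hand controllers plus a head-tracked HMD.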

Capture:

One way to populate a VR scene is to construct 3D models and animated characters using the modeling tools that have been developed for Computer Aided Design and Computer Generated movies. This provides a lot of control, but requires specialized skills and talent.

There are now pretty good camera/software apps for capturing real scenes in 3D and bringing them into VR as set-dressing. Some use multiple cameras to capture all directions at once in a moving scene, to produce what is called “cinematic VR” experiences. Others sweep a single camera around a still scene and stitch together the result as a 3D model. There are related tools being developed for capturing people.

Scope of Object Scale: What do you do in VRs?

The controllers used by most games work best for coarse motion control of an avatar — you can move through a building, between platforms, or guide a vehicle. It is very difficult to manipulate small objects. Outside of games, this building-scale scope is well-suited to see-but-don’t-touch virtual museums.

In the early virtual world Second Life, users can and do construct elaborate buildings, vehicles, furniture, tools, and other artifacts, but it is very difficult to do so with a mouse and keyboard. At High Fidelity, we find that by using two 6-DoF high-resolution/high-frequency hand controllers, manipulating more human-scaled objects becomes a delight. Oculus’ “toy box” demonstration and HTC’s upcoming Modbox show the kinds of things that can be done.

Interdimensional conflicts:

These different axes play against each other in some subtle ways. For example:

  • Low-end HMD would seem to be an obvious extension of the game market, but the resolution and stereo make some game graphics techniques worse in VR than on desktop or consoles. The typical game emphasis on avatar movement may accentuate nausea.
  • High-end hand controllers make the physics of small objects a delightful first-person experience, but it’s profoundly better still when you can play Jenga with someone else on the Internet, which depends on software design limitations. (See October ’15 video at 47 seconds.)
  • Game controllers provide only enough input to show avatar motion using artist-produced animations. But 18-DoF (in two-hand-controllers plus high-end HMD) provide enough info to compute realistic skeletal motions of the whole body, even producing enough info to infer the motion of the hips and knees that do not have sensors on them. (See October ’15 video at 35 seconds.) This is called whole-body-Inverse-Kinematics. High Fidelity can smoothly combine artist-animation (for walking with a hand-controller joystick), with whole-body-IK (for waving with your hand controller while you are walking).
  • These new capabilities allow for things that people just couldn’t do at all before, as well as simplifying things that they could not do easily. This opens up uncharted territory and untested markets.
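To make the animation-plus-IK item in the list above a little more concrete, here is a toy sketch of one way to combine two poses joint by joint. It is not High Fidelity's implementation: the joint names, the angles, and the `blend_poses` helper are all hypothetical, and each joint is reduced to a single angle in degrees, where a real system would blend full 3D rotations.

```python
def blend_poses(animated, ik, ik_weight):
    """Per-joint linear blend of two poses.

    A weight of 0 keeps the artist animation for that joint,
    a weight of 1 keeps the IK result, and values in between mix them.
    Joints with no weight listed default to pure animation.
    """
    blended = {}
    for joint, anim_angle in animated.items():
        w = ik_weight.get(joint, 0.0)
        ik_angle = ik.get(joint, anim_angle)
        blended[joint] = (1.0 - w) * anim_angle + w * ik_angle
    return blended


# Hypothetical single-angle poses (degrees), for illustration only.
walk_cycle = {"hip": 20.0, "knee": 35.0, "shoulder": 5.0, "elbow": 10.0}
ik_solution = {"hip": 0.0, "knee": 0.0, "shoulder": 80.0, "elbow": 45.0}

# Legs follow the walk animation; the arm follows the hand controller.
weights = {"shoulder": 1.0, "elbow": 1.0}

pose = blend_poses(walk_cycle, ik_solution, weights)
print(pose)
```

With these weights the legs keep walking while the arm waves. Ramping a weight from 0 to 1 over a few frames is one simple way to avoid a visible pop when the hand controller takes over.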

Notice that all the mentioned interactions depend on The Scope of Object Scale (above), which isn’t often discussed in the media. It will be interesting to see how the different dimensions of VR play against each other to produce a consistent experience that really works for people.


 

A word about Augmented Reality: I think augmented reality — in which some form of VR is layered over our real physical experience — will be much bigger than VR itself. However, I feel that there’s so much still to be worked out for VR along the above dimensions and more, that it becomes difficult to make testable statements about AR. Stay tuned…

The Big Show

We do a lot of live demos. Our VR software is in alpha and we sometimes run into bugs, but we pride ourselves on showing the true state of things instead of slides and videos. It’s often a build with radical changes created just a few moments before. And yet the most common trouble is the unpredictable network connectivity one finds at demo venues. Our folks have done quite a few tethered to a cell phone.

But this demo from Sony is in a whole other class. Ouch. I’m sure we’ll all have great hand controllers within a year or so, but right now, hand controller hardware really sucks.

By the way, I think their IK looks great, and I especially like the jump. But notice that the avatars are stuck on a pedestal instead of moving around. I haven’t seen anyone combine IK with artist-designed animations the way we have.