How do distributed systems work?

How do you coordinate activity across a network? People are doing this all the time, with varying degrees of success. But how is it supposed to work? What is the model to be followed? When I graduated in the mid-eighties, “Distributed Systems” was still a graduate specialty subject, not a pervasive guiding principle. Today, people like myself don’t seem to have a common ontology of approaches. Well, it’s about time.

The trivial solution is always valid. Don’t. “Web applications” aren’t really distributed at all. In 3-tier architecture terms, each client/Web-server talks only to any one of a number of individual monolithic application servers that also don’t coordinate with each other. Instead, these application servers change all shared data through a single database “back end” running on big-iron. The database provides mechanisms for transaction processing, roll-back, two-phase commit and such (which are frequently not utilized by the application software). The idea with this architecture is that everybody does what they want individually, but that global state is maintained not in a distributed way, but by a single application-global database.

What happens with “massively” multiplayer online role-playing games? I ought to know, but I can’t find a clear explanation in terms that I can understand.

I imagine that one approach is to run the simulation on the server, and send the clients a stream of authoritative information about where things are now. This is similar to the 3-tier Enterprise Software approach, in that it isn’t really distributed.

So what? Well, one issue common to all central-computation architectures is that they are limited by the performance and reliability of that server.

One problem you might see for a shared-simulation would be lag. You do something and it has to go to the server, wait for new state to be computed and serialized, and then sent back. That might be ok for key-clicks or button presses, but not for mouse movement. To avoid that, one trick is to just not support tight interactions through the mouse. Just define the UI so that only result activities (like mouse up) count. I’ll bet there are a lot of shared environments that are limited this way but are done so well that people don’t even notice. That’s cool. But if you want to start thinking about incorporating other applications into such a model — voice and video, mouse-over, pointer-position, drag, other devices — you have to think about how to handle these separately, outside the model of what’s been built in such systems. For example, maybe a mouse drag is shown to you locally, but everyone else only sees the end-points.
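To make the "only result activities count" idea concrete, here is a minimal sketch in Python. It is not taken from any particular system; the names `MouseEvent` and `events_to_share` are hypothetical, invented for illustration. The person dragging sees every move locally, while only the end-points would be sent to everyone else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MouseEvent:
    # Hypothetical event record, invented for this sketch.
    kind: str  # "down", "move", or "up"
    x: int
    y: int


def events_to_share(events):
    """Keep only the events that count as results: the drag's end-points."""
    return [e for e in events if e.kind in ("down", "up")]


drag = [
    MouseEvent("down", 0, 0),
    MouseEvent("move", 3, 1),
    MouseEvent("move", 7, 4),
    MouseEvent("up", 10, 5),
]

local_view = drag                    # the person dragging sees every move
shared_view = events_to_share(drag)  # everyone else sees only the end-points
```

The point of the design is that the number of messages no longer depends on how far or how fast the mouse moves, only on how many completed gestures there are.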

I bet there are hacks where people compute stuff locally and then “correct” it with data from the server, but I imagine that would be a nightmare to program. You could do it in one special case and another, but at what point do things get too hard to manage? (Once upon a time, programmers managed their own call stack before they trusted the compiler to do a better job on programs more complicated than they could achieve with ad hoc stack management. Eventually they also accepted that automatic memory management could handle more complicated usage than their own ad hoc heap management. Managing distributed state is the same thing.) Also, if you special-case the local stuff and sync with the global, then your local simulation would be “ahead” of the stuff coming back from the server, so you have problems with determining when something happened. (In a game you shoot at someone who has moved.) You could try to timestamp the messages going from the client to the server, but there’s no good way to accurately coordinate millisecond clock time over a network. (There are situations where you can get away with restricting all updates to be “absolute” state (e.g., a position) rather than “relative” (e.g., a delta).) Another problem is load. It is expensive to compute the results and serialize them to send to each client. Having everybody do everything on a server creates a big bottleneck. How big a server do you need for “everybody in the world”?
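One reason the “absolute” versus “relative” restriction can help is that an absolute update is harmless if it gets applied twice, while a delta is not. The sketch below is only an illustration of that one property, under the assumption that the network might redeliver the last message; the function names and message shapes are made up for the example.

```python
def apply_absolute(state, msg):
    """Absolute update: the message carries the new value itself."""
    return msg["x"]


def apply_relative(state, msg):
    """Relative update: the message carries a delta to add to the old value."""
    return state + msg["dx"]


def replay(apply, start, messages):
    """Apply each message in order, starting from the given state."""
    state = start
    for m in messages:
        state = apply(state, m)
    return state


absolute = [{"x": 5}, {"x": 8}]
relative = [{"dx": 5}, {"dx": 3}]

# Assumption for the sketch: the last message is delivered a second time.
abs_once = replay(apply_absolute, 0, absolute)
abs_dup = replay(apply_absolute, 0, absolute + absolute[-1:])
rel_once = replay(apply_relative, 0, relative)
rel_dup = replay(apply_relative, 0, relative + relative[-1:])

print(abs_once, abs_dup)  # same position either way
print(rel_once, rel_dup)  # the duplicated delta moves the position too far
```

This says nothing about the harder problem of deciding when something happened; it only shows why position-style updates are more forgiving than deltas.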

An alternative is to run everything locally and then try to let everyone know about everyone else’s results. But:

  • I don’t know how you would coordinate millisecond time over the net.
  • I don’t know how you would resolve conflicts. (There’s a distributed algorithm for doing so called Paxos, but it is very complicated and takes a few network round-trips.)
  • It seems like it would be very difficult to come up with a different ad hoc definition of what state is involved for each kind of object, and processing a delta of that versus what someone else has. I think this is what uni-verse tries to do, but I don’t see how it can actually work.

Croquet’s model makes sense to me, but that could be simply because it’s the only approach that I’ve taken the time to try to understand. It’s naive to think it’s the only thing in the world that actually works, but I’m also old enough to believe that there can be a lot of stuff out there that doesn’t actually work when you push on it. Or is already at the limits of what can be done. Anyway, I learned Croquet’s model from the manual and its predecessors. But sometimes it’s helpful to get a different explanation. Here goes. First, there’s my daughter’s nutshell description. Read it.

More specifically, Croquet runs every simulation locally. Everyone’s executing on a computer, so if you give them all the same inputs at the same time, then you get the same results. This uses only local processing. The only thing that travels over the wire is the inputs: You type a key. I move my mouse. The other guy clicks. The tricks are:

  1. How do you have the machines produce the same result on different processors and operating systems? We run a fast Virtual Machine and do everything important within that so that we get the same answers. (There are a couple of things, like rendering, that don’t affect the results, so these can be done in hardware. In fact, we ship all the rendering to the graphics processor on the video card.)
  2. How do you arrange for everything to happen at the same “time” so that things stay deterministically in sync? In Croquet, each simulation has one router designated on the network. All inputs are sent to the router, and never directly to the simulation running on the machine on which the input is made. The router puts its own timestamp on the message and sends it out to everyone. That’s all the router does. No computation. No decoding. The router is the sole source of time. Everyone then processes the input messages based on the message timestamp. If nothing else is happening, the router also sends a heartbeat to move time along. Everything in the simulation – including things falling or clouds moving – is computed based on the router’s timestamp, not the local machine clock.
  3. How do you get each machine to start with the right state? Each user gets a snapshot of memory when they join. Same inputs at the same times, starting from the same initial conditions, gives the same results. This snapshot trick works because of the other two tricks:

    • We can take a snapshot of memory that works on each user platform because it’s all running on identical VMs. This snapshot can be supplied by any of the connected machines, selected at random. At the Collaborative, we always leave one machine running for this purpose. (By the way, we don’t take a snapshot of the whole VM, but just that part which is associated with the particular simulation. A user can be running any number of simulations simultaneously, and they each effectively have their own separate memory. This separation allows me to be using simulations A, B, and D, while you go in and out of A and C separately.)
    • Remember that the flow of time is defined entirely by the router’s timestamp of messages. Time on the simulation stops in between messages, and the heartbeat messages are used to keep things moving. Thus Croquet’s simulation-time is only loosely coupled to wall-clock time. At computer speeds, it’s close enough for people not to notice, but the fact that it is allowed to vary non-linearly with wall-clock time is what allows the simulations to stay correct. When we connect, we immediately start getting the same series of timestamped messages that everyone else gets, but we don’t execute them yet because we don’t have a snapshot to run them in. When we do get the snapshot, we throw away all the messages before the snapshot’s timestamp, and immediately execute our queue of saved messages that follow. Now we’re just like everyone else.
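The three tricks above can be sketched as a toy in Python. This is not Croquet's code or API. The `Router` and `Replica` classes, the message shape, and the falling-object state are all invented for illustration. The sketch assumes integer arithmetic stands in for the identical VM, that every router timestamp is unique, and that a queued message is discarded when its timestamp is at or before the snapshot's.

```python
import copy


class Router:
    """Stamps each input with its own clock and fans it out. No computation."""

    def __init__(self):
        self.now = 0
        self.replicas = []

    def connect(self, replica):
        self.replicas.append(replica)

    def send(self, payload, advance=1):
        self.now += advance            # the router is the sole source of time
        msg = (self.now, payload)
        for r in self.replicas:
            r.receive(msg)

    def heartbeat(self, advance=1):
        self.send(None, advance)       # no input; just moves time along


class Replica:
    """Runs the whole simulation locally, driven only by router messages."""

    def __init__(self):
        self.state = None              # no snapshot yet
        self.pending = []              # messages saved until a snapshot arrives

    def receive(self, msg):
        if self.state is None:
            self.pending.append(msg)
        else:
            self._execute(msg)

    def _execute(self, msg):
        stamp, payload = msg
        elapsed = stamp - self.state["time"]
        # Things fall according to router time, not the local clock.
        self.state["height"] -= elapsed * self.state["rate"]
        self.state["time"] = stamp
        if payload is not None:        # an input: here, a new fall rate
            self.state["rate"] = payload

    def snapshot(self):
        return copy.deepcopy(self.state)

    def load_snapshot(self, snap):
        self.state = copy.deepcopy(snap)
        queued, self.pending = self.pending, []
        for msg in queued:
            if msg[0] > self.state["time"]:   # drop what the snapshot covers
                self._execute(msg)


router = Router()
a = Replica()
a.load_snapshot({"time": 0, "height": 100, "rate": 1})
router.connect(a)
router.send(2)                 # someone's input: fall rate becomes 2

b = Replica()                  # late joiner: connected, but no snapshot yet
router.connect(b)
router.heartbeat()             # b queues this; a executes it
snap = a.snapshot()            # any connected machine could supply this
router.send(5)                 # arrives at b before the snapshot does
b.load_snapshot(snap)          # b drops the old message, replays the new one
router.heartbeat(advance=2)

print(a.state == b.state)
```

Note that the router never looks inside a payload, and that the late joiner ends up in the same state as everyone else only because it replays exactly the messages that follow the snapshot's timestamp.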

About Stearns

Howard Stearns works at High Fidelity, Inc., creating the metaverse. Mr. Stearns has a quarter century experience in systems engineering, applications consulting, and management of advanced software technologies. He was the technical lead of University of Wisconsin's Croquet project, an ambitious project convened by computing pioneer Alan Kay to transform collaboration through 3D graphics and real-time, persistent shared spaces. The CAD integration products Mr. Stearns created for expert system pioneer ICAD set the market standard through IPO and acquisition by Oracle. The embedded systems he wrote helped transform the industrial diamond market. In the early 2000s, Mr. Stearns was named Technology Strategist for Curl, the only startup founded by WWW pioneer Tim Berners-Lee. An expert on programming languages and operating systems, Mr. Stearns created the Eclipse commercial Common Lisp programming implementation. Mr. Stearns has two degrees from M.I.T., and has directed family businesses in early childhood education and publishing.

2 Comments

  1. Although MMOG oriented, I’ve found “Networking Multiplayer Games” (http://ai.eecs.umich.edu/so…) helpful. It illustrates and compares various distributed models including P2P (though not Croquet) and Second Life.

  2. There’s a “part 2” of this discussion at http://wetmachine.com/i