Playing the Game of Life with Elixir Processes

I’ve always been fascinated with Conway’s Game of Life, and I’ve always wondered what the Game of Life is like from an individual cell’s perspective.

The constant threat of dying of loneliness, three person mating rituals, and the potential to achieve immortality! What a strange and beautiful world…

In this article, we’ll use Elixir processes, supervision trees, and the Registry to implement Conway’s Game of Life from this interesting perspective.

The Game of Life

Most basic implementations of the Game of Life represent the universe as a large two-dimensional grid. Each spot in the grid is either “alive”, or “dead”.

A typical Game of Life representation.

Once every “tick”, the simulation will loop over each cell in the universe. If the cell is dead and has exactly three living neighbors, it will be born into the next generation. If the cell is living and has two or three neighbors, it will live on during the next generation. Otherwise, the cell will die or remain dead.

When let loose on an initial configuration of living cells, the continuous application of these rules can create an incredibly complex, seemingly organic system.

The Architecture

Rather than following the standard approach of using a finite two-dimensional array to represent our universe, let’s flip this problem upside down.

Let’s represent each living cell as an active Elixir process living somewhere on on our server. Each cell process will hold it’s location as an {x, y} tuple as its state.

We’ll need some way of finding a cell’s neighbors, which means we’ll need to be able to look up a cell based on its given {x, y} location. This sounds like a classic process discovery problem and it gives us an excellent excuse to try out Elixir’s new Registry module.

Our cells will be fairly independent; they’ll manage their own position, and determine when it’s time to die or give birth to another cell. However, we’ll need some additional outside process to tell each cell in the universe when to “tick”. We’ll call this controller process the “Universe”.

“Life calls the tune, we dance.”

Given those basic components, we can draw up a basic dependency tree of our application. We’ll need a top level supervisor managing our universe and cell registry, along with a supervisor to dynamically manage each cell process.

The Supervisors

Before we dive into the meat of our application, let’s take a quick look at how we’ll implement our application’s supervisors.

Our top level supervisor will be called Universe.Supervisor. It simply spins up a single instance of the Universe worker, the Cell.Supervisor supervisor, and the Cell.Registry which is an instance of Elixir’s Registry module:


children = [
  worker(Universe, []),
  supervisor(Cell.Supervisor, []),
  supervisor(Registry, [:unique, Cell.Registry])
]
supervise(children, strategy: :one_for_one)

Notice were’s using a :one_for_one supervision strategy here. This means that all of our children will be immediately started, and if a child ever fails, that process (and only that process) will be restarted.

Our Cell.Supervisor, the supervisor that manages all dynamically added cell processes, is a little more interesting.

Instead of immediately spinning up child processes, we create a template describing the the type of process we’ll be supervising in the future:


children = [
  worker(Cell, [])
]
supervise(children, strategy: :simple_one_for_one, restart: :transient)

A high level view of our application's structure.

The :simple_one_for_one strategy informs the system that we’ll be dynamically adding and removing children from this supervision tree. Those children will be Cell worker processes.

The :transient restart strategy means that if the Cell process is killed with a :normal or :shutdown message, it will not be restarted. However, if the Cell processes experiences a problem and dies with any other message, it will be restarted by the Cell.Supervisor.

Our Cell.Supervisor module also has a function called children:


def children do
  Cell.Supervisor
  |> Supervisor.which_children
  |> Enum.map(fn
    {_, pid, _, _} -> pid
  end)
end

The children function returns all living cell processes currently being supervised by Cell.Supervisor. This will be useful when we need to tick each cell in our Universe module.

The Universe

Our Universe module is the driving force in our Game of Life simulation. It’s literally what makes the cells tick.

If we had to tell Universe what to do in plain English, we might say:

Get all living cells. Asynchronously call tick on each one. Wait for all of the ticks to finish. Kill, or reap, all cells that will die from loneliness, and create, or sow, all of the cells that will be born.

Now let’s compare those instructions with our the code in our tick handler:


get_cells()
|> tick_each_process
|> wait_for_ticks
|> reduce_ticks
|> reap_and_sow

Perfect. I might even go so far as to say that the Elixir code is more readable than plain English.

As we dig into each of these functions, we’ll find that they’re still very descriptive and understandable. The get_cells function simply calls the Cell.Supervisor.children function we defined earlier:


defp get_cells, do: Cell.Supervisor.children

The tick_each_process function maps over each cell process and calls Cell.tick as an asynchronous Task:


defp tick_each_process(processes) do
  map(processes, &(Task.async(fn -> Cell.tick(&1) end)))
end

Similarly, wait_for_ticks maps over each asynchronous process, waiting for a reply:


defp wait_for_ticks(asyncs) do
  map(asyncs, &Task.await/1)
end

reduce_ticks, along with the helper function accumulate_ticks, reduces the response from each call to Cell.tick into a tuple holding a list of cells to be reaped, and a list of cells to be sown:


defp reduce_ticks(ticks), do: reduce(ticks, {[], []}, &accumulate_ticks/2)

defp accumulate_ticks({reap, sow}, {acc_reap, acc_sow}) do
  {acc_reap ++ reap, acc_sow ++ sow}
end

Lastly, reap_and_sow does exactly that: it kills cells marked for death, and create cells queued up to be born:


defp reap_and_sow({to_reap, to_sow}) do
  map(to_reap, &Cell.reap/1)
  map(to_sow,  &Cell.sow/1)
end

Take a look at the entire Universe module on Github.

The Cell

We’ve seen that while Universe is the driver of our simulation, it defers most of the computational work and decision making to individual cells. Let’s dive into our Cell module and see what’s going on.

The Cell.reap and Cell.sow methods we saw in Universe are fairly straight-forward:

The reap function simply calls Supervisor.terminate_child to remove the given cell process from the Cell.Supervisor tree.


def reap(process) do
  Supervisor.terminate_child(Cell.Supervisor, process)
end

Similarly, sow calls Supervisor.start_child to create a new process under the Cell.Supervisor tree, passing in the cell’s position as its initial state:


def sow(position) do
  Supervisor.start_child(Cell.Supervisor, [position])
end

The real magic of our Game of Life simulation happens in the cell’s tick function.

During each tick, a cell needs to generate a list of cells to reap (which will either be an empty list, or a list containing only itself), and a list of cells to sow.

Generating the to_reap list is easy enough:


to_reap = position
|> do_count_neighbors
|> case do
     2 -> []
     3 -> []
     _ -> [self()]
   end

We count the number of living neighbors around the cell. If the cell has two or three neighbors, it lives on to the next generation (to_reap = []). Otherwise, it dies from loneliness (to_reap = [self()]).

The do_count_neighbors functions does what you might expect. Given a cell’s position, it finds all eight neighboring positions, filters out all dead neighbors, and then returns the length of the resulting list of living neighbors:


defp do_count_neighbors(position) do
  position
  |> neighboring_positions
  |> keep_live
  |> length
end

After we’ve generated our to_reap list, our cell needs to generate a list of cells to be born.

From an individual cell’s perspective, this is a process of looking for any dead (unoccupied) neighboring positions and filtering out those that do not have enough living neighbors to be born into the next generation:


to_sow = position
|> neighboring_positions
|> keep_dead
|> keep_valid_children

The keep_valid_children function goes through the provided list of unoccupied positions, filtering out positions with a neighbor count not equal to three:


defp keep_valid_children(positions) do
  positions
  |> filter(&(do_count_neighbors(&1) == 3))
end

This means that only dead cells with exactly three neighbors (one of which is the current ticking cell) will be born into the next generation.

Now that we’ve generated out to_reap and to_sow lists, our cell process is finished ticking.

We can send our reply back to the universe, being sure to preserve position as our current state:


{:reply, {to_reap, to_sow}, position}

Take a look at the entire Cell module on Github.

Finding Neighbors with Registry

When generating both the to_reap and to_sow lists, cells were required to determine if neighboring cells were living or dead.

This was done with the keep_live and keep_dead functions, respectively:


defp keep_live(positions), do: filter(positions, &(lookup(&1) != nil))


defp keep_dead(positions), do: filter(positions, &(lookup(&1) == nil))

The key here is that we’re calling lookup on each position. The lookup function translates a cell’s position into a PID for that cell’s active process.


def lookup(position) do
  Cell.Registry
  |> Registry.lookup(position)
  |> Enum.map(fn
    {pid, _valid} -> pid
    nil -> nil
  end)
  |> Enum.filter(&Process.alive?/1)
  |> List.first
end

Here is where the Registry shines.

We’re using Registry.lookup to find a process in our Cell.Registry based on a given {x, y} position.

Registry.lookup will give us a list of {pid, value} tuples (or an empty list). Since we only want the pid, we can pull it out of the tuple.

Next, we filter the resulting PIDs with Process.alive?. After reaping a cell with Supervisor.terminate_child, the cell’s process will be removed from the Cell.Supervisor supervisor, but the process may not be fully removed from the Cell.Registry.

This means our cells can potentially interact with “ghost neighbors”; neighboring cells who are in the process of dying, but are not quite completely dead.

Adding a Process.alive? filter prevents our cell from interacting with this ghost neighbors (and prevents a very frustrating, subtle bug).

Running the Simulation

Now that we’ve built our process-driven simulation, it’s time to test it out.

We can fire up our Game of Life environment by starting an interactive Elixir shell:


iex -S mix

Next, let’s spawn three cells in a row. This will create a “blinker” pattern:


Cell.sow({0, 0})
Cell.sow({1, 0})
Cell.sow({2, 0})

Now let’s fire up Erlang’s observer to get a high level view of our universe:


:observer.start

We can see the three cells we just added to the universe below the Cell.Supervisor supervision tree. Also notice that those processes are linked to the Cell.Registry process.

Our application structure as it exists in the Erlang observer.

To test out our Cell.Supervisor, let’s manually kill one of our cell processes. Send a kill exit message to one of the cells, and notice that after the process dies, another process immediately takes its place.

This means that any unintended errors in a Cell process won’t bring down our entire life simulation. Awesome!

Now that our initial conditions are set up, let’s tick our universe:


Universe.tick

Switching back to our observer, we can see that two of the three cell processes have been removed and two new processes have been added. If we look at the state of these new processes, we’ll see that they live at positions {1, 1}, and {1, -1}, as expected.

If we tick our universe again, we would see that those two processes would be killed, and two new processes would be added in their place. Their positions would oscillate back to {0, 0} and {2, 0}. Notice that the process for the cell living at position {0, 1} is still alive and well.

We can tick our universe as many times as we want:


1..10_000
|> Enum.map(fn n -> Universe.tick end)

After all of the ticks are processed, we can switch back to our observer and see that we still have three living cells, as expected.

Cells living as processes in the Erlang observer.

Let’s restart our universe and try again with a more interesting pattern. Let’s try a “diehard” pattern, which is a methuselah that dies after 130 generations:


[
                                                  {6, 2},
  {0, 1}, {1, 1},
          {1, 0},                         {5, 0}, {6, 0}, {7, 0},
]
|> Enum.map(&Cell.sow/1)


1..130
|> Enum.map(fn
              n -> Universe.tick
                   :timer.sleep(500)
            end)

If you watch your observer as it slowly runs through the each tick, you’ll see that the number of active processes skyrockets and then eventually fades to zero.

Final Thoughts

Truth be told, the goal of this project was to get deeper hands-on experience with Elixir processes, supervision trees, and the new Registry functionality.

At the end of the day it was an excellent experience. I learned quite a few important lessons the hard way. If you’re interested in learning Elixir or how to “think in processes”, I highly recommend you take on a similar project.

While this Game of Life implementation isn’t the fastest or most efficient, it does come with its interesting benefits.

It’s incredibly resilient. Every cell process, and the universe process can fail and restart seamlessly. Catastrophic failure can only take place if either the Universe.Supervisor, or the Cell.Supervisor fail, which is unlikely to happen.

Four Erlang schedulers running our Game of Life simulation on a quad-core processor.

It’s concurrent and parallel out of the box. Our asynchronous calls to Cell.tick are distributed across every CPU on our node. The Erlang VM automatically takes full advantage of its environment and orchestrates the running of these parallel processes.

As far as future work for this project, I have lots of ideas.

I’d like to give cells more independence, and remove the need for the Universe driver module. I imagine each cell automatically trying to progress into future generations as soon as all of its neighboring cells have all necessary information to do so.

I’d also like to spread the simulation across multiple nodes. I imagine a massive Game of Life simulation running on dozens of EC2 instances, orchestrated through an edeliver powered release.

Lastly, I’d like to give the simulation a simple web-based user interface, and potentially the ability to run multiple simulations at once.

If you’d like to take this project out for a test run, or get a better look at the source, be sure to check it out on Github!