Playing the Game of Life with Elixir Processes

Written by Pete Corey on Feb 6, 2017.

I’ve always been fascinated with Conway’s Game of Life, and I’ve always wondered what the Game of Life is like from an individual cell’s perspective.

The constant threat of dying of loneliness, three person mating rituals, and the potential to achieve immortality! What a strange and beautiful world…

In this article, we’ll use Elixir processes, supervision trees, and the Registry to implement Conway’s Game of Life from this interesting perspective.

The Game of Life

Most basic implementations of the Game of Life represent the universe as a large two-dimensional grid. Each spot in the grid is either “alive”, or “dead”.

Once every “tick”, the simulation will loop over each cell in the universe. If the cell is dead and has exactly three living neighbors, it will be born into the next generation. If the cell is living and has two or three neighbors, it will live on during the next generation. Otherwise, the cell will die or remain dead.
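
Stated as a single function, those rules boil down to something like this; a minimal sketch of my own, separate from the process-based implementation we’ll build below:


defmodule Rules do
  # next_state(currently_alive?, living_neighbor_count) -> alive in the next generation?

  # A dead cell with exactly three living neighbors is born.
  def next_state(false, 3), do: true

  # A living cell with two or three living neighbors lives on.
  def next_state(true, neighbors) when neighbors in [2, 3], do: true

  # Every other cell dies, or stays dead.
  def next_state(_alive?, _neighbors), do: false
end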

When let loose on an initial configuration of living cells, the continuous application of these rules can create an incredibly complex, seemingly organic system.

The Architecture

Rather than following the standard approach of using a finite two-dimensional array to represent our universe, let’s flip this problem upside down.

Let’s represent each living cell as an active Elixir process living somewhere on our server. Each cell process will hold its location, an {x, y} tuple, as its state.

We’ll need some way of finding a cell’s neighbors, which means we’ll need to be able to look up a cell based on its given {x, y} location. This sounds like a classic process discovery problem and it gives us an excellent excuse to try out Elixir’s new Registry module.

Our cells will be fairly independent; they’ll manage their own position, and determine when it’s time to die or give birth to another cell. However, we’ll need some additional outside process to tell each cell in the universe when to “tick”. We’ll call this controller process the “Universe”.

“Life calls the tune, we dance.”

Given those basic components, we can draw up a basic dependency tree of our application. We’ll need a top level supervisor managing our universe and cell registry, along with a supervisor to dynamically manage each cell process.

The Supervisors

Before we dive into the meat of our application, let’s take a quick look at how we’ll implement our application’s supervisors.

Our top level supervisor will be called Universe.Supervisor. It simply spins up a single instance of the Universe worker, the Cell.Supervisor supervisor, and the Cell.Registry, which is an instance of Elixir’s Registry module:


children = [
  worker(Universe, []),
  supervisor(Cell.Supervisor, []),
  supervisor(Registry, [:unique, Cell.Registry])
]
supervise(children, strategy: :one_for_one)

Notice we’re using a :one_for_one supervision strategy here. This means that all of our children will be started immediately, and if a child ever fails, that process (and only that process) will be restarted.
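
For context, here’s roughly how that child list might sit inside the full module; a sketch assuming the Elixir 1.4-era Supervisor.Spec helpers that this article relies on:


defmodule Universe.Supervisor do
  # On Elixir 1.4, `use Supervisor` imports the Supervisor.Spec helpers
  # (worker/2, supervisor/2, and supervise/2) used below.
  use Supervisor

  def start_link do
    Supervisor.start_link(__MODULE__, [], name: __MODULE__)
  end

  def init(_) do
    children = [
      worker(Universe, []),
      supervisor(Cell.Supervisor, []),
      supervisor(Registry, [:unique, Cell.Registry])
    ]
    supervise(children, strategy: :one_for_one)
  end
end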


Our Cell.Supervisor, the supervisor that manages all dynamically added cell processes, is a little more interesting.

Instead of immediately spinning up child processes, we create a template describing the type of process we’ll be supervising in the future:


children = [
  worker(Cell, [])
]
supervise(children, strategy: :simple_one_for_one, restart: :transient)

The :simple_one_for_one strategy informs the system that we’ll be dynamically adding and removing children from this supervision tree. Those children will be Cell worker processes.

The :transient restart strategy means that if a Cell process is killed with a :normal or :shutdown message, it will not be restarted. However, if a Cell process experiences a problem and dies with any other message, it will be restarted by the Cell.Supervisor.
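
Wrapped in its module, Cell.Supervisor might look something like this sketch; the name: __MODULE__ registration is what lets us address the supervisor by name later on:


defmodule Cell.Supervisor do
  use Supervisor

  def start_link do
    Supervisor.start_link(__MODULE__, [], name: __MODULE__)
  end

  def init(_) do
    children = [
      worker(Cell, [])
    ]
    supervise(children, strategy: :simple_one_for_one, restart: :transient)
  end

  # The children/0 helper described below lives here as well.
end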


Our Cell.Supervisor module also has a function called children:


def children do
  Cell.Supervisor
  |> Supervisor.which_children
  |> Enum.map(fn
    {_, pid, _, _} -> pid
  end)
end

The children function returns all living cell processes currently being supervised by Cell.Supervisor. This will be useful when we need to tick each cell in our Universe module.

The Universe

Our Universe module is the driving force in our Game of Life simulation. It’s literally what makes the cells tick.

If we had to tell Universe what to do in plain English, we might say:

Get all living cells. Asynchronously call tick on each one. Wait for all of the ticks to finish. Kill, or reap, all cells that will die from loneliness, and create, or sow, all of the cells that will be born.

Now let’s compare those instructions with the code in our tick handler:


get_cells()
|> tick_each_process
|> wait_for_ticks
|> reduce_ticks
|> reap_and_sow

Perfect. I might even go so far as to say that the Elixir code is more readable than plain English.

As we dig into each of these functions, we’ll find that they’re still very descriptive and understandable. The get_cells function simply calls the Cell.Supervisor.children function we defined earlier:


defp get_cells, do: Cell.Supervisor.children

The tick_each_process function maps over each cell process and calls Cell.tick as an asynchronous Task:


defp tick_each_process(processes) do
  map(processes, &(Task.async(fn -> Cell.tick(&1) end)))
end

Similarly, wait_for_ticks maps over each asynchronous process, waiting for a reply:


defp wait_for_ticks(asyncs) do
  map(asyncs, &Task.await/1)
end

reduce_ticks, along with the helper function accumulate_ticks, reduces the response from each call to Cell.tick into a tuple holding a list of cells to be reaped, and a list of cells to be sown:


defp reduce_ticks(ticks), do: reduce(ticks, {[], []}, &accumulate_ticks/2)

defp accumulate_ticks({reap, sow}, {acc_reap, acc_sow}) do
  {acc_reap ++ reap, acc_sow ++ sow}
end
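
To make that concrete, imagine three cells replied with the following tick results (pid_a and pid_b standing in for real cell PIDs); reducing them yields a single reap/sow tuple:


ticks = [
  {[], [{1, 1}, {1, -1}]},   # this cell lives on and asks for two children
  {[pid_a], []},             # this cell will die of loneliness
  {[pid_b], []}              # so will this one
]

reduce_ticks(ticks)
# => {[pid_a, pid_b], [{1, 1}, {1, -1}]}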

Lastly, reap_and_sow does exactly that: it kills cells marked for death, and creates cells queued up to be born:


defp reap_and_sow({to_reap, to_sow}) do
  map(to_reap, &Cell.reap/1)
  map(to_sow,  &Cell.sow/1)
end
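
Putting the pieces together, the tick handler presumably hangs off a GenServer callback along these lines; a sketch that assumes Universe registers itself under its module name, and that the bare map and reduce calls in these snippets come from an import Enum at the top of the module:


defmodule Universe do
  use GenServer
  import Enum, only: [map: 2, reduce: 3]

  def start_link do
    GenServer.start_link(__MODULE__, [], name: __MODULE__)
  end

  def init(state), do: {:ok, state}

  def tick, do: GenServer.call(__MODULE__, :tick)

  def handle_call(:tick, _from, state) do
    get_cells()
    |> tick_each_process
    |> wait_for_ticks
    |> reduce_ticks
    |> reap_and_sow

    {:reply, :ok, state}
  end

  # get_cells/0, tick_each_process/1, and the other private helpers
  # described above live here.
end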

Take a look at the entire Universe module on Github.

The Cell

We’ve seen that while Universe is the driver of our simulation, it defers most of the computational work and decision making to individual cells. Let’s dive into our Cell module and see what’s going on.

The Cell.reap and Cell.sow functions we saw in Universe are fairly straightforward.

The reap function simply calls Supervisor.terminate_child to remove the given cell process from the Cell.Supervisor tree.


def reap(process) do
  Supervisor.terminate_child(Cell.Supervisor, process)
end

Similarly, sow calls Supervisor.start_child to create a new process under the Cell.Supervisor tree, passing in the cell’s position as its initial state:


def sow(position) do
  Supervisor.start_child(Cell.Supervisor, [position])
end
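
Because Cell.Supervisor uses a :simple_one_for_one worker(Cell, []) template, the [position] list we pass to start_child becomes the argument to Cell.start_link. Here’s a sketch of how the cell might accept that position and register itself in the Cell.Registry under it (assuming a :via tuple is used for registration), along with a Cell.tick that is presumably just a GenServer.call into the cell process:


def start_link(position) do
  # Register this cell in the Cell.Registry under its {x, y} position.
  GenServer.start_link(__MODULE__, position,
                       name: {:via, Registry, {Cell.Registry, position}})
end

def init(position), do: {:ok, position}

def tick(process), do: GenServer.call(process, :tick)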

The real magic of our Game of Life simulation happens in the cell’s tick function.

During each tick, a cell needs to generate a list of cells to reap (which will either be an empty list, or a list containing only itself), and a list of cells to sow.

Generating the to_reap list is easy enough:


to_reap = position
|> do_count_neighbors
|> case do
     2 -> []
     3 -> []
     _ -> [self()]
   end

We count the number of living neighbors around the cell. If the cell has two or three neighbors, it lives on to the next generation (to_reap = []). Otherwise, it dies from loneliness (to_reap = [self()]).

The do_count_neighbors function does what you might expect. Given a cell’s position, it finds all eight neighboring positions, filters out all dead neighbors, and then returns the length of the resulting list of living neighbors:


defp do_count_neighbors(position) do
  position
  |> neighboring_positions
  |> keep_live
  |> length
end
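
The neighboring_positions helper isn’t shown here, but a version of it could be as simple as a comprehension over the eight surrounding offsets:


defp neighboring_positions({x, y}) do
  # Every offset around {x, y}, excluding the cell's own position.
  for dx <- -1..1, dy <- -1..1, {dx, dy} != {0, 0} do
    {x + dx, y + dy}
  end
end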

After we’ve generated our to_reap list, our cell needs to generate a list of cells to be born.

From an individual cell’s perspective, this is a process of looking for any dead (unoccupied) neighboring positions and filtering out those that do not have enough living neighbors to be born into the next generation:


to_sow = position
|> neighboring_positions
|> keep_dead
|> keep_valid_children

The keep_valid_children function goes through the provided list of unoccupied positions, filtering out positions with a neighbor count not equal to three:


defp keep_valid_children(positions) do
  positions
  |> filter(&(do_count_neighbors(&1) == 3))
end

This means that only dead cells with exactly three neighbors (one of which is the current ticking cell) will be born into the next generation.


Now that we’ve generated our to_reap and to_sow lists, our cell process is finished ticking.

We can send our reply back to the universe, being sure to preserve position as our current state:


{:reply, {to_reap, to_sow}, position}
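
Put together, the cell’s entire :tick callback presumably reads something like this sketch:


def handle_call(:tick, _from, position) do
  to_reap = position
  |> do_count_neighbors
  |> case do
       2 -> []
       3 -> []
       _ -> [self()]
     end

  to_sow = position
  |> neighboring_positions
  |> keep_dead
  |> keep_valid_children

  {:reply, {to_reap, to_sow}, position}
end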

Take a look at the entire Cell module on Github.

Finding Neighbors with Registry

When generating both the to_reap and to_sow lists, cells were required to determine if neighboring cells were living or dead.

This was done with the keep_live and keep_dead functions, respectively:


defp keep_live(positions), do: filter(positions, &(lookup(&1) != nil))

defp keep_dead(positions), do: filter(positions, &(lookup(&1) == nil))

The key here is that we’re calling lookup on each position. The lookup function translates a cell’s position into a PID for that cell’s active process.


def lookup(position) do
  Cell.Registry
  |> Registry.lookup(position)
  |> Enum.map(fn
    {pid, _valid} -> pid
    nil -> nil
  end)
  |> Enum.filter(&Process.alive?/1)
  |> List.first
end

Here is where the Registry shines.

We’re using Registry.lookup to find a process in our Cell.Registry based on a given {x, y} position.

Registry.lookup will give us a list of {pid, value} tuples (or an empty list). Since we only want the pid, we can pull it out of the tuple.

Next, we filter the resulting PIDs with Process.alive?. After reaping a cell with Supervisor.terminate_child, the cell’s process will be removed from the Cell.Supervisor supervisor, but the process may not be fully removed from the Cell.Registry.

This means our cells can potentially interact with “ghost neighbors”: neighboring cells that are in the process of dying, but are not quite completely dead.

Adding a Process.alive? filter prevents our cell from interacting with these ghost neighbors (and prevents a very frustrating, subtle bug).

Running the Simulation

Now that we’ve built our process-driven simulation, it’s time to test it out.

We can fire up our Game of Life environment by starting an interactive Elixir shell:


iex -S mix

Next, let’s spawn three cells in a row. This will create a “blinker” pattern:


Cell.sow({0, 0})
Cell.sow({1, 0})
Cell.sow({2, 0})

Now let’s fire up Erlang’s observer to get a high level view of our universe:


:observer.start

We can see the three cells we just added to the universe below the Cell.Supervisor supervision tree. Also notice that those processes are linked to the Cell.Registry process.

To test out our Cell.Supervisor, let’s manually kill one of our cell processes. Send a kill exit message to one of the cells, and notice that after the process dies, another process immediately takes its place.

This means that any unintended errors in a Cell process won’t bring down our entire life simulation. Awesome!


Now that our initial conditions are set up, let’s tick our universe:


Universe.tick

Switching back to our observer, we can see that two of the three cell processes have been removed and two new processes have been added. If we look at the state of these new processes, we’ll see that they live at positions {1, 1}, and {1, -1}, as expected.

If we tick our universe again, those two processes will be killed, and two new processes will be added in their place. Their positions oscillate back to {0, 0} and {2, 0}. Notice that the process for the cell living at position {1, 0} is still alive and well.

We can tick our universe as many times as we want:


1..10_000
|> Enum.each(fn _ -> Universe.tick end)

After all of the ticks are processed, we can switch back to our observer and see that we still have three living cells, as expected.

Let’s restart our universe and try again with a more interesting pattern. Let’s try a “diehard” pattern, which is a methuselah that dies after 130 generations:


[
                                                  {6, 2},
  {0, 1}, {1, 1},
          {1, 0},                         {5, 0}, {6, 0}, {7, 0}
]
|> Enum.map(&Cell.sow/1)

1..130
|> Enum.each(fn _ ->
  Universe.tick
  :timer.sleep(500)
end)

If you watch your observer as it slowly runs through each tick, you’ll see that the number of active processes skyrockets and then eventually fades to zero.

Final Thoughts

Truth be told, the goal of this project was to get deeper hands-on experience with Elixir processes, supervision trees, and the new Registry functionality.

At the end of the day it was an excellent experience. I learned quite a few important lessons the hard way. If you’re interested in learning Elixir or how to “think in processes”, I highly recommend you take on a similar project.

While this Game of Life implementation isn’t the fastest or most efficient, it does come with its interesting benefits.

It’s incredibly resilient. Every cell process and the universe process can fail and restart seamlessly. Catastrophic failure can only take place if either the Universe.Supervisor or the Cell.Supervisor fails, which is unlikely to happen.

It’s concurrent and parallel out of the box. Our asynchronous calls to Cell.tick are distributed across every CPU on our node. The Erlang VM automatically takes full advantage of its environment and orchestrates the running of these parallel processes.


As far as future work for this project, I have lots of ideas.

I’d like to give cells more independence, and remove the need for the Universe driver module. I imagine each cell automatically trying to progress into future generations as soon as all of its neighboring cells have all necessary information to do so.

I’d also like to spread the simulation across multiple nodes. I imagine a massive Game of Life simulation running on dozens of EC2 instances, orchestrated through an edeliver-powered release.

Lastly, I’d like to give the simulation a simple web-based user interface, and potentially the ability to run multiple simulations at once.

If you’d like to take this project out for a test run, or get a better look at the source, be sure to check it out on Github!

My Favorite Pattern Without a Name

Written by Pete Corey on Jan 30, 2017.

My recent dives into Elixir have been changing how I write software. More and more, I find myself writing in a functional style, even in languages where it’s not the norm.

As my style of programming changes, I find myself re-using patterns and ideas across many different languages.

This article is about one of the patterns I use most often; interestingly, a pattern without a clear name.

As my friend Dean Radcliffe pointed out, I'm probably describing a fluent interface as described by Martin Fowler. Today I learned.

An Example

I often find myself wanting to chain together many function calls, passing in the results of previous calls into the next function in the chain.

Sometimes, the next function in the chain only needs the result of the previous function. Sometimes, it needs the result of a function further back in the chain.

To make things more real, let’s consider an example. Imagine we’re building a system that receives and responds to text messages from users. In this example, we’ll have three functions:


def get_user(phone_number) do
  ...
end

def get_response(message) do
  ...
end

def send_response(user, response) do
  ...
end

The get_user function takes in the sender’s phone number and returns a corresponding user object. The get_response function takes in the sender’s text message and returns a response. send_response will take in a response and the user to send that response to.

One way of calling these three functions together might look like this:


user = get_user(sms.from)
response = get_response(sms.message)
send_response(user, response)

This is a fine solution. However, we’re binding user and response for no other reason than to pass them into send_response.

Additionally, if our functions evolve over time to need more inputs, or more intermediary values, our function calls will get more and more tangled up.

The Pattern in Elixir

Instead of writing each function to take in exactly the arguments it needs and return a single value, let’s broaden our scope.

Let’s create a larger, all-encompassing state object that will contain all inputs, intermediary values, and outputs of our function chain.

In our case, we’ll need the sms input, the user object, and our system’s response:


%{sms: sms, user: nil, response: nil}

We can then rewrite our get_user, get_response, and send_response functions to take our state object as an argument, and return a (potentially modified) state object as a result:


def get_user(state = %{sms: %{from: from}}) do
  ...
  %{state | user: ...}
end

def get_response(state = %{sms: %{message: message}}) do
  ...
  %{state | response: ...}
end

def send_response(state = %{user: user, response: response}) do
  ...
end

Now we can cleanly chain together all of our function calls:


%{sms: sms, user: nil, response: nil}
|> get_user
|> get_response
|> send_response

Beautiful!

What’s great is that we can easily inject new functions into our chain without worrying about juggling inputs and outputs. Let’s add a function to the end of our chain that logs the entire interaction:


%{sms: sms, user: nil, response: nil}
|> get_user
|> get_response
|> send_response
|> log_interaction

In a similar vein, we can easily add debug logging (or anything else) between each step of the chain to get an in-depth look at what’s happening between each call in the chain:


%{sms: sms, user: nil, response: nil}
|> get_user
|> IO.inspect    # debug logging
|> get_response
|> IO.inspect    # debug logging
|> send_response
|> IO.inspect    # debug logging

Now in Javascript

We can apply this same pattern in Javascript. The Lodash library gives us an excellent set of tools for chaining together function calls. Let’s put them to work.

Imagine we have a similar set of functions:


function get_user({sms: {from}}) {
  ...
}

function get_response({sms: {message}}) {
  ...
}

function send_response({user, response}) {
  ...
}

We can use Lodash to run through our chain, just like we did in Elixir:


_.chain({sms, user: undefined, response: undefined})
 .thru(get_user)
 .thru(get_response)
 .thru(send_response)
 .value();

Similarly, we can tap into any point of this chain to add debug logging:


_.chain({sms, user: undefined, response: undefined})
 .thru(get_user)
 .tap(console.debug)    // debug logging
 .thru(get_response)
 .tap(console.debug)    // debug logging
 .thru(send_response)
 .tap(console.debug)    // debug logging
 .value();

What’s in a Name

There are lots of names that we can throw at this pattern. Some of them stick better than others, but I’m not fully satisfied with any of them.

I asked in the Elixir slack channel if anyone had a name for this kind of thing:

Is there a generic name for building a large, all encompassing state object, and then passing it through a lot of different functions that might only work on a subset of that state? What Plug does, for example.

I find myself doing that kind of thing a lot, and I wonder if there’s some kind of name/pattern that describes it.

Dependency injection?

Yeah, maybe. I’d describe it as maybe DI + method chaining. DI seems like a weird thing to talk about in a purely functional setting. Of course dependencies will be injected.

I was trying to look at “popular” non-Elixir projects (like Redux or Om) which use the same concept to see if they identify the pattern by a name, but I’m not seeing anything.

Dependency injection (DI) might apply in a way, but I usually think of dependency injection in terms of passing around functionality, not data. In a pure functional context, everything could be considered dependency injection in that sense.

I mentioned “method chaining”, which seems to describe a nice side-effect of the pattern, not the pattern itself.


Maybe the best pattern name would be “Middleware”, as seen in libraries like Express, Redux, and Plug.

The pattern used by these libraries is exactly what I’ve been describing, but I’m not sure about the name. The term “middleware” was originally used to describe something very different than what we’ve been talking about.

On top of that, there’s some disagreement amongst software developers as to whether “middleware” is the term we should be applying to this pattern (or if this is even a pattern at all).

Either way, I’ll probably refer to this pattern as middleware when talking with other developers, and I’ll definitely continue using my favorite pattern without a name.

Upgrade Releases With Edeliver

Written by Pete Corey on Jan 23, 2017.

Last week we saw that using edeliver could drastically simplify our Elixir release process. Now that we can build and deploy our initial releases, we should investigate how edeliver can help us build, manage, and deploy upgrade releases.

To get a better idea of how upgrade releases work with edeliver, let’s make a change to our original application, build an upgrade release from that change, and then deploy that change to our production server.

Configuring edeliver

Before we start building our upgrade release, we’ll need to make a configuration change to Distillery.

Recently, Distillery changed where it stores its release artifacts. If left unchanged, this new output directory would break our edeliver release process:

==> Upgrading prod from 0.0.1 to 0.0.1+82e2ed7
==> Upgrade from 0.0.1 to 0.0.1+82e2ed7 failed:
  0.0.1 does not exist at _build/prod/rel/prod/releases/0.0.1

Thankfully, we can change where these artifacts are stored with Distillery’s output_dir configuration option.

Let’s edit rel/config.exs and set output_dir to the rel/<application name> directory:


set output_dir: "rel/hello_edeliver"

Now Distillery will store our release artifacts in rel/hello_edeliver, which is where edeliver expects to find them.
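
In context, that option might sit inside the release block of rel/config.exs, something along these lines (a sketch, with the other settings elided):


release :hello_edeliver do
  set version: current_version(:hello_edeliver)
  # Store release artifacts where edeliver expects to find them.
  set output_dir: "rel/hello_edeliver"
end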

Making Our Change

Let’s make a small change to our application. We’ll update our index.html.eex and add a line indicating that we’ve upgraded the application.


<p>We've upgraded!</p>

At this point, we can either manually upgrade our application’s version in our mix.exs file, or we can let edeliver manage our versioning for us.

Let’s keep things as hands-off as possible and let edeliver manage our application’s version.

With that decided, we can commit our changes and move on to building our upgrade.


git commit -am "Added upgrade message."

Building and Deploying Our Upgrade

Elixir upgrade releases are always built off of a previous release. Distillery will generate a patch that changes only what is different between the old and new versions of the application.

With that in mind, we need to know what version of our application is currently running in production. We can find this out using edeliver’s version mix task:


mix edeliver version production

In our case, we can see that we’re running version 0.0.1 of our application in production.

Knowing that, we can generate our upgrade release. Upgrade releases are built with the build upgrade mix task:


mix edeliver build upgrade --with=0.0.1 --auto-version=git-revision

We’re using the --with flag to point to our previous 0.0.1 release. We’re also using the --auto-version flag to tell edeliver to automatically handle the versioning of our application.

This command will generate an upgrade release with a version similar to 0.0.1+82e2ed7. You can see that the git-revision option appended the current commit’s SHA to the end of the original release version.

For more information about upgrade flags and automatic versioning, be sure to check out the edeliver documentation and wiki.


Once our upgrade release has been built, deploying it to our production environment is a simple task:


mix edeliver deploy upgrade to production

That’s it! Our change was applied instantly to our running application with zero downtime.

If we check the version of our application running in production, we should see 0.0.1+82e2ed7:


mix edeliver version production

And lastly, when we view our application in the browser, we should see our upgrade message in the DOM.

Success!

Final Thoughts

After getting over a few configuration hurdles, using edeliver to build and deploy an upgrade release was a breeze.

Being able to seamlessly deploy an upgrade to any number of hosts with zero downtime is incredibly powerful, and edeliver gives us a polished interface for doing exactly that.

I’m looking forward to using edeliver more in the future.