What is NoSQL Injection?

Written by Pete Corey on Jul 3, 2017.

Progress on Inject Detect continues to chug along. I’ve been working on building out an educational section to hold a variety of articles and guides designed to help people better understand all things NoSQL Injection.

This week I put the finishing touches on two new articles: “What is NoSQL Injection?” and “How do you prevent NoSQL Injection?”.

For posterity, I’ve included both articles below.


What is NoSQL Injection?

NoSQL Injection is a security vulnerability that lets attackers take control of database queries through the unsafe use of user input. It can be used by an attacker to:

  • Expose unauthorized information
  • Modify data
  • Escalate privileges
  • Take down your entire application

Over the past few years, we’ve worked with many teams building amazing software with Meteor and MongoDB. But to our shock and dismay, we’ve found NoSQL Injection vulnerabilities in nearly all of these projects.

An Example Application

Let’s make things more real by introducing an example to help us visualize how NoSQL Injection can occur, and the impact it can have on your application.

Imagine that our application accepts a username and a password hash from users attempting to log into the system. We check if the provided username/password combination is valid by searching for a user with both fields in our MongoDB database:


Meteor.methods({
    login(username, hashedPassword) {
        return Meteor.users.findOne({ username, hashedPassword });
    }
});

If the user provided a valid username and that user’s corresponding hashedPassword, the login method will return that user’s document.

Exploiting Our Application

In this example, we’re assuming that username and hashedPassword are strings, but we’re not explicitly making that assertion anywhere in our code. A user could potentially pass up any type of data from the client, such as a string, a number, or even an object.

A particularly clever user might pass up "admin" as their username, and {$gte: ""} as their password. This combination would result in our login method making the following query:


db.users.findOne({ username: "admin", hashedPassword: {$gte: ""}})

This query will return the first document it finds with a username of "admin" and a hashed password that is greater than an empty string. Regardless of the admin user’s password, their user document will be returned by this query.

Our clever user has successfully bypassed our authentication scheme by exploiting a NoSQL Injection vulnerability.


How do you prevent NoSQL Injection?

In our previous example, our code was making the assumption that the user-provided username and hashedPassword were strings. We ran into trouble when a malicious user passed up a MongoDB query operator as their hashedPassword.

Speaking in broad strokes, NoSQL Injection vulnerabilities can be prevented by making assertions about the types and shapes of your user-provided arguments. Instead of simply assuming that username and hashedPassword were strings, we should have made that assertion explicit in our code.

Using Checks

Meteor’s check library can be used to make assertions about the type and shape of user-provided arguments. We can use check in our Meteor methods and publications to make sure that we’re dealing with expected data types.

Let’s secure our login method using Meteor’s check library:


Meteor.methods({
    login(username, hashedPassword) {
        check(username, String);
        check(hashedPassword, String);
        return Meteor.users.findOne({ username, hashedPassword });
    }
});

If a user passes in a username or a password that is anything other than a string, one of the calls to check in our login method will throw an exception. This simple check stops NoSQL Injection attacks dead in their tracks.
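
To make that concrete, here’s a hypothetical replay of the earlier attack against our newly hardened method. This is just a sketch; the exact error surfaced to the client depends on your Meteor version, but the important part is that the query never runs:


// Hypothetical client-side call replaying the earlier attack.
// check(hashedPassword, String) fails on the server, so the method
// throws before Meteor.users.findOne is ever reached.
Meteor.call("login", "admin", { $gte: "" }, (error, user) => {
    console.log(error); // a "Match failed" style error
    console.log(user);  // undefined; no user document is returned
});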

Using Validated Methods

Meteor also gives us the option of writing our methods as Validated Methods. Validated methods incorporate this type of argument checking into the definition of the method itself.

Let’s implement our login method as a validated method:


new ValidatedMethod({
    name: "login",
    validate: new SimpleSchema({
        username: String,
        hashedPassword: String
    }).validator(),
    run({ username, hashedPassword }) {
        return Meteor.users.findOne({ username, hashedPassword });
    }
});

The general idea here is the same as our last example. Instead of using check, we’re using SimpleSchema to make assertions about the shape and types of our method’s arguments.

If a malicious user provides a username or a hashedPassword that is anything other than a string, the method will throw an exception, preventing the possibility of NoSQL Injection attacks.

Distributed Systems Are Hard

Written by Pete Corey on Jun 26, 2017.

As I dive deeper and deeper into the world of Elixir and distributed systems in general, I’ve been falling deeper and deeper into a personal crisis.

I’ve been slowly coming to the realization that just about every production system I’ve worked on or built throughout my career is broken in one way or another.

Distributed systems are hard.

Horizontal Scaling is Easy, Right?

In the past, my solution to the problem of scale has always been to scale horizontally. By “scale horizontally”, I mean spinning up multiple instances of your server processes, either across multiple CPUs, or multiple machines, and distributing traffic between them.

As long as my server application doesn’t persist in-memory state across sessions, or persist anything to disk, it’s fair game for horizontal scaling. For the most part, this kind of shoot-from-the-hip horizontal scaling works fairly well…

Until it doesn’t.

Without careful consideration and deliberate design, “split it and forget it” scaling will eventually fail. It may not fail catastrophically - in fact, it will most likely fail in subtle, nuanced ways. But it will always fail.

This is the way the world ends
Not with a bang but a whimper.

Let’s take a look at how this type of scaling can break down and introduce heisenbugs into your system.

Scaling in Action

For the sake of discussion, imagine that we’re building a web application that groups users into teams. A rule, or invariant, of our system is that a user can only be assigned to a single team at a time.

Our system enforces this rule by checking if a user already belongs to a team before adding them to another:


function addUserToTeam(userId, teamId) {
    if (Teams.findOne({ userIds: userId })) {
        throw new Error("Already on a team!");
    }
    Teams.update({ _id: teamId }, { $push: { userIds: userId } });
}

This seems relatively straightforward, and has worked beautifully in our small closed-beta trials.

Great! Over time, our Team Joiner™ application becomes very popular.

To meet the ever growing demand of new users wanting to join teams, we begin horizontally scaling our application by spinning up more instances of our server. However, as we add more servers, mysterious bugs begin to crop up…

Users are somehow, under unknown circumstances, joining multiple teams. That was supposed to be a premium feature!

With Our Powers Combined

The root of the problem stems from the fact that we have two (or more) instances of our server process running in parallel, without accounting for the existence of the other processes.

Imagine a scenario where a user, Sue, attempts to join Team A. Simultaneously, an admin user, John, notices that Sue isn’t on a team and decides to help by assigning her to Team B.

Sue’s request is handled entirely by Server A, and John’s request is handled entirely by Server B.

Diagram of conflict between Server A and Server B.

Server A begins by checking if Sue is on a team. She is not. Just after that, Server B also checks if Sue is on a team. She is not. At this point, both servers think they’re in the clear to add Sue to their respective team. Server A assigns Sue to Team A, fulfilling her request. Meanwhile, Server B assigns Sue to Team B, fulfilling John’s request.

Interestingly, each server does its job flawlessly on its own, but their powers combined put the system into an invalid, unpredictable, and potentially unrecoverable state.


The issue here is that between the point in time when Server B verifies that Sue is not on a team and the point when it assigns her to Team B, the state of the system changes.

Server B carries out its database update operating under the assumptions of old, stale data. The server process isn’t properly designed to handle, or even recognize, these types of conflicting updates.

Interestingly (and horrifyingly), this isn’t the only type of bug that can result from this type of haphazard scaling.

Check out the beginning of Nathan Herald’s talk from this year’s ElixirConf EU to hear about all of the fantastic ways that distributed systems can fail.

Handling Conflicts

While this specific problem is somewhat contrived and could be easily fixed by a database schema that more accurately reflects the problem we’re trying to solve (by keeping teamId on the user document), it serves as a good platform to discuss the larger issue.
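
For the record, here’s roughly what that simpler, schema-level fix might look like. This sketch assumes a hypothetical Users collection with a teamId field; because updates to a single MongoDB document are atomic, the “already on a team?” check and the assignment happen in one step:


function addUserToTeam(userId, teamId) {
    // Atomically assign the team only if the user isn't already on one.
    // There's no window between a separate check and a separate update.
    const updated = Users.update(
        { _id: userId, teamId: null },
        { $set: { teamId: teamId } }
    );

    if (updated === 0) {
        throw new Error("Already on a team!");
    }
}

Even with a better schema, though, we’ve only sidestepped one particular race. The larger lesson still applies.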

Distributed systems are hard.

When building distributed systems, you need to be prepared to work with data that may be inconsistent or outdated. Conflicts should be an expected outcome, designed into the system and strategically planned for.

This is part of the reason I’ve gravitated towards an Event Sourcing approach for my latest project, Inject Detect.

Events can be ordered sequentially in your database, and you can make assertions (with the help of database indexing) that the event you’re inserting immediately follows the last event you’ve seen.
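
As a rough sketch of that idea (using a hypothetical Events collection and field names, not the actual Inject Detect implementation), a unique index over a per-stream sequence number lets the database itself reject conflicting appends:


// Hypothetical append-only events collection. The unique index means two
// servers can't both append event number N + 1 to the same stream.
Events._ensureIndex({ streamId: 1, sequence: 1 }, { unique: true });

function appendEvent(streamId, lastSeenSequence, type, data) {
    try {
        Events.insert({
            streamId: streamId,
            sequence: lastSeenSequence + 1,
            type: type,
            data: data
        });
    } catch (error) {
        // Someone else appended an event since we last read this stream.
        // Our assumptions are stale; re-read the stream and try again.
        throw new Error("Conflicting update detected!");
    }
}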

We’ll dive into the details surrounding this type of solution in future posts.

Final Thoughts

Wrapping up, I feel like this article ranks high in fear-mongering and low in actionable value. That definitely isn’t my intention.

My goal is to show that working with distributed systems is unexpectedly hard. The moment you add a second CPU or spin up a new server instance, you’re entering a brave new (but really, not so new) world of computing that requires you to more deeply consider every line of code you write.

I encourage you to re-examine projects and code you’ve written that exist in a distributed environment. Have you ever experienced strange bugs that you can’t explain? Are there any race conditions lurking there that you’ve never considered?

Is your current application ready to be scaled horizontally? Are you sure?

In the future, I hope to write more actionable articles about solving these kinds of problems. Stay tuned for future posts on how Event Sourcing can be used to write robust, conflict-free distributed systems!

GenServers and Memory Images: A Match Made in Heaven

Written by Pete Corey on Jun 19, 2017.

My current project, Inject Detect, is being built with Elixir and makes heavy use of Martin Fowler-style Memory Images. After working with this setup for several months, I’ve come to realize that Elixir GenServers and a Memory Image architecture are a match made in heaven.

Let’s dive into what Memory Images are, and why GenServers are the perfect tool for building out a Memory Image in your application.

What is a Memory Image?

In my opinion, the best introduction to the Memory Image concept is Martin Fowler’s article on the subject. If you haven’t, be sure to read through the article.

For brevity, I’ll try to summarize as quickly as possible. Martin comments that most developers’ first question when starting a new project is, “what database will I use?” Unfortunately, answering this question requires making many decisions upfront about things like data shape and usage patterns that are often unknowable at that stage.

Martin flips the question on its head. Instead of asking which database you should use, he suggests you ask yourself, “do I need a database at all?”

Mind blown.

The idea of a Memory Image is to keep the entire state of your application entirely within your server’s memory, rather than keeping it in a database. At first, this seems absurd. In reality, it actually works very well for many projects.

I’ll defer an explanation of the pros, cons, and my experiences with Memory Images to a later post. Instead of going down that rabbit hole, let’s take a look at how we can efficiently implement a Memory Image in Elixir!

Backed By an Event Log

The notion that a Memory Image architecture doesn’t use a database at all isn’t entirely true. In Inject Detect, I use a database to persist a log of events that describe all changes that have happened to the system since the beginning of time.

This event log isn’t particularly useful in its raw format. It can’t be queried in any meaningful way, and it can’t be used to make decisions about the current state of the system.

To get something more useful out of the system, the event log needs to be replayed. Each event affects the system’s state in some known way. By replaying these events and their corresponding effects in order, we can rebuild the current state of the system. We effectively reduce all of the events in our event log down into the current state of our system.

This is Event Sourcing.


We can implement this kind of simplified Event Sourced system fairly easily:


defmodule State do

  import Ecto.Query

  def get do
    InjectDetect.Model.Event
    |> order_by([event], event.id)
    |> InjectDetect.Repo.all
    |> Enum.to_list
    |> Enum.map(&struct(String.to_atom(&1.type), &1.data))
    |> Enum.reduce(%{}, &State.Reducer.apply/2)
  end

end

Each event in our event log has a type field that points to a specific event struct in our application (like SignedUp), and a data field that holds a map of all the information required to replay the effects of that event on the system.

For example, a SignedUp event might look like this when saved to the database:


%{id: 123, type: "SignedUp", data: %{"email" => "user@example.com"}}

To get the current state of the system, we grab all events in our event log, convert them into structs, and then reduce them down into a single state object by applying their changes, one after the other, using the apply function of our State.Reducer protocol, which all event structs are required to implement.
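
As a concrete (and hypothetical) sketch, here’s what a bare-bones State.Reducer protocol might look like, along with an implementation for our SignedUp event. The :users list and the shape of the state map are assumptions made for illustration, not code from Inject Detect:


# A minimal sketch of the State.Reducer protocol and one implementation.
defprotocol State.Reducer do
  @doc "Applies a single event to the current state, returning the new state."
  def apply(event, state)
end

defmodule SignedUp do
  defstruct [:email]
end

defimpl State.Reducer, for: SignedUp do
  def apply(%SignedUp{email: email}, state) do
    # Add a bare-bones user map to the state's list of users.
    users = Map.get(state, :users, [])
    Map.put(state, :users, [%{email: email} | users])
  end
end

Because protocols dispatch on the type of their first argument, Enum.reduce(events, state, &State.Reducer.apply/2) automatically picks the right implementation for each event struct as it folds the log into a single state map.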

While this is a fairly simple concept, it’s obviously inefficient. Imagine having to process your entire event log every time you want to inspect the state of your system!

There has to be a better way.

GenServer, Meet Memory Image

Memory Image, meet GenServer.

Rather than reprocessing our entire event log every time we want to inspect our application’s state, what if we could just keep the application state in memory?

GenServers (and Elixir processes in general) are excellent tools for persisting state in memory. Let’s refactor our previous solution to calculate our application’s state and then store it in memory for future use.

To manage this, our GenServer will need to store two pieces of information. It will need to store the current state of the system, and the id of the last event that was processed. Initially, our current application state will be an empty map, and the last id we’ve seen will be 0:


  def start_link, do:
    GenServer.start_link(__MODULE__, { %{}, 0 }, name: __MODULE__)

Next, rather than fetching all events from our event log, we want to fetch only the events that have happened after the last event id that we’ve processed:


  defp get_events_since(id) do
    events = InjectDetect.Model.Event
    |> where([event], event.id > ^id)
    |> order_by([event], event.id)
    |> InjectDetect.Repo.all
    |> Enum.to_list
    {convert_to_structs(events), get_last_event_id(id, events)}
  end

This function returns a tuple containing the newly fetched events, along with the id of the last event in that list (or the id we passed in, if there were no new events).

When get_events_since is first called, it will return all events currently in the event log. Any subsequent calls will only return the events that have happened after the last event we’ve processed. Because we’re storing the system’s state in our GenServer, we can apply these new events to the old state to get the new current state of the system.

Tying these pieces together, we get something like this:


defmodule State do
  use GenServer

  import Ecto.Query

  def start_link, do: 
    GenServer.start_link(__MODULE__, { %{}, 0 }, name: __MODULE__)
 
  def get, do: 
    GenServer.call(__MODULE__, :get)

  def convert_to_structs(events), do: 
    Enum.map(events, &struct(String.to_atom(&1.type), &1.data))

  def get_last_event_id(id, events) do
    case List.last(events) do
      nil   -> id
      event -> event.id
    end
  end

  defp get_events_since(id) do
    events = InjectDetect.Model.Event
    |> where([event], event.id > ^id)
    |> order_by([event], event.id)
    |> InjectDetect.Repo.all
    |> Enum.to_list
    {convert_to_structs(events), get_last_event_id(id, events)}
  end

  def handle_call(:get, _, {state, last_id}) do
    {events, last_id} = get_events_since(last_id)
    state = Enum.reduce(events, state, &State.Reducer.apply/2)
    {:reply, {:ok, state}, {state, last_id}}
  end

end

At first this solution may seem complicated, but when we break it down, there’s not a whole lot going on.

Our State GenServer stores:

  1. The current state of the system.
  2. The id of the last event it has processed.

Whenever we call State.get, it checks for new events in the event log and applies them, in order, to the current state. The GenServer saves this state and the id of the last new event and then replies with the new state.
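
Using it might look something like this (a sketch that assumes the State module above is compiled alongside our event structs and started manually or under a supervisor):


# Start the State process (normally this would live in a supervision tree).
{:ok, _pid} = State.start_link

# Ask for the current application state. The first call replays the whole
# event log; later calls only apply events we haven't seen yet.
{:ok, state} = State.get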

That’s it!

Final Thoughts

Elixir GenServers and Memory Images really are a match made in heaven. When working with these tools and techniques, it honestly feels like solutions effortlessly fall into place.

The Memory Image architecture, especially when combined with Event Sourcing, perfectly lends itself to a functional approach. Additionally, using GenServers to implement these ideas opens the doors to building fast, efficient, fault-tolerant, and consistent distributed systems with ease.

While Memory Images are an often overlooked solution to the problem of maintaining state, the flexibility and speed they bring to the table should make them serious contenders in your next project.