Do you want to pick up from where you left of?
Take me there

Erlang Term Storage (ETS)

Erlang Term Storage, commonly referred to as ETS, is a powerful storage engine built into OTP and available to use in Elixir. In this lesson we’ll look at how to interface with ETS and how it can be employed in our applications.

Overview

ETS is a robust in-memory store for Elixir and Erlang objects that comes included. ETS is capable of storing large amounts of data and offers constant time data access.

Tables in ETS are created and owned by individual processes. When an owner process terminates, its tables are destroyed. You can have as many ETS table as you want, the only limit is the server memory. A limit can be specified using the ERL_MAX_ETS_TABLES environment variable.

Creating Tables

Tables are created with new/2, which accepts a table name, and a set of options, and returns a table identifier that we can use in subsequent operations.

For our example we’ll create a table to store and look up users by their nickname:

iex> table = :ets.new(:user_lookup, [:set, :protected])
8212

Much like GenServers, there is a way to access ETS tables by name rather than identifier. To do this we need to include the :named_table option. Then we can access our table directly by name:

iex> :ets.new(:user_lookup, [:set, :protected, :named_table])
:user_lookup

Table Types

There are four types of tables available in ETS:

Access Controls

Access control in ETS is similar to access control within modules:

Race Conditions

If more than one process can write to a table - whether via :public access or by messages to the owning process - race conditions are possible. For example, two processes each read a counter value of 0, increment it, and write 1; the end result reflects only a single increment.

For counters specifically, :ets.update_counter/3 provides for atomic update-and-read. For other cases, it may be necessary for the owner process to perform custom atomic operations in response to messages, such as “add this value to the list at key :results“.

Inserting data

ETS has no schema. The only limitation is that data must be stored as a tuple whose first element is the key. To add new data we can use insert/2:

iex> :ets.insert(:user_lookup, {"doomspork", "Sean", ["Elixir", "Ruby", "Java"]})
true

When we use insert/2 with a set or ordered_set existing data will be replaced. To avoid this there is insert_new/2 which returns false for existing keys:

iex> :ets.insert_new(:user_lookup, {"doomspork", "Sean", ["Elixir", "Ruby", "Java"]})
false
iex> :ets.insert_new(:user_lookup, {"3100", "", ["Elixir", "Ruby", "JavaScript"]})
true

Data Retrieval

ETS offers us a few convenient and flexible ways to retrieve our stored data. We’ll look at how to retrieve data by key and through different forms of pattern matching.

The most efficient, and ideal, retrieval method is key lookup. While useful, matching iterates through the table and should be used sparingly especially for very large data sets.

Key Lookup

Given a key, we can use lookup/2 to retrieve all records with that key:

iex> :ets.lookup(:user_lookup, "doomspork")
[{"doomspork", "Sean", ["Elixir", "Ruby", "Java"]}]

Simple Matches

ETS was built for Erlang, so be warned that match variables may feel a little clunky.

To specify a variable in our match we use the atoms :"$1", :"$2", :"$3", and so on. The variable number reflects the result position and not the match position. For values we’re not interested in, we use the :_ variable.

Values can also be used in matching, but only variables will be returned as part of our result. Let’s put it all together and see how it works:

iex> :ets.match(:user_lookup, {:"$1", "Sean", :_})
[["doomspork"]]

Let’s look at another example to see how variables influence the resulting list order:

iex> :ets.match(:user_lookup, {:"$99", :"$1", :"$3"})
[["Sean", ["Elixir", "Ruby", "Java"], "doomspork"],
 ["", ["Elixir", "Ruby", "JavaScript"], "3100"]]

What if we want our original object, not a list? We can use match_object/2, which regardless of variables returns our entire object:

iex> :ets.match_object(:user_lookup, {:"$1", :_, :"$3"})
[{"doomspork", "Sean", ["Elixir", "Ruby", "Java"]},
 {"3100", "", ["Elixir", "Ruby", "JavaScript"]}]

iex> :ets.match_object(:user_lookup, {:_, "Sean", :_})
[{"doomspork", "Sean", ["Elixir", "Ruby", "Java"]}]

Advanced Lookup

We learned about simple match cases but what if we want something more akin to an SQL query? Thankfully there is a more robust syntax available to us. To lookup our data with select/2 we need to construct a list of tuples with arity 3. These tuples represent our pattern, zero or more guards, and a return value format.

Our match variables and two new variables, :"$$" and :"$_", can be used to construct the return value. These new variables are shortcuts for the result format; :"$$" gets results as lists and :"$_" gets the original data objects.

Let’s take one of our previous match/2 examples and turn it into a select/2:

iex> :ets.match_object(:user_lookup, {:"$1", :_, :"$3"})
[{"doomspork", "Sean", ["Elixir", "Ruby", "Java"]},
 {"3100", "", ["Elixir", "Ruby", "JavaScript"]}]

iex> :ets.select(:user_lookup, [{{:"$1", :_, :"$3"}, [], [:"$_"]}])
[{"doomspork", "Sean", ["Elixir", "Ruby", "Java"]},
 {"3100", "", ["Elixir", "Ruby", "JavaScript"]}]

Although select/2 allows for finer control over what and how we retrieve records, the syntax is quite unfriendly and will only become more so. To handle this the ETS module includes fun2ms/1, which turns the functions into match_specs. With fun2ms/1 we can create queries using a familiar function syntax.

Let’s use fun2ms/1 and select/2 to find all usernames with more than 2 languages:

iex> fun = :ets.fun2ms(fn {username, _, langs} when length(langs) > 2 -> username end)
{% raw %}[{{:"$1", :_, :"$2"}, [{:>, {:length, :"$2"}, 2}], [:"$1"]}]{% endraw %}

iex> :ets.select(:user_lookup, fun)
["doomspork", "3100"]

Want to learn more about the match specification? Check out the official Erlang documentation for match_spec.

Deleting Data

Removing Records

Deleting terms is as straightforward as insert/2 and lookup/2. With delete/2 we only need our table and the key. This deletes both the key and its values:

iex> :ets.delete(:user_lookup, "doomspork")
true

Removing Tables

ETS tables are not garbage collected unless the parent is terminated. Sometimes it may be necessary to delete an entire table without terminating the owner process. For this we can use delete/1:

iex> :ets.delete(:user_lookup)
true

Example ETS Usage

Given what we’ve learned above, let’s put everything together and build a simple cache for expensive operations. We’ll implement a get/4 function to take a module, function, arguments, and options. For now the only option we’ll worry about is :ttl.

For this example we’re assuming the ETS table has been created as part of another process, such as a supervisor:

defmodule SimpleCache do
  @moduledoc """
  A simple ETS based cache for expensive function calls.
  """

  @doc """
  Retrieve a cached value or apply the given function caching and returning
  the result.
  """
  def get(mod, fun, args, opts \\ []) do
    case lookup(mod, fun, args) do
      nil ->
        ttl = Keyword.get(opts, :ttl, 3600)
        cache_apply(mod, fun, args, ttl)

      result ->
        result
    end
  end

  @doc """
  Lookup a cached result and check the freshness
  """
  defp lookup(mod, fun, args) do
    case :ets.lookup(:simple_cache, [mod, fun, args]) do
      [result | _] -> check_freshness(result)
      [] -> nil
    end
  end

  @doc """
  Compare the result expiration against the current system time.
  """
  defp check_freshness({mfa, result, expiration}) do
    cond do
      expiration > :os.system_time(:seconds) -> result
      :else -> nil
    end
  end

  @doc """
  Apply the function, calculate expiration, and cache the result.
  """
  defp cache_apply(mod, fun, args, ttl) do
    result = apply(mod, fun, args)
    expiration = :os.system_time(:seconds) + ttl
    :ets.insert(:simple_cache, {[mod, fun, args], result, expiration})
    result
  end
end

To demonstrate the cache we’ll use a function that returns the system time and a TTL of 10 seconds. As you’ll see in the example below, we get the cached result until the value has expired:

defmodule ExampleApp do
  def test do
    :os.system_time(:seconds)
  end
end

iex> :ets.new(:simple_cache, [:named_table])
:simple_cache
iex> ExampleApp.test
1451089115
iex> SimpleCache.get(ExampleApp, :test, [], ttl: 10)
1451089119
iex> ExampleApp.test
1451089123
iex> ExampleApp.test
1451089127
iex> SimpleCache.get(ExampleApp, :test, [], ttl: 10)
1451089119

After 10 seconds if we try again we should get a fresh result:

iex> ExampleApp.test
1451089131
iex> SimpleCache.get(ExampleApp, :test, [], ttl: 10)
1451089134

As you see we are able to implement a scalable and fast cache without any external dependencies and this is only one of many uses for ETS.

Disk-based ETS

We now know ETS is for in-memory term storage but what if we need disk-based storage? For that we have Disk Based Term Storage, or DETS for short. The ETS and DETS APIs are interchangeable with the exception of how tables are created. DETS relies on open_file/2 and doesn’t require the :named_table option:

iex> {:ok, table} = :dets.open_file(:disk_storage, [type: :set])
{:ok, :disk_storage}
iex> :dets.insert_new(table, {"doomspork", "Sean", ["Elixir", "Ruby", "Java"]})
true
iex> select_all = :ets.fun2ms(&(&1))
[{:"$1", [], [:"$1"]}]
iex> :dets.select(table, select_all)
[{"doomspork", "Sean", ["Elixir", "Ruby", "Java"]}]

If you exit iex and look in your local directory, you’ll see a new file disk_storage:

$ ls | grep -c disk_storage
1

One last thing to note is that DETS does not support ordered_set like ETS, only set, bag, and duplicate_bag.

Caught a mistake or want to contribute to the lesson? Edit this lesson on GitHub!