
Instrumenting Phoenix with Telemetry Part IV: Erlang VM Measurements with telemetry_poller

By Sophie DeBenedetto | Posted 2020-05-13

In this series, we’re instrumenting a Phoenix app and sending metrics to StatsD with the help of Elixir and Erlang’s Telemetry offerings. In Part IV we’ll incorporate Erlang’s telemetry_poller library into our Phoenix app so that we can observe and report on Erlang VM Telemetry events.


Intro

In the previous post we used Telemetry.Metrics to define metrics for a number of out-of-the-box Phoenix and Ecto Telemetry events and used TelemetryMetricsStatsd to handle and report those events to StatsD.

In this post, we’ll incorporate Erlang’s telemetry_poller library into our Phoenix app so that we can observe and report on Erlang VM Telemetry events.

Getting Started

You can follow along with this tutorial by cloning down the repo here.

Overview

In order to report on Erlang VM measurements as metrics, we will:

  • Install the telemetry_poller dependency
  • Define metrics for telemetry_poller Telemetry events using Telemetry.Metrics
  • That’s it!

Step 1: Installing telemetry_poller

First, we’ll include the telemetry_poller dependency in our app and run mix deps.get:

# mix.exs
defp deps do
  [
    {:telemetry_poller, "~> 0.4"}
  ]
end

Step 2: Defining Metrics for telemetry_poller Events

telemetry_poller Telemetry Events

When our app starts up, the telemetry_poller app will also start running. It will poll the Erlang VM at a regular interval, take the following measurements, and execute them as Telemetry events:

  • Memory - Measurement of the memory used by the Erlang VM

  • Total run queue lengths - Measurement of the queue of tasks to be scheduled by the Erlang scheduler. This event will be executed with a measurement map describing:

    • total - a sum of all run queue lengths
    • cpu - a sum of CPU schedulers’ run queue lengths
    • io - length of dirty IO run queue. It’s always 0 if running on Erlang versions prior to 20.
  • System count - Measurement of the number of processes, atoms, and ports currently existing on the local node


  • Process info - A measurement with information about a given process, for example a worker in your application
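
Before defining any metrics, we can verify that these events are firing by attaching a quick debug handler in an iex -S mix session. This is just an illustrative sketch; the "debug-vm-memory" handler ID and the inspect-only handler function are arbitrary choices, not part of our app:

# in an iex -S mix session
:telemetry.attach(
  "debug-vm-memory",   # arbitrary handler ID
  [:vm, :memory],      # event name executed by telemetry_poller
  fn _event, measurements, _metadata, _config ->
    # print the total memory measurement each time the event fires
    IO.inspect(measurements.total, label: "vm.memory total")
  end,
  nil
)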

Let’s define metrics for some of these events in our Quantum.Telemetry module.

Defining Our Metrics

The Telemetry.Metrics.last_value/2 function defines a metric that holds the value of the selected measurement from the most recent event. The TelemetryMetricsStatsd reporter will send such a metric to StatsD as a “gauge” metric. Let’s define a set of gauge metrics for some of the Telemetry events mentioned above:

# lib/quantum/telemetry.ex

defp metrics do
  [
    # VM Metrics - gauge
    last_value("vm.memory.total", unit: :byte),
    last_value("vm.total_run_queue_lengths.total"),
    last_value("vm.total_run_queue_lengths.cpu"),
    last_value("vm.system_counts.process_count")
  ]
end
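
As a reminder from Part III, this metrics/0 function only takes effect because it is passed to the TelemetryMetricsStatsd reporter when our telemetry supervisor starts up. A rough sketch of that wiring, assuming the Quantum.Telemetry supervisor established in the earlier posts, looks something like this:

# lib/quantum/telemetry.ex
defmodule Quantum.Telemetry do
  use Supervisor
  import Telemetry.Metrics

  def start_link(arg) do
    Supervisor.start_link(__MODULE__, arg, name: __MODULE__)
  end

  def init(_arg) do
    children = [
      # the reporter attaches handlers for each metric definition
      # and ships the resulting metrics to StatsD
      {TelemetryMetricsStatsd, metrics: metrics()}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end

  # metrics/0 as defined above
end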

Now, when telemetry_poller executes the corresponding events, we will see the following metrics sent to StatsD:

gauges: {
  'vm.memory.total': 49670008,
  'vm.total_run_queue_lengths.total': 0,
  'vm.total_run_queue_lengths.cpu': 0,
  'vm.system_counts.process_count': 366
}

And that’s it! Before we wrap up, let’s take a look under the hood of the telemetry_poller library.

telemetry_poller Under The Hood

Taking a look at some source code, we can see exactly how telemetry_poller is executing these events.

The memory/0 Function

The memory/0 function grabs memory measurements with a call to erlang:memory/0 and passes those measurements to the call to telemetry:execute/3 as the measurements map:

% telemetry_poller/src/telemetry_poller_builtin.erl

memory() ->
    Measurements = erlang:memory(),
    telemetry:execute([vm, memory], maps:from_list(Measurements), #{}).

Let’s break this down further. We can examine the measurements returned from the call to erlang:memory() by trying it out ourselves in iex:

iex(1)> :erlang.memory()
[
  total: 28544704,
  processes: 5268240,
  processes_used: 5267272,
  system: 23276464,
  atom: 339465,
  atom_used: 317752,
  binary: 58656,
  code: 5688655,
  ets: 515456
]

We can see that it contains a key of :total, pointing to the total amount of memory allocated to the Erlang VM.

Thus, a Telemetry event is executed with the name [vm, memory] and a set of measurements including this total. When we invoke our Telemetry.Metrics.last_value/2 function, we are telling our reporter, TelemetryMetricsStatsd, to attach a handler for this event and to respond to it by constructing a gauge metric with the value of the :total key included in the provided measurements:

# lib/quantum/telemetry.ex

defp metrics do
  [
    last_value("vm.memory.total", unit: :byte)
  ]
end

The total_run_queue_lengths/0 Function

The total_run_queue_lengths/0 function measures the total VM run queue length, as well as the total CPU schedulers’ run queue length and passes those measurements to a call to telemetry:execute/3:

% telemetry_poller/src/telemetry_poller_builtin.erl

total_run_queue_lengths() ->
    Total = cpu_stats(total),
    CPU = cpu_stats(cpu),
    telemetry:execute([vm, total_run_queue_lengths], #{
        total => Total,
        cpu => CPU,
        io => Total - CPU},
        #{}).

To observe this event, we are specifying that our Telemetry pipeline attach a handler for the [vm, total_run_queue_lengths] event and construct two gauge metrics for every such event that is executed: one with the value of the total measurement and one with the value of the cpu measurement:

# lib/quantum/telemetry.ex

defp metrics do
  [
    last_value("vm.total_run_queue_lengths.total"),
    last_value("vm.total_run_queue_lengths.cpu")
  ]
end

The system_counts/0 Function

The system_counts/0 function takes measurements including the total process count and port count and executes a Telemetry event with these measurements via a call to telemetry:execute/2:

% telemetry_poller/src/telemetry_poller_builtin.erl

system_counts() ->
    ProcessCount = erlang:system_info(process_count),
    PortCount = erlang:system_info(port_count),
    telemetry:execute([vm, system_counts], #{
        process_count => ProcessCount,
        port_count => PortCount
    }).

To observe this event, we are specifying that our Telemetry pipeline attach a handler for the [vm, system_counts] event and construct a gauge metric with the value of the process_count measurement for every such event:

# lib/quantum/telemetry.ex

defp metrics do
  [
    last_value("vm.system_counts.process_count")
  ]
end

Polling for Custom Measurements

You can also use the telemetry_poller library to emit measurements describing custom processes or workers running in your app, or to take and emit any other custom measurements on a set interval. See the docs here for more info.
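
As a rough illustration, starting a custom poller from your application's supervision tree might look something like the following. The Quantum.Measurements module, its dispatch_user_count/0 function, the [:quantum, :users] event name, and the ten-second period are all hypothetical examples, not part of our app:

# lib/quantum/application.ex (sketch)
children = [
  # poll our custom measurement function every ten seconds
  {:telemetry_poller,
   measurements: [
     {Quantum.Measurements, :dispatch_user_count, []}
   ],
   period: :timer.seconds(10)}
]

# lib/quantum/measurements.ex (hypothetical)
defmodule Quantum.Measurements do
  def dispatch_user_count do
    # a stand-in value; a real implementation would compute something meaningful
    count = 42
    :telemetry.execute([:quantum, :users], %{count: count}, %{})
  end
end

We could then define a metric for this event, for example last_value("quantum.users.count"), just as we did for the VM events above.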

Conclusion

Once again we’ve seen that Erlang and Elixir’s family of Telemetry libraries make it easy for us to achieve fairly comprehensive instrumentation with very little hand-rolled code. By adding the telemetry_poller library to our dependencies, we’re ensuring our application will execute a set of Telemetry events describing Erlang VM measurements at regular intervals. We’re observing these events, formatting them and sending them to StatsD with the help of Telemetry.Metrics and TelemetryMetricsStatsd, allowing us to paint an even fuller picture of the state of our Phoenix app at any given time.