Instrumenting Phoenix with Telemetry Part IV: Erlang VM Measurements with telemetry_poller
By Sophie DeBenedetto | Posted 2020-05-13
In this series, we’re instrumenting a Phoenix app and sending metrics to StatsD with the help of Elixir and Erlang’s Telemetry offerings.
In Part IV, we’ll incorporate Erlang’s telemetry_poller
library into our Phoenix app so that we can observe and report on Erlang VM Telemetry events.
Table Of Contents
- Part I: Telemetry Under The Hood
- Part II: Handling Telemetry Events with TelemetryMetrics + TelemetryMetricsStatsd
- Part III: Observing Phoenix + Ecto Telemetry Events
- Part IV: Erlang VM Measurements with telemetry_poller, TelemetryMetrics + TelemetryMetricsStatsd
Intro
In the previous post we used Telemetry.Metrics
to define metrics for a number of out-of-the-box Phoenix and Ecto Telemetry events and used TelemetryMetricsStatsd
to handle and report those events to StatsD.
In this post, we’ll incorporate Erlang’s telemetry_poller
library into our Phoenix app so that we can observe and report on Erlang VM Telemetry events.
Getting Started
You can follow along with this tutorial by cloning down the repo here.
- Check out the starting state of our code on the branch part-4-start
- Find the solution code on the branch part-4-solution
Overview
In order to report on Erlang VM measurements as metrics, we will:
- Install the telemetry_poller dependency
- Define metrics for telemetry_poller Telemetry events using Telemetry.Metrics
- That’s it!
Step 1: Installing telemetry_poller
First, we’ll include the telemetry_poller dependency in our app and run mix deps.get:
# mix.exs
# Add :telemetry_poller to your existing list of dependencies
defp deps do
  [
    {:telemetry_poller, "~> 0.4"}
  ]
end
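As an aside, the default poller that telemetry_poller starts can typically be tuned or disabled through application config. Here is a minimal sketch, assuming the :default application-environment key described in the telemetry_poller docs; the exact option names may vary between versions:
# config/config.exs (assumed configuration, not required for this tutorial)
# Change how often the default poller takes its VM measurements (in milliseconds)
config :telemetry_poller, :default, period: 10_000

# Or opt out of the default VM measurements entirely
# config :telemetry_poller, :default, false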
Step 2: Defining Metrics for telemetry_poller Events
telemetry_poller Telemetry Events
When our app starts up, the telemetry_poller app will also start running. This app polls the Erlang VM for the following measurements and executes each one as a Telemetry event:
- Memory - Measurement of the memory used by the Erlang VM
- Total run queue lengths - Measurement of the queue of tasks to be scheduled by the Erlang scheduler. This event will be executed with a measurement map describing:
  - total - a sum of all run queue lengths
  - cpu - a sum of CPU schedulers’ run queue lengths
  - io - length of the dirty IO run queue. It’s always 0 if running on Erlang versions prior to 20.
- System counts - Measurement of the number of processes, atoms, and ports currently existing at the local node
- Process info - A measurement with information about a given process, for example a worker in your application
Let’s define metrics for some of these events in our Quantum.Telemetry
module.
Defining Our Metrics
The Telemetry.Metrics.last_value/2
function defines a metric that holds the value of the selected measurement from the most recent event. The TelemetryMetricsStatsd
reporter will send such a metric to StatsD as a “gauge” metric. Let’s define a set of gauge metrics for some of the Telemetry events mentioned above:
# lib/quantum/telemetry.ex
defp metrics do
[
# VM Metrics - gauge
last_value("vm.memory.total", unit: :byte),
last_value("vm.total_run_queue_lengths.total"),
last_value("vm.total_run_queue_lengths.cpu"),
last_value("vm.system_counts.process_count")
]
end
Now, when telemetry_poller executes the corresponding events, we will see the following metrics sent to StatsD:
gauges: {
'vm.memory.total': 49670008,
'vm.total_run_queue_lengths.total': 0,
'vm.total_run_queue_lengths.cpu': 0,
'vm.system_counts.process_count': 366
}
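For context, here’s roughly how these metric definitions plug into the reporter we set up in Part II. The module below is a sketch of our Quantum.Telemetry supervisor; the important piece is that metrics/0 is passed to the TelemetryMetricsStatsd child:
# lib/quantum/telemetry.ex (sketch, assuming the supervisor from Part II)
defmodule Quantum.Telemetry do
  use Supervisor
  import Telemetry.Metrics

  def start_link(arg) do
    Supervisor.start_link(__MODULE__, arg, name: __MODULE__)
  end

  def init(_arg) do
    children = [
      # Hand our metric definitions to the StatsD reporter
      {TelemetryMetricsStatsd, metrics: metrics()}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end

  defp metrics do
    [
      # VM Metrics - gauge
      last_value("vm.memory.total", unit: :byte),
      last_value("vm.total_run_queue_lengths.total"),
      last_value("vm.total_run_queue_lengths.cpu"),
      last_value("vm.system_counts.process_count")
    ]
  end
end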
And that’s it! Before we wrap up, let’s take a look under the hood of the telemetry_poller
library.
telemetry_poller Under The Hood
Taking a look at some source code, we can see exactly how telemetry_poller
is executing these events.
The memory/0 Function
The memory/0
function grabs memory measurements with a call to erlang:memory/0
and passes those measurements to the call to telemetry:execute/3
as the measurements map:
% telemetry_poller/src/telemetry_poller_builtin.erl
memory() ->
Measurements = erlang:memory(),
telemetry:execute([vm, memory], maps:from_list(Measurements), #{}).
Let’s break this down further. We can examine the measurements returned from the call to erlang:memory() by trying it out ourselves in iex:
iex(1)> :erlang.memory()
[
total: 28544704,
processes: 5268240,
processes_used: 5267272,
system: 23276464,
atom: 339465,
atom_used: 317752,
binary: 58656,
code: 5688655,
ets: 515456
]
We can see that it contains a key of :total, pointing to the total amount of memory allocated to the Erlang VM.
Thus, a Telemetry event is executed with the name [vm, memory]
and a set of measurements including this total. When we invoke our Telemetry.Metrics.last_value/2
function, we are telling our reporter, TelemetryStatsD
, to attach a handler for this event and to respond to it by constructing a gauge metric with the value of the :total
key included in the provided measurements:
# lib/quantum/telemetry.ex
defp metrics do
[
last_value("vm.memory.total", unit: :byte)
]
end
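If you want to confirm the [:vm, :memory] event is firing without going through StatsD, you can attach a quick throwaway handler in an iex session. This is just a debugging sketch; the handler id "debug-vm-memory" is an arbitrary name:
# iex sketch: print the total each time telemetry_poller executes [:vm, :memory]
:telemetry.attach(
  "debug-vm-memory",
  [:vm, :memory],
  fn _event_name, measurements, _metadata, _config ->
    IO.inspect(measurements[:total], label: "vm.memory.total")
  end,
  nil
)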
The total_run_queue_lengths/0 Function
The total_run_queue_lengths/0
function measures the total VM run queue length, as well as the total CPU schedulers’ run queue length and passes those measurements to a call to telemetry:execute/3
:
% telemetry_poller/src/telemetry_poller_builtin.erl
total_run_queue_lengths() ->
Total = cpu_stats(total),
CPU = cpu_stats(cpu),
telemetry:execute([vm, total_run_queue_lengths], #{
total => Total,
cpu => CPU,
io => Total - CPU},
#{}).
To observe this event, we are specifying that our Telemetry pipeline attach a handler for the [vm, total_run_queue_lengths] event and construct two gauge metrics for every such event that is executed, one with the value of the total measurement and one with the value of the cpu measurement:
# lib/quantum/telemetry.ex
defp metrics do
[
last_value("vm.total_run_queue_lengths.total"),
last_value("vm.total_run_queue_lengths.cpu")
]
end
The system_counts/0 Function
The system_counts/0 function takes measurements including the total process count and executes a Telemetry event with these measurements via a call to telemetry:execute/2:
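% telemetry_poller/src/telemetry_poller_builtin.erl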
system_counts() ->
ProcessCount = erlang:system_info(process_count),
PortCount = erlang:system_info(port_count),
telemetry:execute([vm, system_counts], #{
process_count => ProcessCount,
port_count => PortCount
}).
To observe this event, we are specifying that our Telemetry pipeline attach a handler for the [vm, system_counts]
event and construct a gauge metric with the value of the process_count
measurement for every such event:
# lib/quantum/telemetry.ex
defp metrics do
[
last_value("vm.system_counts.process_count")
]
end
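The event also carries a port_count measurement, so if we wanted to watch the number of open ports as well, we could define one more gauge alongside it:
# lib/quantum/telemetry.ex
defp metrics do
  [
    last_value("vm.system_counts.process_count"),
    # Also track the number of ports reported by the same event
    last_value("vm.system_counts.port_count")
  ]
end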
Polling for Custom Measurements
You can also use the telemetry_poller library to take periodic measurements of custom processes or workers running in your app, or to emit your own custom measurements, as sketched below. See the docs here for more info.
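As a rough sketch of what that can look like, we could start our own poller under our application’s supervision tree and point it at a function that executes a custom Telemetry event. The Quantum.Measurements module, the dispatch_session_count/0 function, the Quantum.Sessions.count/0 helper, and the event name below are all hypothetical, made up for illustration:
# lib/quantum/application.ex (sketch)
# Add a custom poller alongside our existing children
children = [
  {:telemetry_poller,
   measurements: [
     {Quantum.Measurements, :dispatch_session_count, []}
   ],
   period: 10_000,
   name: :quantum_poller}
]

# lib/quantum/measurements.ex (sketch)
defmodule Quantum.Measurements do
  # Called by the poller every 10 seconds; executes a custom Telemetry event
  def dispatch_session_count do
    # Hypothetical helper that counts active sessions in our app
    count = Quantum.Sessions.count()
    :telemetry.execute([:quantum, :sessions, :count], %{count: count}, %{})
  end
end
We could then define a metric like last_value("quantum.sessions.count") for this event, exactly as we did for the VM events above.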
Conclusion
Once again we’ve seen that Erlang and Elixir’s family of Telemetry libraries make it easy for us to achieve fairly comprehensive instrumentation with very little hand-rolled code. By adding the telemetry_poller
library to our dependencies, we’re ensuring our application will execute a set of Telemetry events describing Erlang VM measurements at regular intervals. We’re observing these events, formatting them and sending them to StatsD with the help of Telemetry.Metrics
and TelemetryMetricsStatsd
, allowing us to paint an even fuller picture of the state of our Phoenix app at any given time.