Perfetto Tracing

Mesa has experimental support for Perfetto for GPU performance monitoring. Perfetto supports multiple producers, each with one or more data-sources. Perfetto already provides various producers and data-sources for things like:

  • CPU scheduling events (linux.ftrace)

  • CPU frequency scaling (linux.ftrace)

  • System calls (linux.ftrace)

  • Process memory utilization (linux.process_stats)

Perfetto also provides various domain-specific producers.

Mesa's Perfetto support adds further producers so that GPU performance data (frequency, utilization, performance counters, etc.) can be visualized on the same timeline, which makes it easier to understand, tune, and debug system-level performance:

  • pps-producer: A system-wide daemon that can collect global performance counters.

  • mesa: Per-process producer within Mesa to capture render-stage traces on the GPU timeline, track events on the CPU timeline, etc.

The exact supported features vary per driver:

Supported data-sources

Driver      PPS Counters            Render Stages
---------   ---------------------   ----------------------
Freedreno   gpu.counters.msm        gpu.renderstages.msm
Turnip      gpu.counters.msm        gpu.renderstages.msm
Intel       gpu.counters.i915       gpu.renderstages.intel
Panfrost    gpu.counters.panfrost
V3D         gpu.counters.v3d

Run

To capture a trace with Perfetto you need to take the following steps:

  1. Build Perfetto from sources available at subprojects/perfetto following this guide.

  2. Create a trace config, which is a text file in protobuf text format with extension .cfg, or use one of the config files under the src/tool/pps/cfg directory. More examples of config files can be found in subprojects/perfetto/test/configs.

  3. Change directory to subprojects/perfetto and run a convenience script to start the tracing service:

    cd subprojects/perfetto
    CONFIG=<path/to/gpu.cfg> OUT=out/linux_clang_release ./tools/tmux -n
    
  4. Start other producers you may need, e.g. pps-producer.

  5. Start perfetto under the tmux session initiated in step 3.

  6. Once tracing has finished, you can detach from tmux with Ctrl+b, d, and the convenience script should automatically copy the trace files into $HOME/Downloads.

  7. Go to ui.perfetto.dev and upload $HOME/Downloads/trace.protobuf by clicking on Open trace file.

  8. Alternatively you can open the trace in AGI (which despite the name can be used to view non-Android traces).

To be more explicit, the following commands reproduce the steps above:

# Configure Mesa with perfetto
mesa $ meson . build -Dperfetto=true -Dvulkan-drivers=intel,broadcom -Dgallium-drivers=
# Build mesa
mesa $ meson compile -C build

# Within the Mesa repo, build perfetto
mesa $ cd subprojects/perfetto
perfetto $ ./tools/install-build-deps
perfetto $ ./tools/gn gen --args='is_debug=false' out/linux
perfetto $ ./tools/ninja -C out/linux

# Start perfetto
perfetto $ CONFIG=../../src/tool/pps/cfg/gpu.cfg OUT=out/linux/ ./tools/tmux -n

# In parallel from the Mesa repo, start the PPS producer
mesa $ ./build/src/tool/pps/pps-producer

# Back in the perfetto tmux, press enter to start the capture
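
Step 2 above asks for a trace config but does not show one. The following is a minimal sketch of such a file, written from the shell. The output path, buffer size, sampling period, and duration are illustrative assumptions, and the Intel data-source names are taken from the table above; substitute the names for your driver.

```shell
# Hypothetical output path; override with CFG=... if desired.
cfg="${CFG:-./my-gpu.cfg}"

cat > "$cfg" <<'EOF'
# Trace buffer shared by the data sources below (size is illustrative).
buffers {
  size_kb: 16384
  fill_policy: RING_BUFFER
}

# Global GPU counters collected by pps-producer.
data_sources {
  config {
    name: "gpu.counters.i915"
    gpu_counter_config {
      counter_period_ns: 100000
    }
  }
}

# Render-stage events emitted by the per-process mesa producer.
data_sources {
  config {
    name: "gpu.renderstages.intel"
  }
}

# Stop tracing after 16 seconds (illustrative).
duration_ms: 16000
EOF

echo "wrote $cfg"
```

The resulting file can then be passed to the convenience script through the CONFIG variable, in place of ../../src/tool/pps/cfg/gpu.cfg in the listing above.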

CPU Tracing

Mesa’s CPU tracepoints (MESA_TRACE_*) use Perfetto track events when Perfetto is enabled. They use the mesa.default and mesa.slow categories.

Currently, only EGL and the following drivers have CPU tracepoints:

  • Freedreno

  • V3D

  • VC4
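
To see these tracepoints in a capture, the trace config needs a track-event data source with the Mesa categories enabled. The sketch below assumes Perfetto's standard track_event data source and its enabled_categories field; the file name, buffer size, and duration are hypothetical.

```shell
# Hypothetical output path; override with CFG=... if desired.
cfg="${CFG:-./mesa-cpu.cfg}"

cat > "$cfg" <<'EOF'
buffers {
  size_kb: 16384
  fill_policy: RING_BUFFER
}

# Track events from the per-process mesa producer, limited to the
# two categories used by the MESA_TRACE_* tracepoints.
data_sources {
  config {
    name: "track_event"
    track_event_config {
      enabled_categories: "mesa.default"
      enabled_categories: "mesa.slow"
    }
  }
}

# Illustrative capture length.
duration_ms: 10000
EOF

echo "wrote $cfg"
```

Dropping the mesa.slow line is one way to keep the capture lighter if only the default tracepoints are of interest.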

Vulkan data sources

The Vulkan API gives the application control over recording of command buffers as well as when they are submitted to the hardware. As a consequence, we need to ensure command buffers are properly instrumented for the Perfetto driver data sources prior to Perfetto actually collecting traces.

This can be achieved by setting the MESA_GPU_TRACES environment variable before starting a Vulkan application:

MESA_GPU_TRACES=perfetto ./build/my_vulkan_app

Driver Specifics

The following sections give driver-specific instructions for the PPS producer.

Freedreno / Turnip

The Freedreno PPS driver needs root access to read system-wide performance counters, so you can simply run it with sudo:

sudo ./build/src/tool/pps/pps-producer

Intel

The Intel PPS driver needs root access to read system-wide RenderBasic performance counters, so you can simply run it with sudo:

sudo ./build/src/tool/pps/pps-producer

Another option, which allows access to system-wide data without root permissions, is to run the following:

sudo sysctl dev.i915.perf_stream_paranoid=0

Alternatively, granting the CAP_PERFMON capability to the binary should work too.

A particular metric set can also be selected to capture a different set of HW counters:
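
As a rough sketch of the choice described above, the following script reads the sysctl through its /proc/sys path and reports whether the producer needs sudo. The needs_root helper is hypothetical, and treating a missing file as "root required" is a conservative assumption rather than documented behavior.

```shell
# needs_root FILE: succeed (exit 0) when root is required, i.e. when the
# paranoid setting in FILE is anything other than 0.
needs_root() {
    paranoid_file=$1
    # Missing or unreadable file: assume root is required.
    [ -r "$paranoid_file" ] || return 0
    [ "$(cat "$paranoid_file")" != "0" ]
}

PRODUCER=./build/src/tool/pps/pps-producer
# File backing the dev.i915.perf_stream_paranoid sysctl.
SYSCTL_FILE=/proc/sys/dev/i915/perf_stream_paranoid

if needs_root "$SYSCTL_FILE"; then
    echo "run: sudo $PRODUCER"
else
    echo "run: $PRODUCER"
fi
```

The script only prints the command to use. It does not account for CAP_PERFMON, which would typically be granted with something like setcap cap_perfmon+ep on the binary (an assumption about your libcap version, not something this document confirms).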

INTEL_PERFETTO_METRIC_SET=RasterizerAndPixelBackend ./build/src/tool/pps/pps-producer

Vulkan applications can also be instrumented to be Perfetto producers. To enable this for a given application, set the environment variable as follows:

PERFETTO_TRACE=1 my_vulkan_app

Panfrost

The Panfrost PPS driver uses unstable ioctls that behave correctly on kernel version 5.4.23+ and 5.5.7+.

To run the producer, follow these two simple steps:

  1. Enable Panfrost unstable ioctls via kernel parameter:

    modprobe panfrost unstable_ioctls=1
    

    Alternatively you could add panfrost.unstable_ioctls=1 to your kernel command line, or echo 1 > /sys/module/panfrost/parameters/unstable_ioctls.

  2. Run the producer:

    ./build/src/tool/pps/pps-producer
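
Before starting the producer it can help to confirm that step 1 took effect. The sketch below reads the parameter back from sysfs; the unstable_ioctls_enabled helper is hypothetical, and accepting both 1 and Y is an assumption based on how boolean module parameters are usually reported.

```shell
# unstable_ioctls_enabled FILE: succeed when FILE holds a "true" value.
unstable_ioctls_enabled() {
    param_file=$1
    # Module not loaded or parameter not exposed: treat as disabled.
    [ -r "$param_file" ] || return 1
    case "$(cat "$param_file")" in
        1|Y|y) return 0 ;;
        *)     return 1 ;;
    esac
}

PARAM=/sys/module/panfrost/parameters/unstable_ioctls

if unstable_ioctls_enabled "$PARAM"; then
    echo "panfrost unstable ioctls are enabled"
else
    echo "panfrost unstable ioctls are not enabled; see step 1" >&2
fi
```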
    

Troubleshooting

Tmux

If the convenience script tools/tmux keeps copying artifacts to your SSH_TARGET without starting the tmux session, make sure tmux is installed on your system.

apt install tmux
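
A quick way to check for tmux before running the convenience script is sketched below. The have_cmd helper is hypothetical; command -v is the portable POSIX way to test whether a program is on PATH.

```shell
# have_cmd NAME: succeed when NAME resolves to a command on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd tmux; then
    echo "tmux found: $(command -v tmux)"
else
    echo "tmux not found; install it before running tools/tmux" >&2
fi
```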

Missing counter names

If the trace viewer shows a list of counters with a description like gpu_counter(#) instead of their proper names, data was probably lost because the trace buffer filled up and wrapped.

To prevent this loss of data you can tweak the trace config file in one of the following ways:

  • Increase the size of the buffer in use:

    buffers {
        size_kb: 2048,
        fill_policy: RING_BUFFER,
    }
    
  • Periodically flush the trace buffer into the output file:

    write_into_file: true
    file_write_period_ms: 250
    
  • Discard new traces when the buffer fills:

    buffers {
        size_kb: 2048,
        fill_policy: DISCARD,
    }