Message-ID: <e5149343-5a0f-4947-97ec-a61b741fadad@arm.com>
Date: Fri, 20 Sep 2024 10:43:31 +0100
From: Douglas Raillard <douglas.raillard@....com>
To: Linux Trace Devel <linux-trace-devel@...r.kernel.org>,
linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>
Subject: [ANNOUNCE] Rust libtraceevent/trace-cmd report
As promised to Steven Rostedt during the 2022 Tracing Summit, and after a few
months of dogfooding, we are pleased to announce Rust counterparts to
libtraceevent and trace-cmd report, plus parquet conversion.
https://gitlab.arm.com/tooling/lisa/-/tree/main/tools/trace-parser
git clone https://git.gitlab.arm.com/tooling/lisa.git --depth=1
trace-tools
===========
CLI utility akin to trace-cmd report. Bear in mind that this is intended to be
an internal utility for LISA's Trace() Python API. As such, the CLI itself is
unstable (but that could change if people express interest).
You can test it very easily on your trace using the precompiled binary shipped in the repo:
# Human readable dump like trace-cmd report. The output should be almost
# identical.
tools/x86_64/trace-dump human-readable --trace-format tracedat --trace doc/traces/trace.dat
# Convert the trace into a bunch of parquet files, one per event. This is used
# by LISA internally to expose the trace as a set of polars/pandas dataframes.
# You could use that with e.g. duckdb and SQL for a no-Python experience (a bit
# like perfetto trace processor but on trace.dat, faster and more memory
# efficient).
tools/x86_64/trace-dump parquet --trace-format tracedat --trace doc/traces/trace.dat
# Show the content of the header and check that event formats can be parsed
# successfully.
tools/x86_64/trace-dump check-header --trace-format tracedat --trace doc/traces/trace.dat
Notable aspects:
- Handles traces larger than memory, including during parquet conversion, which
streams the data.
- Significantly faster than trace-cmd report. Caveat: both were compiled as
static binaries against musl libc, whose malloc() is notoriously slow, so the
comparison may be a bit unfair to trace-cmd.
traceevent
==========
This is the library crate that implements something akin to C libtraceevent,
but also includes I/O and trace.dat parsing code. It supports:
- trace.dat v6
- trace.dat v7
- (almost?) all event formats that _can_ be supported
- Extension mechanism for the event format macro/function parser and
interpreter. It is not currently exposed publicly, but doing so would be easy.
It can provide support for any C function/macro the lib does not handle,
including polymorphic ones like __builtin_choose_expr().
- Stream processing: the trace is never fully loaded into memory.
- As zero-copy and lazy as feasible. Events can be skipped based on just the ID
without any field decoding etc.
- Supports both in-memory and file-based data input.
- Guest/host VM features (e.g. the timestamp synchronization option) should be
implemented but have never been tested.
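
To illustrate the streaming and skip-by-ID points above, here is a minimal
sketch. It does not use the crate's actual API, and the record layout is made
up for the example (it is NOT the trace.dat layout): each record is assumed to
be a little-endian u16 event ID, a u16 payload length, then the payload. The
names Record and next_matching are hypothetical.

```rust
use std::io::{self, Read};

/// Hypothetical record type, for illustration only (not the trace.dat layout).
#[derive(Debug, PartialEq)]
pub struct Record {
    pub id: u16,
    pub payload: Vec<u8>,
}

/// Return the next record whose ID equals `wanted`, or `None` at end of input.
/// Payloads of other records are discarded without being decoded, and at most
/// one payload is held in memory at a time.
pub fn next_matching<R: Read>(input: &mut R, wanted: u16) -> io::Result<Option<Record>> {
    loop {
        // Fixed-size header: event ID followed by payload length.
        let mut header = [0u8; 4];
        match input.read_exact(&mut header) {
            Ok(()) => {}
            // End of stream. A truncated header is treated the same way in
            // this sketch.
            Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => return Ok(None),
            Err(e) => return Err(e),
        }
        let id = u16::from_le_bytes([header[0], header[1]]);
        let len = u64::from(u16::from_le_bytes([header[2], header[3]]));

        if id != wanted {
            // Skip the payload based on the ID alone: bytes are drained into
            // a sink, never stored or interpreted.
            let skipped = io::copy(&mut input.by_ref().take(len), &mut io::sink())?;
            if skipped != len {
                return Err(io::ErrorKind::UnexpectedEof.into());
            }
            continue;
        }

        // Only now is the payload actually read into memory.
        let mut payload = vec![0u8; len as usize];
        input.read_exact(&mut payload)?;
        return Ok(Some(Record { id, payload }));
    }
}
```

Because the function only requires std::io::Read, the same code accepts a
std::fs::File or an in-memory std::io::Cursor, which mirrors the in-memory and
file-based input point above.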
Future work
===========
- The naming of both the library and the tool is open to change.
- Both crates really should land on crates.io one day; I need to sort that out.
This should only be done after names are settled though.
- Lib crate API is not stable yet, consider this a 0.* version.
- Lib crate could grow public APIs to parse buffers so that, like the C
libtraceevent, it can be used for formats other than trace.dat.
- Compile time: the C parser uses nom. This is _very_ closure-heavy code,
which makes LLVM codegen take ages. I'm not sure how to fix that without
changing the parser. I don't want to write a C parser ever again.
- Feedback welcome, the future is what we want it to be!
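
For readers unfamiliar with parser combinators, the compile-time point above
can be pictured with a small standard-library-only sketch. It is not nom's API
and not the crate's parser: tag, pair and parse_unsigned_int are hypothetical
names with deliberately simplified signatures. Every combinator call returns a
new anonymous closure type that wraps the closures passed to it, so a large
grammar produces many deeply nested generic types for the compiler to
instantiate and optimize.

```rust
/// Match a literal prefix and return (remaining input, matched text).
/// Simplified, hypothetical signature; nom's real traits are richer.
fn tag<'a>(expected: &'static str) -> impl Fn(&'a str) -> Option<(&'a str, &'a str)> {
    move |input: &'a str| {
        input
            .strip_prefix(expected)
            .map(|rest| (rest, &input[..expected.len()]))
    }
}

/// Run two parsers in sequence. The returned closure captures both argument
/// closures, so its type nests theirs.
fn pair<'a, A, B>(
    first: impl Fn(&'a str) -> Option<(&'a str, A)>,
    second: impl Fn(&'a str) -> Option<(&'a str, B)>,
) -> impl Fn(&'a str) -> Option<(&'a str, (A, B))> {
    move |input: &'a str| {
        let (rest, a) = first(input)?;
        let (rest, b) = second(rest)?;
        Some((rest, (a, b)))
    }
}

/// Recognize the C type "unsigned int" and return what follows it.
pub fn parse_unsigned_int(input: &str) -> Option<&str> {
    // Three levels of nested closure types for a two-word type name.
    let parser = pair(tag("unsigned"), pair(tag(" "), tag("int")));
    parser(input).map(|(rest, _)| rest)
}
```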
-- Douglas