Message-ID: <e5149343-5a0f-4947-97ec-a61b741fadad@arm.com>
Date: Fri, 20 Sep 2024 10:43:31 +0100
From: Douglas Raillard <douglas.raillard@....com>
To: Linux Trace Devel <linux-trace-devel@...r.kernel.org>,
 linux-kernel@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>
Subject: [ANNOUNCE] Rust libtraceevent/trace-cmd report

As promised to Steven Rostedt at the 2022 Tracing Summit, and after a few
months of dogfooding, we are pleased to announce Rust counterparts to
libtraceevent and trace-cmd report, plus Parquet conversion.

https://gitlab.arm.com/tooling/lisa/-/tree/main/tools/trace-parser

git clone https://git.gitlab.arm.com/tooling/lisa.git --depth=1

trace-tools
===========

CLI utility akin to trace-cmd report. Bear in mind that this is intended to be
an internal utility for LISA's Trace() Python API. As such, the CLI itself is
unstable (but that could change if people express interest).

You can test it very easily on your trace using the precompiled binary shipped in the repo:

# Human readable dump like trace-cmd report. The output should be almost
# identical.
tools/x86_64/trace-dump human-readable --trace-format tracedat --trace doc/traces/trace.dat

# Convert the trace into a bunch of parquet files, one per event. This is used
# by LISA internally to expose the trace as a set of polars/pandas dataframes.
# You could use that with e.g. duckdb and SQL for a no-Python experience (a bit
# like perfetto trace processor but on trace.dat, faster and more memory
# efficient).
tools/x86_64/trace-dump parquet --trace-format tracedat --trace doc/traces/trace.dat

# Show the content of the header and check that event formats can be parsed
# successfully.
tools/x86_64/trace-dump check-header --trace-format tracedat --trace doc/traces/trace.dat

Notable aspects:

- Handles traces larger than memory, including during parquet conversion,
   which is streamed.

- Significantly faster than trace-cmd report. Note that both were compiled as
   static binaries against musl libc, whose malloc() is notoriously slow, so
   the comparison may be a bit unfair to trace-cmd.
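
As an illustration of the streaming idea only (this is not how trace-dump is
implemented), here is a minimal Python sketch of turning one event stream into
per-event outputs with bounded memory. The (name, fields) event shape, the
flush() callback and the example event names are assumptions made for the
sketch; a real converter would write a parquet row group where the callback is
called.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical event shape: (event name, decoded fields).
Event = Tuple[str, dict]


def stream_to_sinks(
    events: Iterable[Event],
    flush: Callable[[str, List[dict]], None],
    batch_size: int = 1000,
) -> None:
    """Group events by name and flush each group in bounded batches."""
    pending: Dict[str, List[dict]] = defaultdict(list)
    for name, fields in events:
        batch = pending[name]
        batch.append(fields)
        if len(batch) >= batch_size:
            # Hand the full batch to the sink and start a fresh one, so at
            # most batch_size rows per event name are ever held in memory.
            flush(name, batch)
            pending[name] = []
    # Flush whatever is left once the stream ends.
    for name, batch in pending.items():
        if batch:
            flush(name, batch)


def fake_events(n: int):
    """Generate a made-up stream alternating between two event names."""
    for i in range(n):
        yield ("sched_switch" if i % 2 else "sched_wakeup", {"ts": i})


written = defaultdict(list)  # event name -> sizes of the flushed batches
stream_to_sinks(
    fake_events(10),
    lambda name, rows: written[name].append(len(rows)),
    batch_size=3,
)
print(dict(written))
```

Peak memory in this sketch is bounded by batch_size times the number of
distinct event names, regardless of how long the input stream is.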

traceevent
==========

This is the library crate that implements something akin to C libtraceevent,
but also includes I/O and trace.dat parsing code. It supports:

- trace.dat v6

- trace.dat v7

- (almost?) all event formats that _can_ be supported

- A parser and interpreter for extension event format macros/functions. This
   is not currently exposed publicly, but doing so would be easy. It can
   provide support for any C function/macro the lib does not handle, including
   polymorphic ones like __builtin_choose_expr().

- Stream processing, the trace is never fully loaded in memory.

- As zero-copy and lazy as feasible. Events can be skipped based on the ID
   alone, without any field decoding.

- Supports both in-memory and file-based data input.

- Guest/host VM features (e.g. timestamp synchro option) are implemented and
  should work, but were never tested.
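
To make the "zero-copy and lazy" point concrete, here is a minimal Python
sketch of skipping records on their ID without decoding the payload. The
record layout (u16 ID, u16 length, payload) and every function name are
hypothetical: this is neither the trace.dat format nor the crate's API.

```python
import struct
from typing import Iterator, List, Tuple

# Hypothetical record layout, NOT the real trace.dat format:
#   u16 event ID, u16 payload length, then `length` payload bytes.
HEADER = struct.Struct("<HH")


def iter_records(buf: memoryview) -> Iterator[Tuple[int, memoryview]]:
    """Yield (event_id, payload) pairs without copying or decoding payloads."""
    offset = 0
    while offset + HEADER.size <= len(buf):
        event_id, length = HEADER.unpack_from(buf, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buf):
            raise ValueError("truncated record")
        # Slicing a memoryview produces another view, not a copy.
        yield event_id, buf[start:end]
        offset = end


def decode_wanted(buf: memoryview, wanted_id: int) -> List[int]:
    """Decode only the payloads of records whose ID matches wanted_id."""
    out = []
    for event_id, payload in iter_records(buf):
        if event_id != wanted_id:
            continue  # skipped on the ID alone; the payload is never read
        # Hypothetical payload: a single little-endian u32 field.
        (value,) = struct.unpack_from("<I", payload)
        out.append(value)
    return out


def make_record(event_id: int, payload: bytes) -> bytes:
    """Build one record in the hypothetical layout above."""
    return HEADER.pack(event_id, len(payload)) + payload


raw = (
    make_record(1, struct.pack("<I", 10))
    + make_record(2, b"opaque bytes that are never decoded")
    + make_record(1, struct.pack("<I", 20))
)
print(decode_wanted(memoryview(raw), wanted_id=1))  # prints [10, 20]
```

The iterator only reads the fixed-size header of each record, so the cost of
ignoring an event does not depend on how complex its fields are.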

Future work
===========

- The naming of both the library and the tool is open to change.

- Both crates really should land on crates.io one day, I need to sort that out.
   This should only be done after names are settled though.

- Lib crate API is not stable yet, consider this a 0.* version.

- Lib crate could grow public APIs to parse buffers so that, like the C
   libtraceevent, it can be used for formats other than trace.dat.

- Compile time: the C parser uses nom (a parser combinator crate). This is
   _very_ closure-heavy code, which makes LLVM codegen take ages. I'm not sure
   how to fix that without changing the parser. I don't want to write a C
   parser ever again.

- Feedback welcome, the future is what we want it to be!

-- Douglas