linux-kernel - Re: [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete sts

Message-ID: <20250624153125.56eab22a@batman.local.home>
Date: Tue, 24 Jun 2025 15:31:25 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Nam Cao <namcao@...utronix.de>
Cc: Gabriele Monaco <gmonaco@...hat.com>, linux-kernel@...r.kernel.org,
 Jonathan Corbet <corbet@....net>, Masami Hiramatsu <mhiramat@...nel.org>,
 linux-trace-kernel@...r.kernel.org, linux-doc@...r.kernel.org, Ingo Molnar
 <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, Tomas Glozar
 <tglozar@...hat.com>, Juri Lelli <jlelli@...hat.com>,
 john.ogness@...utronix.de
Subject: Re: [RFC PATCH v2 09/12] rv: Replace tss monitor with more complete
 sts

On Tue, 24 Jun 2025 17:50:53 +0200
Nam Cao <namcao@...utronix.de> wrote:

> I would like that. Ideally, the userspace tools only use tracepoints based
> on available_monitors.
> 
> However, people may not do that, and just use tracepoints directly.
> 
> You could argue that those tools are not correctly designed. Therefore it
> is their fault that the tools are broken after updating kernel.
> 
> On the other hand, there is this sentiment that we must never break
> userspace.
> 
> I don't know enough to judge this. Maybe @Steven has something to add?

So WRT tracepoints, it's the same as a tree falling in the woods[1].

If a user space ABI "breaks" but no user space tooling notices, did it
really break?

The answer is "No".

As for tracepoints, it's fine to change them until it's not ;-)

We have had only one case where a tracepoint change broke user space
and Linus reverted that change [2]. That was because powertop hard
coded the tracepoint field offsets and didn't use the format files
(which is what libtraceevent gives you). I removed an unused common
field, which shifted everything and broke powertop.
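
To illustrate the difference between hard coding offsets and reading
them from the format file, here is a minimal sketch. It is not how
powertop or libtraceevent is implemented. The event name, ID, and
fields in SAMPLE_FORMAT are made up for illustration, the helper names
parse_fields() and read_field() are hypothetical, and little-endian
records are assumed.

```python
import re

# Text in the layout of a tracefs "format" file. The event name, ID and
# fields are illustrative assumptions, not copied from a real kernel.
SAMPLE_FORMAT = (
    "name: example_event\n"
    "ID: 123\n"
    "format:\n"
    "\tfield:unsigned short common_type;\toffset:0;\tsize:2;\tsigned:0;\n"
    "\tfield:unsigned char common_flags;\toffset:2;\tsize:1;\tsigned:0;\n"
    "\tfield:int common_pid;\toffset:4;\tsize:4;\tsigned:1;\n"
    "\n"
    "\tfield:u64 value;\toffset:8;\tsize:8;\tsigned:0;\n"
    "\n"
    'print fmt: "value=%llu", REC->value\n'
)

FIELD_RE = re.compile(
    r"field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);\s*size:(?P<size>\d+);"
)


def parse_fields(text):
    """Map each field name to (offset, size) as declared in the format text."""
    fields = {}
    for m in FIELD_RE.finditer(text):
        # The field name is the last token of the C declaration; drop any
        # array suffix such as "[16]".
        name = re.sub(r"\[.*\]$", "", m.group("decl").split()[-1])
        fields[name] = (int(m.group("offset")), int(m.group("size")))
    return fields


def read_field(record, fields, name):
    """Read an unsigned integer field from a raw record (little-endian assumed)."""
    offset, size = fields[name]
    return int.from_bytes(record[offset:offset + size], "little")


fields = parse_fields(SAMPLE_FORMAT)
record = bytes(8) + (42).to_bytes(8, "little")
value = read_field(record, fields, "value")
```

A tool that looks up "value" this way keeps working if a common field
is removed and the offsets shift, because the new offsets come from
the format file rather than from constants in the tool.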

But I converted powertop to use libtraceevent, waited a few years until
all the major distros provided the new powertop, and then I removed the
field. Guess what? Nobody noticed. (And an old powertop would still
break.)

BPF taps into most tracepoints, and those change all the time. I'm
cleaning up unused tracepoints, which included a couple that were left
around to "not break old BPF programs". I replied that if an old BPF
program relies on that tracepoint, keeping it around (but not used) is
*worse* than having that BPF program break. That's because that BPF
program is still broken (it's expecting that unused tracepoint to
fire) but now it's getting garbage for output (that being no output!).
It's worse because it's "silently failing" and the user may be relying
on something they don't know is broken.

So yeah, change the tracepoint when the code it's tracing changes. That
way any tooling depending on it knows that it can no longer depend on
it.
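
On the tooling side, "knowing" can be as simple as checking that the
event still exists before attaching and failing loudly if it does
not. A minimal sketch, assuming tracefs is mounted at
/sys/kernel/tracing; the function name require_tracepoint() is
hypothetical:

```python
import os


def require_tracepoint(system, event, tracefs="/sys/kernel/tracing"):
    """Return the path of the event's format file, or raise if it is gone.

    The default tracefs mount point is an assumption; pass another root
    if tracefs is mounted elsewhere.
    """
    path = os.path.join(tracefs, "events", system, event, "format")
    if not os.path.isfile(path):
        # Fail loudly instead of silently producing no output.
        raise RuntimeError(
            f"tracepoint {system}:{event} not found under {tracefs}; "
            "the kernel changed and this tool needs updating"
        )
    return path
```

A tool calling this at startup gets an error on a kernel where the
tracepoint was removed, instead of attaching to nothing and reporting
no events.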

Anything using tracepoints is pretty much tied to the kernel anyway,
and when the kernel updates, the tooling that is relying on it also
needs to be updated, otherwise it's not getting the information it is
expecting. That most definitely includes monitors.

-- Steve

[1] https://en.wikipedia.org/wiki/If_a_tree_falls_in_a_forest_and_no_one_is_around_to_hear_it,_does_it_make_a_sound%3F
[2] https://lwn.net/Articles/442113/