linux-kernel - Re: [PATCH v2 00/11] unwind, perf: sframe user space unwinding, deferred perf callchains

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20240916184645.142face1@rorschach.local.home>
Date: Mon, 16 Sep 2024 18:46:45 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Josh Poimboeuf <jpoimboe@...nel.org>, x86@...nel.org, Ingo Molnar
 <mingo@...nel.org>, Arnaldo Carvalho de Melo <acme@...nel.org>,
 linux-kernel@...r.kernel.org, Indu Bhagat <indu.bhagat@...cle.com>, Mark
 Rutland <mark.rutland@....com>, Alexander Shishkin
 <alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>, Ian Rogers <irogers@...gle.com>, Adrian
 Hunter <adrian.hunter@...el.com>, linux-perf-users@...r.kernel.org, Mark
 Brown <broonie@...nel.org>, linux-toolchains@...r.kernel.org, Jordan Rome
 <jordalgo@...a.com>, Sam James <sam@...too.org>, Mathieu Desnoyers
 <mathieu.desnoyers@...icios.com>
Subject: Re: [PATCH v2 00/11] unwind, perf: sframe user space unwinding,
 deferred perf callchains

On Mon, 16 Sep 2024 20:15:45 +0200
Peter Zijlstra <peterz@...radead.org> wrote:

> On Mon, Sep 16, 2024 at 05:39:53PM +0200, Josh Poimboeuf wrote:
> 
> > The cookie is incremented once per entry from userspace, when needed.
> > 
> > It's a unique id used by the tracers which is emitted with both kernel
> > and user stacktrace samples so the userspace tool can stitch them back
> > together.  Even if you have multiple tracers triggering at the same time
> > they can all share a single user trace.  
> 
> But perf don't need this at all, right? It knows that the next deferred
> trace for that tid will be the one.

Is that because perf uses per task buffers? Will perf know this if it
uses a global buffer? What does perf do with "-a"?

> 
> > > This scheme seems unsound  
> 
> > > pin yourself on CPU0 and trigger 1<<48 unwinds while keeping CPU1 idle.  
> > 
> > Hm???  
> 
> The point being that it is possible to wrap one CPU into the id space of
> another CPU. It is not trivial, but someone who wants to can make it
> happen.
> 
> Combined I don't see the need to force this into this scheme, carry an
> opaque (void*) pointer and let the user do whatever it wants, perf for
> instance can pass NULL and not do anything like this.

That would put a huge burden on those tracers. Specifically ftrace.
Just because perf doesn't need it, it doesn't mean it can't be part of
the infrastructure. The cookie isn't even per-cpu. The counter is just
generated per cpu to avoid races. The cookie will have the CPU as part
of the bits, plus the per-cpu counter. This will give us a unique value
across all CPUs without having to do any atomic operations. It's just
that the cookie must be unique for every time a task enters the kernel,
and stay the same while it is in the kernel.

So if the task enters the kernel on CPU 1 and a trace is requested, the
tracer still needs to save that event when the request occurred (for
example, the kernel stack trace). It needs a unique identifier that
will be attached to the user stack trace when it is made, so that post
processing can tell where to append that stack.

Then the task can migrate to several other CPUs, but it will still have
the same cookie as when it received it on the first CPU. The cookie
only gets updated on the task when it goes back to user space.

ftrace allows for several instances that could be filtering on the same
or different tasks. One instance could filter 3 tasks, another 2 tasks,
where one of those tasks is in both instance. The user stack
infrastructure should give a way to easily map the request to the stack
via some kind of identifier.

What would perf do on dropped events? Say a system call happened, perf
requested a user stack trace, then the events were dropped, and when it
started again, it was in a system call, how would perf know it is in
the same system call? I guess the callback when the dropped user stack
trace occurred would tell perf to reset. Still it adds a pretty big
burden for the tracers to handle these corner cases whereas a simple
identifier is much more simple and robust.

-- Steve