linux-kernel - Re: [PATCH v3 11/19] unwind: Add deferred user space unwinding API

Message-ID: <bcd11a07-45fb-442b-a25b-5cadc6aac0e6@efficios.com>
Date: Tue, 29 Oct 2024 13:47:59 -0400
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: Josh Poimboeuf <jpoimboe@...nel.org>,
 Peter Zijlstra <peterz@...radead.org>
Cc: x86@...nel.org, Steven Rostedt <rostedt@...dmis.org>,
 Ingo Molnar <mingo@...nel.org>, Arnaldo Carvalho de Melo <acme@...nel.org>,
 linux-kernel@...r.kernel.org, Indu Bhagat <indu.bhagat@...cle.com>,
 Mark Rutland <mark.rutland@....com>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
 Ian Rogers <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
 linux-perf-users@...r.kernel.org, Mark Brown <broonie@...nel.org>,
 linux-toolchains@...r.kernel.org, Jordan Rome <jordalgo@...a.com>,
 Sam James <sam@...too.org>, linux-trace-kernel@...r.kerne.org,
 Andrii Nakryiko <andrii.nakryiko@...il.com>,
 Jens Remus <jremus@...ux.ibm.com>, Florian Weimer <fweimer@...hat.com>,
 Andy Lutomirski <luto@...nel.org>
Subject: Re: [PATCH v3 11/19] unwind: Add deferred user space unwinding API

On 2024-10-29 13:17, Josh Poimboeuf wrote:
> On Tue, Oct 29, 2024 at 02:56:17PM +0100, Peter Zijlstra wrote:
>> On Mon, Oct 28, 2024 at 02:47:38PM -0700, Josh Poimboeuf wrote:
>>
>>> + * The only exception is when the task has migrated to another CPU, *and* this
>>> + * is called while the task work is running (or has already run).  Then a new
>>> + * cookie will be generated and the callback will be called again for the new
>>> + * cookie.
>>
>> So that's a bit crap. The user stack won't change for having been
>> migrated.
>>
>> So perf can readily use the full u64 cookie value as a sequence number,
>> since the whole perf record will already have the TID of the task in.
>> Mixing in this CPU number for no good reason and causing trouble like
>> this just doesn't make sense to me.
>>
>> If ftrace needs brain damage like this, can't we push this to the user?
>>
>> That is, do away with the per-cpu sequence crap, and add a per-task
>> counter that is incremented for every return-to-userspace.
> 
> That would definitely make things easier for me, though IIRC Steven and
> Mathieu had some concerns about TID wrapping over days/months/years.
> 
> With that mindset I suppose the per-CPU counter could also wrap, though
> that could be mitigated by making the cookie a struct with more bits.
> 

AFAIR, the scheme we discussed in Prague was different from the
implementation here.

We discussed having a free-running counter per-cpu, and combining it
with the cpu number as top (or low) bits, to effectively make a 64-bit
value that is unique across the entire system, but without requiring a
global counter with its associated cache line bouncing.
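
For illustration only (this is not code from the patch series), the
cookie layout described above could be modeled in user space as
follows. NR_CPUS, the 16/48 bit split, and the name next_cookie() are
assumptions made for the sketch; a kernel implementation would use
real per-CPU variables and run with preemption disabled.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS  4    /* hypothetical CPU count for this model */
#define CTR_BITS 48   /* assumed split: low 48 bits hold the counter */
#define CTR_MASK ((UINT64_C(1) << CTR_BITS) - 1)

/* One free-running counter per CPU; stands in for a per-CPU variable. */
static uint64_t cpu_counter[NR_CPUS];

/*
 * Build a system-wide unique 64-bit cookie from the CPU number (top
 * bits) and that CPU's counter (low bits).  No global counter is
 * touched, so there is no shared cache line to bounce between CPUs.
 */
static uint64_t next_cookie(unsigned int cpu)
{
	uint64_t ctr;

	assert(cpu < NR_CPUS);
	ctr = ++cpu_counter[cpu] & CTR_MASK;
	return ((uint64_t)cpu << CTR_BITS) | ctr;
}
```

Two CPUs can hand out the same counter value, but the CPU bits keep
the resulting cookies distinct. Wrapping of the 48-bit counter is not
handled in this sketch.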

Here is the part where this implementation differs from our discussion:
I recall we discussed keeping a snapshot of the counter value within
the task struct of the thread. So we only snapshot the per-cpu value
on first use after entering the kernel, and after that we use the same
per-cpu value snapshot (from task struct) up until return to userspace.
We clear the task struct cookie snapshot on return to userspace.

This way, even if the thread is migrated during the system call, the
cookie value does not change: it simply depends on the point where it
was first snapshotted (either before or after migration). From that
point until return to userspace, we just use the per-thread snapshot
value.
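
A minimal user-space sketch of the snapshot idea, under stated
assumptions: struct task, task_get_cookie() and task_return_to_user()
are hypothetical names, the value 0 is assumed to mean "no snapshot
taken", and the per-CPU counters are modeled as a plain array. It is
meant to show the semantics, not the actual kernel code.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS  4    /* hypothetical CPU count for this model */
#define CTR_BITS 48   /* assumed split: low 48 bits hold the counter */
#define CTR_MASK ((UINT64_C(1) << CTR_BITS) - 1)

/* Stand-in for the relevant task struct fields. */
struct task {
	uint64_t cookie;   /* snapshot; 0 means none taken yet */
	unsigned int cpu;  /* CPU the task currently runs on */
};

static uint64_t cpu_counter[NR_CPUS];

/*
 * Return the cookie for the current kernel entry.  The per-CPU counter
 * is only read on first use; later calls reuse the per-task snapshot,
 * even if the task has migrated to another CPU in the meantime.
 * Counters are pre-incremented, so 0 is never handed out (ignoring
 * wraparound), which lets 0 mark "no snapshot".
 */
static uint64_t task_get_cookie(struct task *t)
{
	assert(t->cpu < NR_CPUS);
	if (!t->cookie) {
		uint64_t ctr = ++cpu_counter[t->cpu] & CTR_MASK;

		t->cookie = ((uint64_t)t->cpu << CTR_BITS) | ctr;
	}
	return t->cookie;
}

/* Clear the snapshot on return to user space. */
static void task_return_to_user(struct task *t)
{
	t->cookie = 0;
}
```

In this model, a task that takes its snapshot on CPU 1 and then
migrates to CPU 2 keeps returning the CPU 1 cookie until
task_return_to_user() is called; the next kernel entry then gets a
fresh cookie from whichever CPU it is on at first use.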

This should allow us to keep a global cookie semantic (no need to
tie this to tracer-specific knowledge about current TID), without the
migration corner cases discussed in the comment above.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com