[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241029182032.GI14555@noisy.programming.kicks-ass.net>
Date: Tue, 29 Oct 2024 19:20:32 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: Josh Poimboeuf <jpoimboe@...nel.org>, x86@...nel.org,
Steven Rostedt <rostedt@...dmis.org>,
Ingo Molnar <mingo@...nel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
linux-kernel@...r.kernel.org, Indu Bhagat <indu.bhagat@...cle.com>,
Mark Rutland <mark.rutland@....com>,
Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
Ian Rogers <irogers@...gle.com>,
Adrian Hunter <adrian.hunter@...el.com>,
linux-perf-users@...r.kernel.org, Mark Brown <broonie@...nel.org>,
linux-toolchains@...r.kernel.org, Jordan Rome <jordalgo@...a.com>,
Sam James <sam@...too.org>, linux-trace-kernel@...r.kerne.org,
Andrii Nakryiko <andrii.nakryiko@...il.com>,
Jens Remus <jremus@...ux.ibm.com>,
Florian Weimer <fweimer@...hat.com>,
Andy Lutomirski <luto@...nel.org>
Subject: Re: [PATCH v3 11/19] unwind: Add deferred user space unwinding API
On Tue, Oct 29, 2024 at 01:47:59PM -0400, Mathieu Desnoyers wrote:
> On 2024-10-29 13:17, Josh Poimboeuf wrote:
> > On Tue, Oct 29, 2024 at 02:56:17PM +0100, Peter Zijlstra wrote:
> > > On Mon, Oct 28, 2024 at 02:47:38PM -0700, Josh Poimboeuf wrote:
> > >
> > > > + * The only exception is when the task has migrated to another CPU, *and* this
> > > > + * is called while the task work is running (or has already run). Then a new
> > > > + * cookie will be generated and the callback will be called again for the new
> > > > + * cookie.
> > >
> > > So that's a bit crap. The user stack won't change for having been
> > > migrated.
> > >
> > > So perf can readily use the full u64 cookie value as a sequence number,
> > > since the whole perf record will already have the TID of the task in.
> > > Mixing in this CPU number for no good reason and causing trouble like
> > > this just doesn't make sense to me.
> > >
> > > If ftrace needs brain damage like this, can't we push this to the user?
> > >
> > > That is, do away with the per-cpu sequence crap, and add a per-task
> > > counter that is incremented for every return-to-userspace.
> >
> > That would definitely make things easier for me, though IIRC Steven and
> > Mathieu had some concerns about TID wrapping over days/months/years.
> >
> > With that mindset I suppose the per-CPU counter could also wrap, though
> > that could be mitigated by making the cookie a struct with more bits.
> >
>
> AFAIR, the scheme we discussed in Prague was different than the
> implementation here.
>
> We discussed having a free-running counter per-cpu, and combining it
> with the cpu number as top (or low) bits, to effectively make a 64-bit
> value that is unique across the entire system, but without requiring a
> global counter with its associated cache line bouncing.
>
> Here is part where the implementation here differs from our discussion:
> I recall we discussed keeping a snapshot of the counter value within
> the task struct of the thread. So we only snapshot the per-cpu value
> on first use after entering the kernel, and after that we use the same
> per-cpu value snapshot (from task struct) up until return to userspace.
> We clear the task struct cookie snapshot on return to userspace.
>
> This way, even if the thread is migrated during the system call, the
> cookie value does not change: it simply depends on the point where it
> was first snapshotted (either before or after migration). From that
> point until return to userspace, we just use the per-thread snapshot
> value.
>
> This should allow us to keep a global cookie semantic (no need to
> tie this to tracer-specific knowledge about current TID), without the
> migration corner cases discussed in the comment above.
The 48:16 bit split gives you uniqueness for around 78 hours at 1GHz.
But seriously, perf doesn't need this. It really only needs a sequence
number if you care to stitch over a LOST packet (and I can't say I care
about that case much) -- and doing that right doesn't really take much
at all.
Powered by blists - more mailing lists