Message-ID: <20250428141210.6ef945c6@gandalf.local.home>
Date: Mon, 28 Apr 2025 14:12:10 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Cc: linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org, Masami
Hiramatsu <mhiramat@...nel.org>, Mark Rutland <mark.rutland@....com>,
Andrew Morton <akpm@...ux-foundation.org>, Josh Poimboeuf
<jpoimboe@...nel.org>, x86@...nel.org, Peter Zijlstra
<peterz@...radead.org>, Ingo Molnar <mingo@...nel.org>, Arnaldo Carvalho de
Melo <acme@...nel.org>, Indu Bhagat <indu.bhagat@...cle.com>, Alexander
Shishkin <alexander.shishkin@...ux.intel.com>, Jiri Olsa
<jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>, Ian Rogers
<irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>,
linux-perf-users@...r.kernel.org, Mark Brown <broonie@...nel.org>,
linux-toolchains@...r.kernel.org, Jordan Rome <jordalgo@...a.com>, Sam
James <sam@...too.org>, Andrii Nakryiko <andrii.nakryiko@...il.com>, Jens
Remus <jremus@...ux.ibm.com>, Florian Weimer <fweimer@...hat.com>, Andy
Lutomirski <luto@...nel.org>, Weinan Liu <wnliu@...gle.com>, Blake Jones
<blakejones@...gle.com>, Beau Belgrave <beaub@...ux.microsoft.com>, "Jose
E. Marchesi" <jemarch@....org>, Alexander Aring <aahringo@...hat.com>
Subject: Re: [PATCH v5 3/9] unwind deferred: Use bitmask to determine which
callbacks to call
On Mon, 28 Apr 2025 14:00:07 -0400
Mathieu Desnoyers <mathieu.desnoyers@...icios.com> wrote:
> >>
> >> Is it enough to guard with RCU? See syscall_regfunc() from
> >> tracepoint.c where we do:
> >>
> >> read_lock(&tasklist_lock);
> >> for_each_process_thread(p, t) {
> >> 	set_task_syscall_work(t, SYSCALL_TRACEPOINT);
> >> }
> >> read_unlock(&tasklist_lock);
> >>
> >> This prevents a concurrent fork from adding threads while we
> >> iterate, which would otherwise open the possibility of missing a
> >> clear due to a concurrent fork + set bit.
> >
> > A set_bit would only happen if the callback was live and accepting new
> > callback requests. It's a bug for a tracer to call unwind_deferred_cancel()
> > and then call unwind_deferred_request() (which would set the bit). We could
> > possibly set the tracer's unwind descriptor id to -1, and do a
> > WARN_ON_ONCE() in unwind_deferred_request() if the tracer's id is negative.
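
For illustration, a rough sketch of that guard (the signature and the
"bit" field are my guesses at this series' naming, not the actual
patch):

	int unwind_deferred_request(struct unwind_work *work, u64 *cookie)
	{
		/* A cancelled tracer must never request a new stack trace */
		if (WARN_ON_ONCE(work->bit < 0))
			return -EINVAL;

		/* ... normal request path: set the task's bit, queue the
		 *     task work, hand back the cookie ... */
	}
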
> >
> > The loop is called under the callback_mutex, where no new tracer could
> > register and be assigned that bit.
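
Roughly, that cancel path would look like the following (the
per-task unwind_mask field name is illustrative):

	struct task_struct *g, *t;

	mutex_lock(&callback_mutex);
	list_del(&work->list);
	read_lock(&tasklist_lock);
	for_each_process_thread(g, t) {
		/* Clear this tracer's bit from every task's pending mask */
		clear_bit(work->bit, &t->unwind_mask);
	}
	read_unlock(&tasklist_lock);
	mutex_unlock(&callback_mutex);
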
>
> Ah, that's the piece I missed. The callback_mutex prevents reallocation
> of the ID by unwind_deferred_init while iterating on the tasks.
>
> One more comment: if we replace the linked list with an array (or make
> the linked list an RCU list), can we remove the callback_mutex from
> unwind_deferred_task_work by turning it into an RCU read-side critical
> section?
>
> Then we just need to wait for a grace period before returning from
> unwind_deferred_cancel, which then allows the caller to reclaim "work".
>
> Taking the callback_mutex in unwind_deferred_task_work will end up being
> a major source of cache line bouncing across CPUs when it is hit
> heavily by tracers.
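
A minimal sketch of that scheme, assuming the list, mask, and callback
names from this thread (not the actual implementation):

	/* Reader side in unwind_deferred_task_work(), replacing the mutex: */
	rcu_read_lock();
	list_for_each_entry_rcu(work, &callbacks, list) {
		if (test_bit(work->bit, &bits))
			work->func(work, &trace, cookie);
	}
	rcu_read_unlock();

	/* Unregister side: */
	void unwind_deferred_cancel(struct unwind_work *work)
	{
		mutex_lock(&callback_mutex);
		list_del_rcu(&work->list);
		mutex_unlock(&callback_mutex);
		/* Wait for in-flight readers before the caller frees "work" */
		synchronize_rcu();
	}
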
I'm not against this, but again, that's an optimization. I want to keep the
initial code simple. And then add the more complex optimizations when this
is stable.
-- Steve