Message-ID: <20250918172414.GC3409427@noisy.programming.kicks-ass.net>
Date: Thu, 18 Sep 2025 19:24:14 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Steven Rostedt <rostedt@...nel.org>, linux-kernel@...r.kernel.org,
linux-trace-kernel@...r.kernel.org, bpf@...r.kernel.org,
x86@...nel.org, Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Josh Poimboeuf <jpoimboe@...nel.org>,
Ingo Molnar <mingo@...nel.org>, Jiri Olsa <jolsa@...nel.org>,
Arnaldo Carvalho de Melo <acme@...nel.org>,
Namhyung Kim <namhyung@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Andrii Nakryiko <andrii@...nel.org>,
Indu Bhagat <indu.bhagat@...cle.com>,
"Jose E. Marchesi" <jemarch@....org>,
Beau Belgrave <beaub@...ux.microsoft.com>,
Jens Remus <jremus@...ux.ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Florian Weimer <fweimer@...hat.com>, Sam James <sam@...too.org>,
Kees Cook <kees@...nel.org>, Carlos O'Donell <codonell@...hat.com>
Subject: Re: [RESEND][PATCH v15 0/4] perf: Support the deferred unwinding
infrastructure
On Thu, Sep 18, 2025 at 11:18:53AM -0400, Steven Rostedt wrote:
> On Thu, 18 Sep 2025 13:46:10 +0200
> Peter Zijlstra <peterz@...radead.org> wrote:
>
> > So I started looking at this, but given I never seen the deferred unwind
> > bits that got merged I have to look at that first.
> >
> > Headers want something like so.. Let me read the rest.
> >
> > ---
> > include/linux/unwind_deferred.h | 38 +++++++++++++++++++----------------
> > include/linux/unwind_deferred_types.h | 2 ++
> > 2 files changed, 23 insertions(+), 17 deletions(-)
>
> Would you like to send a formal patch with this? I'd actually break it into
> two patches. One to clean up the long lines, and the other to change the
> logic.
Sure, I'll collect the lot while I go through it and whip something up
when I'm done. For now, I'll just shoot a few questions your way.
So we have:
  do_syscall_64()
    ... do stuff ...
    syscall_exit_to_user_mode(regs)
      syscall_exit_to_user_mode_work(regs)
        syscall_exit_work()
        exit_to_user_mode_prepare()
          exit_to_user_mode_loop()
            resume_user_mode_work()
              task_work_run()
      exit_to_user_mode()
        unwind_reset_info();
        user_enter_irqoff();
        arch_exit_to_user_mode();
        lockdep_hardirqs_on();
  SYSRET/IRET
and
  DEFINE_IDTENTRY*()
    irqentry_enter();
    ... stuff ...
    irqentry_exit()
      irqentry_exit_to_user_mode()
        exit_to_user_mode_prepare()
          exit_to_user_mode_loop();
            resume_user_mode_work()
              task_work_run()
        exit_to_user_mode()
          unwind_reset_info();
          ...
  IRET
Now, task_work_run() is in the exit_to_user_mode_loop() which is notably
*before* exit_to_user_mode() which does the unwind_reset_info().
What happens if we get an NMI requesting an unwind after
unwind_reset_info() while still very much being in the kernel on the way
out?
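
To make the ordering concern concrete, here is a minimal user-space sketch. It is a toy model, not kernel code: struct toy_task and every toy_* helper are hypothetical stand-ins invented for illustration, and the model only replays the call order from the trees above. It does not claim to describe what the real unwind code does in this window.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these types or helpers exist in the kernel. */
struct toy_task {
	bool unwind_pending;	/* stands in for the deferred-unwind state */
	bool work_queued;	/* stands in for a queued task_work entry */
	int callbacks_run;	/* times the deferred callback has run */
};

/* Hypothetical stand-in for an NMI handler requesting a deferred unwind. */
static void toy_nmi_request_unwind(struct toy_task *t)
{
	if (t->unwind_pending)
		return;
	t->unwind_pending = true;
	t->work_queued = true;
}

/* Stand-in for task_work_run(): run and dequeue whatever is queued. */
static void toy_task_work_run(struct toy_task *t)
{
	if (t->work_queued) {
		t->work_queued = false;
		t->callbacks_run++;
	}
}

/* Stand-in for unwind_reset_info(): clear the per-entry state. */
static void toy_unwind_reset_info(struct toy_task *t)
{
	t->unwind_pending = false;
}

/*
 * Replay the exit path in the order shown above. nmi_after_reset selects
 * whether the request arrives before the work loop or after the reset.
 * Returns true if work is still queued at the point of return to user.
 */
static bool toy_exit_to_user(struct toy_task *t, bool nmi_after_reset)
{
	if (!nmi_after_reset)
		toy_nmi_request_unwind(t);

	toy_task_work_run(t);		/* exit_to_user_mode_loop() */
	assert(!t->work_queued);	/* the loop drained what was queued so far */
	toy_unwind_reset_info(t);	/* exit_to_user_mode() */

	if (nmi_after_reset)
		toy_nmi_request_unwind(t);	/* still in the kernel */

	/* SYSRET/IRET would happen here. */
	return t->work_queued;
}
```

In this model, a request that arrives before the loop has its callback run once before the return. A request that arrives after the reset leaves the toy task returning to user with work queued and the pending state set again. Whether the real code has a recheck that closes this window is the question.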
What is the purpose of unwind_deferred_task_exit()? This is called from
do_exit(), only slightly before it does exit_task_work(), which runs all
pending task_work. Is there something that justifies the manual run and
cancel instead of just leaving it sit in task_work and having it run
naturally? If so, that most certainly deserves a comment.
A similar question for unwind_task_free(), where exactly is it relevant?
Where does it acquire a task_work that is not otherwise already run on
exit?