Message-ID: <20250918173220.GA3475922@noisy.programming.kicks-ass.net>
Date: Thu, 18 Sep 2025 19:32:20 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Steven Rostedt <rostedt@...nel.org>, linux-kernel@...r.kernel.org,
	linux-trace-kernel@...r.kernel.org, bpf@...r.kernel.org,
	x86@...nel.org, Masami Hiramatsu <mhiramat@...nel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Ingo Molnar <mingo@...nel.org>, Jiri Olsa <jolsa@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Namhyung Kim <namhyung@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Andrii Nakryiko <andrii@...nel.org>,
	Indu Bhagat <indu.bhagat@...cle.com>,
	"Jose E. Marchesi" <jemarch@....org>,
	Beau Belgrave <beaub@...ux.microsoft.com>,
	Jens Remus <jremus@...ux.ibm.com>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Florian Weimer <fweimer@...hat.com>, Sam James <sam@...too.org>,
	Kees Cook <kees@...nel.org>, Carlos O'Donell <codonell@...hat.com>
Subject: Re: [RESEND][PATCH v15 0/4] perf: Support the deferred unwinding
 infrastructure

On Thu, Sep 18, 2025 at 07:24:14PM +0200, Peter Zijlstra wrote:

> So we have:
> 
> do_syscall_64()
>   ... do stuff ...
>   syscall_exit_to_user_mode(regs)
>     syscall_exit_to_user_mode_work(regs)
>       syscall_exit_work()
>       exit_to_user_mode_prepare()
>         exit_to_user_mode_loop()
>           resume_user_mode_work()
>             task_work_run()
>     exit_to_user_mode()
>       unwind_reset_info();
>       user_enter_irqoff();
>       arch_exit_to_user_mode();
>       lockdep_hardirqs_on();
>   SYSRET/IRET
> 
> 
> and
> 
> DEFINE_IDTENTRY*()
>   irqentry_enter();
>   ... stuff ...
>   irqentry_exit()
>     irqentry_exit_to_user_mode()
>       exit_to_user_mode_prepare()
>         exit_to_user_mode_loop();
>           resume_user_mode_work()
>             task_work_run()
>       exit_to_user_mode()
>         unwind_reset_info();
> 	...
>   IRET
> 
> Now, task_work_run() is in the exit_to_user_mode_loop() which is notably
> *before* exit_to_user_mode() which does the unwind_reset_info().
> 
> What happens if we get an NMI requesting an unwind after
> unwind_reset_info() while still very much being in the kernel on the way
> out?

AFAICT it will try and do a task_work_add(TWA_RESUME) from NMI context,
and this will fail horribly.

If you do something like:

	twa_mode = in_nmi() ? TWA_NMI_CURRENT : TWA_RESUME;
	task_work_add(foo, twa_mode);

it might actually work.
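
To make the mode selection above concrete, here is a small userspace mock-up. It is only a sketch: mock_in_nmi, task_work_add_stub() and queue_deferred_unwind() are hypothetical names invented for illustration, the enum lists just the two modes discussed here, and the stub drops the task argument that the real task_work_add() takes. Nothing below is kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for illustration only; not the kernel definitions. */
enum task_work_notify_mode { TWA_RESUME, TWA_NMI_CURRENT };

struct callback_head { int unused; };

/* Hypothetical switch that lets a test pretend we are in NMI context. */
static bool mock_in_nmi;

static bool in_nmi(void)
{
	return mock_in_nmi;
}

/* Records the mode it was called with instead of queueing anything. */
static enum task_work_notify_mode last_mode;

static int task_work_add_stub(struct callback_head *work,
			      enum task_work_notify_mode mode)
{
	assert(work != NULL);
	last_mode = mode;
	return 0;
}

/* Pick the notify mode by context, as in the two-line snippet above. */
static int queue_deferred_unwind(struct callback_head *work)
{
	enum task_work_notify_mode twa_mode;

	twa_mode = in_nmi() ? TWA_NMI_CURRENT : TWA_RESUME;
	return task_work_add_stub(work, twa_mode);
}
```

With mock_in_nmi set, queue_deferred_unwind() asks for TWA_NMI_CURRENT; otherwise it asks for TWA_RESUME. Whether that is sufficient for the window after unwind_reset_info() is exactly the open question.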

