linux-kernel - Re: [PATCH v4 28/39] unwind_user/deferred: Add deferred unwinding interface

Message-ID: <20250124224645.lcovfraeq53gegys@jpoimboe>
Date: Fri, 24 Jan 2025 14:46:45 -0800
From: Josh Poimboeuf <jpoimboe@...nel.org>
To: Steven Rostedt <rostedt@...dmis.org>
Cc: Peter Zijlstra <peterz@...radead.org>,
	Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, x86@...nel.org,
	Ingo Molnar <mingo@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	linux-kernel@...r.kernel.org, Indu Bhagat <indu.bhagat@...cle.com>,
	Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
	Ian Rogers <irogers@...gle.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	linux-perf-users@...r.kernel.org, Mark Brown <broonie@...nel.org>,
	linux-toolchains@...r.kernel.org, Jordan Rome <jordalgo@...a.com>,
	Sam James <sam@...too.org>, linux-trace-kernel@...r.kernel.org,
	Andrii Nakryiko <andrii.nakryiko@...il.com>,
	Jens Remus <jremus@...ux.ibm.com>,
	Florian Weimer <fweimer@...hat.com>,
	Andy Lutomirski <luto@...nel.org>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Weinan Liu <wnliu@...gle.com>
Subject: Re: [PATCH v4 28/39] unwind_user/deferred: Add deferred unwinding
 interface

On Fri, Jan 24, 2025 at 04:58:03PM -0500, Steven Rostedt wrote:
> On Thu, 23 Jan 2025 23:13:26 +0100
> Peter Zijlstra <peterz@...radead.org> wrote:
> 
> > -EPONIES, you cannot take faults from the middle of schedule(). They can
> > always use the best effort FP unwind we have today.
> 
> Agreed.
> 
> Now the only thing I could think of is a flag gets set where the task comes
> out of the scheduler and then does the stack trace. It doesn't need to do
> the stack trace before it schedules. As it did just schedule, wherever it
> scheduled must have been in a schedulable context.
> 
> That is, kind of like the task_work flag for entering user space and
> exiting the kernel, could we have a sched_work flag to run after being
> scheduled back (exiting schedule()). Since the task has been picked to run,
> it will not cause latency for other tasks. The work will be done in its
> context. This is no different to the task's accounting than if it does this
> going back to user space. Heck, it would only need to do this once if it
> didn't go back to user space, as the user space stack would be the same.
> That is, if it gets scheduled multiple times, this would only happen on the
> first instance until it leaves the kernel.
> 
> 
> 	[ trigger stack trace - set sched_work ]
> 
> 	schedule() {
> 		context_switch() -> CPU runs some other task
> 				 <- gets scheduled back onto the CPU
> 		[..]
> 		/* preemption enabled ... */
> 		if (sched_work) {
> 			do stack trace() // can schedule here but
> 					 // calls a schedule function that does not
> 					 // do sched_work to prevent recursion
> 		}
> 	}
> 
> Could something like this work?

Yeah, this is basically a more fleshed out version of what I was trying
to propose.

One additional wrinkle is that if @prev wakes up on another CPU while
@next is unwinding it, the unwind goes haywire.  So that would maybe
need to be prevented.
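
For illustration only, here is a rough user-space sketch of the flag idea
quoted above. It is not kernel code: struct task, request_unwind(),
schedule_sim(), schedule_nowork() and the counters are all hypothetical
names made up for this example, and the context switch is a stub. It only
shows the control flow: the flag is set when a trace is requested, the
trace runs once after the task is scheduled back in, and the trace path
uses a schedule variant that skips the sched_work check so it cannot
recurse.

```c
#include <stdbool.h>

/* Hypothetical per-task state; a stand-in, not the kernel's task_struct. */
struct task {
	bool sched_work;	/* a deferred stack trace was requested */
	int  switches;		/* simulated context switches */
	int  unwinds_done;	/* stack traces actually performed */
};

/* [ trigger stack trace - set sched_work ] */
static void request_unwind(struct task *t)
{
	t->sched_work = true;
}

/* Stub for "CPU runs some other task, then we get scheduled back". */
static void context_switch_sim(struct task *t)
{
	t->switches++;
}

/* Schedule variant that never runs sched_work, to prevent recursion. */
static void schedule_nowork(struct task *t)
{
	context_switch_sim(t);
}

/* The trace may need to sleep (e.g. on a fault), so it may reschedule. */
static void do_stack_trace(struct task *t)
{
	schedule_nowork(t);
	t->unwinds_done++;
}

static void schedule_sim(struct task *t)
{
	context_switch_sim(t);
	/* Back on the CPU, preemption enabled: safe to do the deferred work. */
	if (t->sched_work) {
		t->sched_work = false;	/* only the first schedule-in unwinds */
		do_stack_trace(t);
	}
}
```

Calling request_unwind() once and then schedule_sim() several times
performs a single trace, matching the "only on the first instance" point
above. The sketch does not address the wrinkle where @prev wakes up on
another CPU during the unwind.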

-- 
Josh