Message-ID: <20250123184305.rjuxj7hs3ond3e7c@jpoimboe>
Date: Thu, 23 Jan 2025 10:43:05 -0800
From: Josh Poimboeuf <jpoimboe@...nel.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, x86@...nel.org,
	Steven Rostedt <rostedt@...dmis.org>,
	Ingo Molnar <mingo@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	linux-kernel@...r.kernel.org, Indu Bhagat <indu.bhagat@...cle.com>,
	Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
	Ian Rogers <irogers@...gle.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	linux-perf-users@...r.kernel.org, Mark Brown <broonie@...nel.org>,
	linux-toolchains@...r.kernel.org, Jordan Rome <jordalgo@...a.com>,
	Sam James <sam@...too.org>, linux-trace-kernel@...r.kernel.org,
	Andrii Nakryiko <andrii.nakryiko@...il.com>,
	Jens Remus <jremus@...ux.ibm.com>,
	Florian Weimer <fweimer@...hat.com>,
	Andy Lutomirski <luto@...nel.org>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Weinan Liu <wnliu@...gle.com>
Subject: Re: [PATCH v4 28/39] unwind_user/deferred: Add deferred unwinding
 interface

On Thu, Jan 23, 2025 at 09:25:34AM +0100, Peter Zijlstra wrote:
> On Wed, Jan 22, 2025 at 08:05:33PM -0800, Josh Poimboeuf wrote:
> 
> > However... would it be a horrible idea for 'next' to unwind 'prev' after
> > the context switch???
> 
> The idea isn't terrible, but it will be all sorts of tricky.
> 
> The big immediate problem is that the CPU doing the context switch
> loses control over prev at:
> 
>   __schedule()
>     context_switch()
>       finish_task_switch()
>         finish_task()
>           smp_store_release(&prev->on_cpu, 0);
> 
> And this is before we drop rq->lock.
> 
> As of the instruction after that store, another CPU is free to claim
> the task and run with it. Notably, another CPU might already be
> spin-waiting on that state, trying to wake the task back up.
> 
> By the time we get to a schedulable context, @prev is completely out of
> bounds.

Could unwind_deferred_request() call migrate_disable() or so?

How bad would it be to set some bit in @prev to prevent it from getting
rescheduled until the unwind from @next has been done?  Unfortunately
two tasks would be blocked on the unwind instead of one.
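
To make the "set some bit in @prev" idea concrete, here is a rough
userspace model using C11 atomics. It is only a sketch, not kernel code:
struct model_task, the unwind_pending bit, and all model_*() helpers are
hypothetical names invented for illustration. The only thing taken from
the thread is that finish_task() clears on_cpu with a release store,
after which another CPU may claim the task.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model of a task; not the kernel's task_struct. */
struct model_task {
	atomic_int on_cpu;          /* 1 while a CPU is still running the task */
	atomic_bool unwind_pending; /* hypothetical "hold" bit from the proposal */
};

/* Old CPU: request a deferred unwind of prev before switching away. */
static inline void model_request_unwind(struct model_task *prev)
{
	atomic_store_explicit(&prev->unwind_pending, true,
			      memory_order_relaxed);
}

/*
 * Old CPU: models finish_task(). The release store orders the earlier
 * write of the hold bit before on_cpu becomes 0.
 */
static inline void model_finish_task(struct model_task *prev)
{
	atomic_store_explicit(&prev->on_cpu, 0, memory_order_release);
}

/* Running as next: the unwind of prev is finished, so drop the hold. */
static inline void model_unwind_done(struct model_task *prev)
{
	atomic_store_explicit(&prev->unwind_pending, false,
			      memory_order_release);
}

/* Another CPU: may it claim the task and run it? */
static inline bool model_can_claim(struct model_task *t)
{
	/* Pairs with the release store in model_finish_task(). */
	if (atomic_load_explicit(&t->on_cpu, memory_order_acquire))
		return false;
	/* The extra condition the proposal would add. */
	return !atomic_load_explicit(&t->unwind_pending,
				     memory_order_acquire);
}
```

In this model a waker spinning on model_can_claim() keeps spinning after
on_cpu drops to 0 until model_unwind_done() runs, which is exactly the
"two tasks blocked instead of one" cost.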

BTW, this might be useful for another reason.  In Steve's sframe meeting
yesterday there was some talk of BPF needing to unwind from
sched-switch, without having to wait indefinitely for @prev to get
rescheduled and return to user.

-- 
Josh
