Message-ID: <20250123221326.GD969@noisy.programming.kicks-ass.net>
Date: Thu, 23 Jan 2025 23:13:26 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Josh Poimboeuf <jpoimboe@...nel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, x86@...nel.org,
	Steven Rostedt <rostedt@...dmis.org>,
	Ingo Molnar <mingo@...nel.org>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	linux-kernel@...r.kernel.org, Indu Bhagat <indu.bhagat@...cle.com>,
	Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>, Namhyung Kim <namhyung@...nel.org>,
	Ian Rogers <irogers@...gle.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	linux-perf-users@...r.kernel.org, Mark Brown <broonie@...nel.org>,
	linux-toolchains@...r.kernel.org, Jordan Rome <jordalgo@...a.com>,
	Sam James <sam@...too.org>, linux-trace-kernel@...r.kernel.org,
	Andrii Nakryiko <andrii.nakryiko@...il.com>,
	Jens Remus <jremus@...ux.ibm.com>,
	Florian Weimer <fweimer@...hat.com>,
	Andy Lutomirski <luto@...nel.org>,
	Masami Hiramatsu <mhiramat@...nel.org>,
	Weinan Liu <wnliu@...gle.com>
Subject: Re: [PATCH v4 28/39] unwind_user/deferred: Add deferred unwinding
 interface

On Thu, Jan 23, 2025 at 10:43:05AM -0800, Josh Poimboeuf wrote:
> On Thu, Jan 23, 2025 at 09:25:34AM +0100, Peter Zijlstra wrote:
> > On Wed, Jan 22, 2025 at 08:05:33PM -0800, Josh Poimboeuf wrote:
> > 
> > > However... would it be a horrible idea for 'next' to unwind 'prev' after
> > > the context switch???
> > 
> > The idea isn't terrible, but it will be all sorts of tricky.
> > 
> > The big immediate problem is that the CPU doing the context switch
> > loses control over prev at:
> > 
> >   __schedule()
> >     context_switch()
> >       finish_task_switch()
> >         finish_task()
> >           smp_store_release(&prev->on_cpu, 0);
> > 
> > And this is before we drop rq->lock.
> > 
> > The instruction after that store, another CPU is free to claim the task
> > and run with it. Notably, another CPU might already be spin waiting on
> > that state, trying to wake the task back up.
> > 
> > By the time we get to a schedulable context, @prev is completely out of
> > bounds.
> 
> Could unwind_deferred_request() call migrate_disable() or so?

That's pretty vile... and might cause performance issues. You really
don't want things to magically start behaving differently just because
you're tracing.

> How bad would it be to set some bit in @prev to prevent it from getting
> rescheduled until the unwind from @next has been done?  Unfortunately
> two tasks would be blocked on the unwind instead of one.

Yeah, not going to happen. Those paths are complicated enough as is.

> BTW, this might be useful for another reason.  In Steve's sframe meeting
> yesterday there was some talk of BPF needing to unwind from
> sched-switch, without having to wait indefinitely for @prev to get
> rescheduled and return to user.

-EPONIES, you cannot take faults from the middle of schedule(). They can
always use the best effort FP unwind we have today.
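
The on_cpu hand-off quoted above can be modelled in user space. What follows is only a minimal sketch, assuming C11 atomics as a stand-in for smp_store_release(); struct task, release_prev(), try_claim() and demo() are hypothetical names for illustration, not kernel code, and the real wakeup path has more locking and state than shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the scheduler's task struct (not kernel code). */
struct task {
	atomic_int on_cpu;	/* 1 while some CPU still owns the task */
	int saved_state;	/* plain data published by the release store */
};

/*
 * Models the store in finish_task(): everything written before the release
 * store is visible to whoever observes on_cpu == 0 with acquire semantics.
 * After this returns, the old CPU must treat @prev as off limits.
 */
static void release_prev(struct task *prev, int state)
{
	prev->saved_state = state;
	atomic_store_explicit(&prev->on_cpu, 0, memory_order_release);
}

/*
 * Models another CPU claiming the task. The claim can only succeed once
 * on_cpu is 0, and it can succeed immediately after the store above.
 */
static bool try_claim(struct task *t, int *state_out)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong_explicit(&t->on_cpu, &expected, 1,
						     memory_order_acquire,
						     memory_order_relaxed))
		return false;

	*state_out = t->saved_state;
	return true;
}

/* Single-threaded walk through the ordering; returns the state observed. */
static int demo(void)
{
	struct task t;
	int seen = -1;

	atomic_init(&t.on_cpu, 1);
	t.saved_state = 0;

	assert(!try_claim(&t, &seen));	/* still owned by the old CPU */
	release_prev(&t, 42);		/* old CPU lets go of prev */
	assert(try_claim(&t, &seen));	/* new owner may run it right away */
	return seen;
}
```

In this model nothing stops try_claim() from succeeding between release_prev() and any later work on the releasing side, which is the window described in the quoted text: once the store is done, any unwind of prev from that CPU would race with the task's new owner.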
