Message-ID: <CANDhNCp2J+dWKzXg8fX_3Cnx1WHFLag_t9J62-rFtcQYHbrcwA@mail.gmail.com>
Date: Wed, 23 Apr 2025 14:41:42 -0700
From: John Stultz <jstultz@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...hat.com>, 
	Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Steven Rostedt <rostedt@...dmis.org>, 
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>, 
	Valentin Schneider <vschneid@...hat.com>, K Prateek Nayak <kprateek.nayak@....com>, kernel-team@...roid.com, 
	Frederic Weisbecker <fweisbec@...il.com>
Subject: Re: [RFC][PATCH] sched/core: Tweak wait_task_inactive() to force
 dequeue sched_delayed tasks

On Tue, Apr 22, 2025 at 1:56 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Mon, Apr 21, 2025 at 09:43:45PM -0700, John Stultz wrote:
> > It was reported that in 6.12, smpboot_create_threads() was
> > taking much longer than in 6.6.
> >
> > I narrowed down the call path to:
> >  smpboot_create_threads()
> >  -> kthread_create_on_cpu()
> >     -> kthread_bind()
> >        -> __kthread_bind_mask()
> >           -> wait_task_inactive()
> >
> > Where in wait_task_inactive() we were regularly hitting the
> > queued case, which sets a 1 tick timeout, which when called
> > multiple times in a row, accumulates quickly into a long
> > delay.
>
> Argh, this is all stupid :-(
>
> The whole __kthread_bind_*() thing is a bit weird, but fundamentally it
> tries to avoid a race vs current. Notably task_struct::flags is only ever
> modified by current, except here.
>
> delayed_dequeue is fine, except wait_task_inactive() hasn't been
> told about it (I hate that function, murder death kill etc.).

Hey Peter,
  So I hear your (sanguinary?) dissatisfaction with these functions,
but from your short discussion with Frederic it seems like
wait_task_inactive() and kthread_bind() are sticking around, so I'm
not sure what the next course of action should be.

Should I resend my patch (including Prateek's suggested tweaks to make
it a little nicer), so wait_task_inactive() is sched_delayed aware? Or
are you thinking we should solve this differently, ideally in some
form that can head to -stable to help folks hitting this in 6.12?

thanks
-john
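
For a rough sense of how the one-tick timeout in the queued case adds up
across repeated wait_task_inactive() calls, here is a back-of-the-envelope
sketch. It is only illustrative arithmetic, not kernel code: the helper
names tick_ms() and accumulated_delay_ms() are hypothetical, and the HZ
values and call counts are assumptions picked for the example, not
measurements from the report.

```python
def tick_ms(hz: int) -> float:
    """Length of one scheduler tick in milliseconds for a given HZ value."""
    if hz <= 0:
        raise ValueError("hz must be positive")
    return 1000.0 / hz


def accumulated_delay_ms(hz: int, queued_hits: int) -> float:
    """Total sleep time if each of `queued_hits` calls waits one full tick.

    Models only the worst case described in the thread: every call finds
    the task still queued and sleeps for exactly one tick.
    """
    if queued_hits < 0:
        raise ValueError("queued_hits must be non-negative")
    return queued_hits * tick_ms(hz)


if __name__ == "__main__":
    # Assumed HZ settings and an assumed count of 100 queued-case hits.
    for hz in (100, 250, 1000):
        total = accumulated_delay_ms(hz, 100)
        print(f"HZ={hz}: one tick = {tick_ms(hz)} ms, 100 hits = {total} ms")
```

Under these assumptions the delay grows linearly with the number of
calls that hit the queued case, which is why a long run of kthread_bind()
calls during smpboot_create_threads() can become noticeable.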
