Message-ID: <20250611093934.GB2273038@noisy.programming.kicks-ass.net>
Date: Wed, 11 Jun 2025 11:39:34 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: mingo@...hat.com, juri.lelli@...hat.com, dietmar.eggemann@....com,
	rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
	vschneid@...hat.com, clm@...a.com, linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH 5/5] sched: Add ttwu_queue support for delayed tasks

On Fri, Jun 06, 2025 at 06:55:37PM +0200, Vincent Guittot wrote:
> > > > @@ -3830,12 +3859,41 @@ void sched_ttwu_pending(void *arg)
> > > >         update_rq_clock(rq);
> > > >
> > > >         llist_for_each_entry_safe(p, t, llist, wake_entry.llist) {
> > > > +               struct rq *p_rq = task_rq(p);
> > > > +               int ret;
> > > > +
> > > > +               /*
> > > > +                * This is the ttwu_runnable() case. Notably it is possible for
> > > > +                * on-rq entities to get migrated -- even sched_delayed ones.
> > >
> > > I haven't found where the sched_delayed task could migrate on another cpu.
> >
> > Doesn't happen often, but it can happen. Nothing really stops it from
> > happening. Eg weight based balancing can do it. As can numa balancing
> > and affinity changes.
> 
> Yes, I agree that delayed tasks can migrate because of load balancing
> but not at wake up.

Right, but this here is the case where wakeup races with load-balancing.
Specifically, due to the wake_list, the wakeup can happen while the task
is on CPU N, and by the time the IPI gets processed the task has moved
to CPU M.
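
To make the ordering concrete, here is a toy userspace model of that
race. It is only a sketch: toy_task, toy_rq and the toy_*() helpers are
made-up names for illustration, not the kernel's structures or the code
in this patch, and there is no real concurrency or locking here, just
the three steps executed in the problematic order.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4

/* Toy stand-ins; none of these are the kernel's real definitions. */
struct toy_task {
	int cpu;               /* CPU whose runqueue currently holds the task */
	int on_rq;             /* still enqueued, e.g. a delayed task */
	struct toy_task *next; /* link in a per-CPU pending-wakeup list */
};

struct toy_rq {
	int cpu;
	struct toy_task *wake_list; /* wakeups waiting for this CPU's IPI */
};

static struct toy_rq runqueues[NR_CPUS];

static void toy_init(void)
{
	for (int i = 0; i < NR_CPUS; i++) {
		runqueues[i].cpu = i;
		runqueues[i].wake_list = NULL;
	}
}

/* Waker side: queue the wakeup on the CPU the task is on right now (N). */
static void toy_queue_wakeup(struct toy_task *p)
{
	assert(p->cpu >= 0 && p->cpu < NR_CPUS);

	struct toy_rq *rq = &runqueues[p->cpu];

	p->next = rq->wake_list;
	rq->wake_list = p;
}

/* Load-balancer side: an on-rq task gets moved to another CPU (M). */
static void toy_migrate(struct toy_task *p, int dst_cpu)
{
	p->cpu = dst_cpu;
}

/*
 * IPI side on CPU N: drain the pending list and count the tasks that are
 * still on a runqueue, but no longer on this one.
 */
static int toy_process_pending(int cpu)
{
	struct toy_rq *rq = &runqueues[cpu];
	struct toy_task *p = rq->wake_list, *t;
	int moved = 0;

	rq->wake_list = NULL;
	for (; p; p = t) {
		t = p->next;
		p->next = NULL;
		if (p->on_rq && p->cpu != rq->cpu)
			moved++; /* queued for CPU N, now lives on CPU M */
	}
	return moved;
}
```

Queueing a wakeup for a task on CPU 1, migrating it to CPU 2, and only
then draining CPU 1's list makes toy_process_pending(1) report one moved
task, which is the situation the task_rq(p) lookup in the hunk above has
to cope with.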

It doesn't happen often, but it was 'fun' chasing that fail around for a
day :/
