Message-ID: <YKe94oTVSbywMw2r@localhost.localdomain>
Date: Fri, 21 May 2021 16:04:18 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Quentin Perret <qperret@...gle.com>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>,
Will Deacon <will@...nel.org>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
linux-arm-kernel@...ts.infradead.org, linux-arch@...r.kernel.org,
linux-kernel@...r.kernel.org,
Catalin Marinas <catalin.marinas@....com>,
Marc Zyngier <maz@...nel.org>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Peter Zijlstra <peterz@...radead.org>,
Morten Rasmussen <morten.rasmussen@....com>,
Qais Yousef <qais.yousef@....com>,
Suren Baghdasaryan <surenb@...gle.com>,
Tejun Heo <tj@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Ingo Molnar <mingo@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
"Rafael J. Wysocki" <rjw@...ysocki.net>, kernel-team@...roid.com
Subject: Re: [PATCH v6 13/21] sched: Admit forcefully-affined tasks into
SCHED_DEADLINE
On 21/05/21 13:02, Quentin Perret wrote:
...
> So I think Will has a point since, IIRC, the root domains get rebuilt
> during hotplug. So you can imagine a case with a single root domain, but
> CPUs 4-7 are offline. In this case, sched_setattr() will happily promote
> a task to DL as long as its affinity mask is a superset of the rd span,
> but things may get ugly when CPUs are plugged back in later on.
>
> This looks like an existing bug though. I just tried the following on a
> system with 4 CPUs:
>
> // Create a task affined to CPU [0-2]
> > while true; do echo "Hi" > /dev/null; done &
> [1] 560
> > mypid=$!
> > taskset -p 7 $mypid
> pid 560's current affinity mask: f
> pid 560's new affinity mask: 7
>
> // Try to move it to DL; this should fail because of the affinity
> > chrt -d -T 5000000 -P 16666666 -p 0 $mypid
> chrt: failed to set pid 560's policy: Operation not permitted
>
> // Offline CPU 3, so the rd now covers CPUs 0-2 only
> > echo 0 > /sys/devices/system/cpu/cpu3/online
> [ 400.843830] CPU3: shutdown
> [ 400.844100] psci: CPU3 killed (polled 0 ms)
>
> // Try to admit the task again, which now succeeds
> > chrt -d -T 5000000 -P 16666666 -p 0 $mypid
>
> // Plug CPU3 back online
> > echo 1 > /sys/devices/system/cpu/cpu3/online
> [ 408.819337] Detected PIPT I-cache on CPU3
> [ 408.819642] GICv3: CPU3: found redistributor 3 region 0:0x0000000008100000
> [ 408.820165] CPU3: Booted secondary processor 0x0000000003 [0x410fd083]
>
> I don't see any easy way to fix this w/o iterating over all deadline
> tasks in the rd when hotplugging a CPU back on, and blocking the hotplug
> operation if it'll cause affinity issues. Urgh.
>
Yeah this looks like a plain existing bug, joy. :)
We fixed a few around admission control (AC) lately, but I guess the work wasn't complete.
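For anyone following along, the sequence quoted above can be modelled with plain bitmasks. This is only a toy sketch in POSIX sh, not the kernel code. It assumes the affinity part of admission boils down to "the task's mask must be a superset of the rd span", as described above. The helper names (covers_span, admit, online_would_break) are made up for illustration.

```sh
#!/bin/sh
# Toy model only; masks are CPU bitmasks (0x7 = CPUs 0-2, 0xf = CPUs 0-3).
# Helper names are hypothetical, not kernel or util-linux interfaces.

# True when the affinity mask ($1) includes every CPU of the rd span ($2).
covers_span() {
    [ $(( $1 & $2 )) -eq $(( $2 )) ]
}

# Print the outcome of the affinity part of admission for mask $1, span $2.
admit() {
    if covers_span "$1" "$2"; then
        echo "admitted"
    else
        echo "rejected"
    fi
}

# The check suggested in the quoted text: before a CPU comes back online,
# walk the affinity masks of already-admitted DL tasks ($2...) against the
# span the rd would have afterwards ($1). Succeeds (returns 0) if any task
# would no longer cover that span.
online_would_break() {
    new_span=$1
    shift
    for mask in "$@"; do
        covers_span "$mask" "$new_span" || return 0
    done
    return 1
}

task_mask=0x7    # same mask as "taskset -p 7" in the transcript
echo "span 0xf (all online):   $(admit "$task_mask" 0xf)"
echo "span 0x7 (CPU3 offline): $(admit "$task_mask" 0x7)"
if online_would_break 0xf "$task_mask"; then
    echo "onlining CPU3 would leave an admitted task not covering the span"
fi
```

Run as is, the model rejects the task while all four CPUs are online, admits it once CPU3 is offline, and then flags the step that brings CPU3 back. That mirrors the transcript, with the difference that nothing in the real hotplug path performs the last check today.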
Thanks,
Juri