Message-ID: <jhjblo2vx60.mognet@arm.com>
Date: Wed, 08 Apr 2020 16:01:43 +0100
From: Valentin Schneider <valentin.schneider@....com>
To: luca abeni <luca.abeni@...tannapisa.it>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Steven Rostedt <rostedt@...dmis.org>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Wei Wang <wvw@...gle.com>, Quentin Perret <qperret@...gle.com>,
Alessio Balsini <balsini@...gle.com>,
Pavan Kondeti <pkondeti@...eaurora.org>,
Patrick Bellasi <patrick.bellasi@...bug.net>,
Morten Rasmussen <morten.rasmussen@....com>,
Qais Yousef <qais.yousef@....com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] sched/deadline: Improve admission control for asymmetric CPU capacities
On 08/04/20 14:30, luca abeni wrote:
>>
>> I don't think this is strictly equivalent to what we have now for the
>> SMP case. 'cpus' used to come from dl_bw_cpus(), which is an ugly way
>> of writing
>>
>> cpumask_weight(rd->span AND cpu_active_mask);
>>
>> The rd->cpu_capacity_orig field you added gets set once per domain
>> rebuild, so it also happens in sched_cpu_(de)activate() but is
separate from touching cpu_active_mask. AFAICT this means we can
>> observe a CPU as !active but still see its capacity_orig accounted in
>> a root_domain.
>
> Sorry, I suspect this is my fault, because the bug comes from my
> original patch.
> When I wrote the original code, I believed that when a CPU is
> deactivated it is also removed from its root domain.
>
> I now see that I was wrong.
>
Well, it is indeed the case, but sadly it's not an atomic step - AFAICT with
cpusets we do hold some cpuset lock when calling __dl_overflow() and when
rebuilding the domains, but not when fiddling with the active mask.
I just realized it's even more obvious for dl_cpu_busy(): IIUC it is meant
to prevent the removal of a CPU if that would lead to a DL overflow - it
works now because the active mask is modified before it gets called, but
here it breaks because it's called before the sched_domain rebuild, i.e.
while rd->cpu_capacity_orig still accounts for the outgoing CPU.
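
To spell out the ordering I'm worried about (written from memory, so
take the exact call chain as my assumption rather than gospel):

  sched_cpu_deactivate()
    set_cpu_active(cpu, false)        // active mask updated first
    cpuset_cpu_inactive()
      dl_cpu_busy()                   // admission check sees the new mask...
      cpuset_update_active_cpus()     // ...but the domains (and thus the
                                      // rd capacity sum) are rebuilt later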
Perhaps re-computing the root domain capacity sum at every dl_bw_cpus()
call would be simpler. It's a bit more work, but we already have a
for_each_cpu_*() loop there, and then we only rely on the masks being
correct - see the sketch below.
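
Something like the below, say - a rough, untested sketch; the
dl_bw_capacity() name is made up for illustration, and I'm assuming
arch_scale_cpu_capacity() as the source of capacity_orig:

  static inline unsigned long dl_bw_capacity(int i)
  {
          struct root_domain *rd = cpu_rq(i)->rd;
          unsigned long cap = 0;

          /* Same walk as dl_bw_cpus(): rd->span restricted to active CPUs */
          for_each_cpu_and(i, rd->span, cpu_active_mask)
                  cap += arch_scale_cpu_capacity(i);

          return cap;
  }

That keeps the sum coherent with cpu_active_mask by construction, at the
cost of walking the span on every admission control check.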
>
> Luca