linux-kernel - Re: [PATCH 2/4] sched/deadline: Improve admission control for asymmetric CPU capacities

Message-ID: <31620965-e1e7-6854-ad46-8192ee4b41af@arm.com>
Date:   Thu, 9 Apr 2020 19:29:45 +0200
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Valentin Schneider <valentin.schneider@....com>,
        luca abeni <luca.abeni@...tannapisa.it>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Wei Wang <wvw@...gle.com>, Quentin Perret <qperret@...gle.com>,
        Alessio Balsini <balsini@...gle.com>,
        Pavan Kondeti <pkondeti@...eaurora.org>,
        Patrick Bellasi <patrick.bellasi@...bug.net>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Qais Yousef <qais.yousef@....com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] sched/deadline: Improve admission control for
 asymmetric CPU capacities

On 08.04.20 17:01, Valentin Schneider wrote:
> 
> On 08/04/20 14:30, luca abeni wrote:
>>>
>>> I don't think this is strictly equivalent to what we have now for the
>>> SMP case. 'cpus' used to come from dl_bw_cpus(), which is an ugly way
>>> of writing
>>>
>>>      cpumask_weight(rd->span AND cpu_active_mask);
>>>
>>> The rd->cpu_capacity_orig field you added gets set once per domain
>>> rebuild, so it also happens in sched_cpu_(de)activate() but is
>>> separate from touching cpu_active_mask. AFAICT this means we can
>>> observe a CPU as !active but still see its capacity_orig accounted in
>>> a root_domain.
>>
>> Sorry, I suspect this is my fault, because the bug comes from my
>> original patch.
>> When I wrote the original code, I believed that when a CPU is
>> deactivated it is also removed from its root domain.
>>
>> I now see that I was wrong.
>>
> 
> Well it is indeed the case, but sadly it's not an atomic step - AFAICT with
> cpusets we do hold some cpuset lock when calling __dl_overflow() and when
> rebuilding the domains, but not when fiddling with the active mask.
> 
> I just realized it's even more obvious for dl_cpu_busy(): IIUC it is meant
> to prevent the removal of a CPU if it would lead to a DL overflow - it
> works now because the active mask is modified before it gets called, but
> here it breaks because it's called before the sched_domain rebuild.
> 
> Perhaps re-computing the root domain capacity sum at every dl_bw_cpus()
> call would be simpler. It's a bit more work, but then we already have a
> for_each_cpu_*() loop, and we only rely on the masks being correct.

Maybe we can do a hybrid. We have rd->span and rd->sum_cpu_capacity and
with the help of an extra per-cpu cpumask we could just

DEFINE_PER_CPU(cpumask_var_t, dl_bw_mask);

static inline int dl_bw_cpus(int i) {

    /* per-CPU scratch mask, avoids an on-stack cpumask */
    struct cpumask *cpus = this_cpu_cpumask_var_ptr(dl_bw_mask);
    ...
    cpumask_and(cpus, rd->span, cpu_active_mask);

    return cpumask_weight(cpus);
}

and

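static inline unsigned long dl_bw_capacity(int i) {

    struct cpumask *cpus = this_cpu_cpumask_var_ptr(dl_bw_mask);
    unsigned long cap = 0;
    ...
    cpumask_and(cpus, rd->span, cpu_active_mask);
    /* fast path: every CPU in the span is active, use the cached sum */
    if (cpumask_equal(cpus, rd->span))
        return rd->sum_cpu_capacity;

    /* slow path: sum only the CPUs that are in the span and active */
    for_each_cpu(i, cpus)
        cap += capacity_orig_of(i);

    return cap;
}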

So we would only have to sum up the capacities again when some CPU in
rd->span is not in cpu_active_mask.
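
To make the fast path vs. slow path concrete, here is a rough userspace
model of the idea, not kernel code. Everything in it is an assumption made
for illustration: the struct root_domain_model type, the *_model() helpers,
plain 32-bit masks standing in for cpumasks, and the capacity values (four
little CPUs at 446, four big CPUs at 1024).

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8

/* Hypothetical per-CPU original capacities: 4 little + 4 big CPUs. */
static const unsigned long capacity_orig[NR_CPUS] = {
    446, 446, 446, 446, 1024, 1024, 1024, 1024
};

/* Hypothetical stand-in for the root_domain fields discussed above. */
struct root_domain_model {
    uint32_t span;                  /* bit i set: CPU i is in the span */
    unsigned long sum_cpu_capacity; /* cached when the domain is rebuilt */
};

/* Number of CPUs set in mask. */
static int mask_weight(uint32_t mask)
{
    int n = 0;

    for (int i = 0; i < NR_CPUS; i++)
        if (mask & (1u << i))
            n++;
    return n;
}

/* Sum of capacity_orig[] over the CPUs set in mask. */
static unsigned long mask_capacity(uint32_t mask)
{
    unsigned long cap = 0;

    for (int i = 0; i < NR_CPUS; i++)
        if (mask & (1u << i))
            cap += capacity_orig[i];
    return cap;
}

/* Models a domain rebuild: the only place the cached sum is updated. */
static void rd_rebuild(struct root_domain_model *rd, uint32_t span)
{
    assert((span >> NR_CPUS) == 0);
    rd->span = span;
    rd->sum_cpu_capacity = mask_capacity(span);
}

/* Counts CPUs that are both in the span and active. */
static int dl_bw_cpus_model(const struct root_domain_model *rd,
                            uint32_t active)
{
    return mask_weight(rd->span & active);
}

/* Cached sum if the whole span is active, otherwise re-sum. */
static unsigned long dl_bw_capacity_model(const struct root_domain_model *rd,
                                          uint32_t active)
{
    uint32_t cpus = rd->span & active;

    if (cpus == rd->span)
        return rd->sum_cpu_capacity;

    return mask_capacity(cpus);
}
```

With a span of 0xff, the cached sum is 5880. If CPU7 is cleared from the
active mask before the domain is rebuilt (active = 0x7f), the model reports
7 CPUs and a capacity of 4856 even though the cached sum is stale, which is
the window Valentin described.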