Message-ID: <CANLsYkyChXrqjZTG61SSi_AjB3f=Yo+H2xSWTERK+RtqKtT-+w@mail.gmail.com>
Date:   Mon, 5 Feb 2018 11:58:37 -0700
From:   Mathieu Poirier <mathieu.poirier@...aro.org>
To:     Juri Lelli <juri.lelli@...hat.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Li Zefan <lizefan@...wei.com>, Ingo Molnar <mingo@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Claudio Scordino <claudio@...dence.eu.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        Tommaso Cucinotta <tommaso.cucinotta@...tannapisa.it>,
        "luca.abeni" <luca.abeni@...tannapisa.it>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH V2 3/7] sched/deadline: Keep new DL task within root
 domain's boundary

On 2 February 2018 at 07:35, Juri Lelli <juri.lelli@...hat.com> wrote:
> Hi Mathieu,
>
> On 01/02/18 09:51, Mathieu Poirier wrote:
>> When considering moving a task to the DL policy we need to make sure
>> the CPUs it is allowed to run on match the CPUs of the root domain of
>> the runqueue it is currently assigned to.  Otherwise the task will be
>> allowed to roam on CPUs outside of this root domain, something that will
>> skew system deadline statistics and potentially lead to overselling DL
>> bandwidth.
>>
>> For example, say we have a 4 core system split into 2 cpusets: set1 has
>> CPU 0 and 1 while set2 has CPU 2 and 3.  This results in 3 cpusets - the
>> default set that has all 4 CPUs along with set1 and set2 as just depicted.
>> We also have task A that hasn't been assigned to any cpuset and as such,
>> is part of the default cpuset.
>>
>> At the time we want to move task A to a DL policy it has been assigned to
>> CPU1.  Since CPU1 is part of set1 the root domain will have 2 CPUs in it
>> and the bandwidth constraint is checked against the current DL bandwidth
>> allotment of those 2 CPUs.
>
> Wait.. I'm confused. :)

Rightly so - it is confusing.

>
> Did you disable cpuset.sched_load_balance in the root (default) cpuset?

Correct.  I was trying to be as clear as possible but also avoid
writing too much - I'll make that fact explicit in the next revision.

> If yes, we would end up with 2 root domains and if task A happens to be
> on root domain (0-1) checking its admission against 2 CPUs looks like
> the right thing to do to me.

So the task is running on CPU1 and as such admission control will be
done against root domain (0-1).  The problem is that task A isn't
part of set1 (hence root domain (0-1)); it is part of the default
cpuset, which also includes the CPUs of root domain (2-3), so the task
can still run outside the root domain it was admitted against.
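
To make the scenario concrete, here is a small userspace sketch (not
kernel code) that models CPU masks as plain bit masks.  The names
cpu_range() and fits_root_domain() are made up for illustration; the
kernel works on struct cpumask with helpers such as cpumask_subset().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: bit mask covering CPUs first..last (inclusive). */
static uint32_t cpu_range(unsigned int first, unsigned int last)
{
	uint32_t mask = 0;

	assert(first <= last && last < 32);
	for (unsigned int cpu = first; cpu <= last; cpu++)
		mask |= UINT32_C(1) << cpu;
	return mask;
}

/*
 * Hypothetical check: true when every CPU the task may run on lies
 * inside the span of the root domain used for admission control.
 */
static bool fits_root_domain(uint32_t task_cpus, uint32_t rd_span)
{
	return (task_cpus & ~rd_span) == 0;
}
```

With task A in the default cpuset (CPUs 0-3) and admission control done
against root domain (0-1), fits_root_domain(cpu_range(0, 3),
cpu_range(0, 1)) is false, which is the mismatch described above.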


> If no, then there is a single root domain
> (the root/default one) with 4 CPUs, and it indeed seems that we've
> probably got a problem: it is possible for a DEADLINE task running on
> root/default cpuset to be put in (for example) 0-1 cpuset, and so
> restrict its affinity. Is it this that this patch cures?

That is exactly what this patch does.  It will prevent a task from
being promoted to DL if it is part of a cpuset (any cpuset) that has
its cpuset.sched_load_balance flag disabled and also has populated
children cpusets.  That way we prevent tasks from spanning multiple
root domains.
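
As a rough illustration of that rule only, the sketch below uses a toy
data structure, not the kernel's struct cpuset.  The struct toy_cpuset
type, its fields and may_promote_to_dl() are all hypothetical, and what
exactly counts as a "populated" child is left to the patch; here it is
just a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cpuset; field names are illustrative only. */
struct toy_cpuset {
	bool sched_load_balance;            /* cpuset.sched_load_balance */
	bool populated;                     /* assumption: a simple flag */
	const struct toy_cpuset **children; /* direct child cpusets */
	size_t nr_children;
};

/* True if at least one direct child is marked populated. */
static bool has_populated_child(const struct toy_cpuset *cs)
{
	for (size_t i = 0; i < cs->nr_children; i++)
		if (cs->children[i]->populated)
			return true;
	return false;
}

/*
 * Refuse promotion to DL when the task's cpuset has load balancing
 * disabled and also has populated children, since such a cpuset
 * spans more than one root domain.
 */
static bool may_promote_to_dl(const struct toy_cpuset *cs)
{
	assert(cs != NULL);
	return cs->sched_load_balance || !has_populated_child(cs);
}
```

In the 4 core example, the default cpuset has load balancing disabled
and two populated children (set1 and set2), so may_promote_to_dl()
returns false for it, while it returns true for set1 and set2.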

>
> Anyway, see more comments below..
>
> [...]
>
>>       /*
>> +      * If setscheduling to SCHED_DEADLINE we need to make sure the task
>> +      * is constrained to run within the root domain it is associated with,
>> +      * something that isn't guaranteed when using cpusets.
>> +      *
>> +      * Speaking of cpusets, we also need to assert that a task's
>> +      * cpus_allowed mask equals its cpuset's cpus_allowed mask. Otherwise
>> +      * a DL task could be assigned to a cpuset that has more CPUs than the
>> +      * root domain it is associated with, a situation that yields no
>> +      * benefits and greatly complicates the management of DL tasks when
>> +      * cpusets are present.
>> +      */
>> +     if (dl_policy(policy)) {
>> +             struct root_domain *rd = cpu_rq(task_cpu(p))->rd;
>
> I fear root_domain doesn't exist on UP.
>
> Maybe this logic can be put above changing the check we already do
> against the span?

Yes, indeed.  I'll fix that.

>
> https://elixir.free-electrons.com/linux/latest/source/kernel/sched/core.c#L4174
>
> Best,
>
> - Juri
