Message-ID: <204d67f1-a21c-9d71-9b76-6f1a11c89092@arm.com>
Date: Tue, 12 May 2020 14:39:13 +0200
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Juri Lelli <juri.lelli@...hat.com>
Cc: Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Steven Rostedt <rostedt@...dmis.org>,
Luca Abeni <luca.abeni@...tannapisa.it>,
Daniel Bristot de Oliveira <bristot@...hat.com>,
Wei Wang <wvw@...gle.com>, Quentin Perret <qperret@...gle.com>,
Alessio Balsini <balsini@...gle.com>,
Pavan Kondeti <pkondeti@...eaurora.org>,
Patrick Bellasi <patrick.bellasi@...bug.net>,
Morten Rasmussen <morten.rasmussen@....com>,
Valentin Schneider <valentin.schneider@....com>,
Qais Yousef <qais.yousef@....com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 3/6] sched/deadline: Add dl_bw_capacity()
On 11/05/2020 10:01, Juri Lelli wrote:
> On 06/05/20 17:09, Dietmar Eggemann wrote:
>> On 06/05/2020 14:37, Juri Lelli wrote:
>>> On 06/05/20 12:54, Dietmar Eggemann wrote:
>>>> On 27/04/2020 10:37, Dietmar Eggemann wrote:
[...]
>>> to say that we actually want to check new tasks bw requirement against
>>> the available bandwidth of the particular CPU they happen to be running
>>> (and will continue to run) when setscheduler is called.
>>
>> By 'available bandwidth of the particular CPU' you refer to
>> '\Sum_{cpu_rq(i)->rd->span} CPU capacity', right?
>
> No. I was referring to the single CPU capacity. The capacity of the CPU
> where a task is running when setscheduler is called for it (and DL AC
> performed). See below, maybe more clear why I wondered about this case..

OK, got it! I was just confused, since I don't think that this patch
introduced the issue.

Before the patch, 'int cpus = dl_bw_cpus(task_cpu(p))' was used, which
returns the number of CPUs in the (default) rd (n). So instead of the
capacity of a single CPU (1024) we use n*1024.

I wonder if a fix for that should be part of this patch-set?
[...]
>> ...
>> [ 144.920102] __dl_bw_capacity CPU3 rd->span=3-5 return 1338
>> [ 144.925607] sched_dl_overflow: [bash 1999] task_cpu(p)=3 cap=1338 cpus_ptr=3-5
>
> So, here you are checking new task bw against 1338 which is 3*L
> capacity. However, since load balance is disabled at this point for 3-5,
> once admitted the task will only be able to run on CPU 3. Now, if more
> tasks on CPU 3 are admitted the same way (up to 1338) I believe they
> will start to experience deadline misses because only 446 will be
> actually available to them until load balance is enabled below and they
> are then free to migrate to CPUs 4 and 5.
>
> Does it makes sense?
Yes, it does.

So my first idea would be to consider only that CPU (i.e. its CPU
capacity) in case we detect 'cpu_rq(cpu)->rd == def_root_domain'.

If I re-enable load balancing on cpuset '/', we can't make a task in
cpuset 'B' DL, since we hit this check in __sched_setscheduler():

	/*
	 * Don't allow tasks with an affinity mask smaller than
	 * the entire root_domain to become SCHED_DEADLINE.
	...
	 */
	if (!cpumask_subset(span, p->cpus_ptr) || ...

root@...o:~# echo 1 > /sys/fs/cgroup/cpuset/cpuset.sched_load_balance
root@...o:~# echo $$ > /sys/fs/cgroup/cpuset/B/tasks
root@...o:~# chrt -d --sched-runtime 8000 --sched-period 16000 -p 0 $$
chrt: failed to set pid 2316's policy: Operation not permitted
So this task has to leave 'B' first I assume.
[...]