Message-ID: <fe35049b-eef0-580f-aac6-b525a5e7f2a5@arm.com>
Date: Thu, 27 Feb 2020 15:15:37 +0000
From: Dietmar Eggemann <dietmar.eggemann@....com>
To: Vincent Guittot <vincent.guittot@...aro.org>,
Tao Zhou <zhout@...aldi.net>
Cc: Ben Segall <bsegall@...gle.com>, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Juri Lelli <juri.lelli@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>,
Mel Gorman <mgorman@...e.de>,
linux-kernel <linux-kernel@...r.kernel.org>,
Phil Auld <pauld@...hat.com>, Parth Shah <parth@...ux.ibm.com>,
Valentin Schneider <valentin.schneider@....com>,
Hillf Danton <hdanton@...a.com>
Subject: Re: [PATCH] sched/fair: fix runnable_avg for throttled cfs
On 27.02.20 13:12, Vincent Guittot wrote:
> On Thu, 27 Feb 2020 at 14:10, Tao Zhou <zhout@...aldi.net> wrote:
>>
>> Hi Dietmar,
>>
>> On Thu, Feb 27, 2020 at 11:20:05AM +0000, Dietmar Eggemann wrote:
>>> On 26.02.20 21:01, Vincent Guittot wrote:
>>>> On Wed, 26 Feb 2020 at 20:04, <bsegall@...gle.com> wrote:
>>>>>
>>>>> Vincent Guittot <vincent.guittot@...aro.org> writes:
>>>>>
>>>>>> When a cfs_rq is throttled, its group entity is dequeued and its running
>>>>>> tasks are removed. We must update runnable_avg with current h_nr_running
>>>>>> and update group_se->runnable_weight with new h_nr_running at each level
>>>
>>> ^^^
>>>
>>> Shouldn't this be 'current' rather than 'new' h_nr_running for
>>> group_se->runnable_weight? IMHO, you want to cache the current value
>>> before you add/subtract task_delta.
>>
>> /me think Vincent is right. h_nr_running is updated in the previous
>> level or out. The next level will use current h_nr_running to update
>> runnable_avg and use the new group cfs_rq's h_nr_running which was
>> updated in the previous level or out to update se runnable_weight.

Ah OK, so 'old' is the cached value of se->runnable_weight, and 'new' is
the se->runnable_weight which gets updated *after* update_load_avg()
and before the +/- task_delta.

So when we throttle e.g. /tg1/tg11

  previous level is: /tg1/tg11
  next level:        /tg1

loop for /tg1:

  for_each_sched_entity(se)
    cfs_rq = cfs_rq_of(se);

    update_load_avg(cfs_rq, se ...)  <-- uses 'old' se->runnable_weight

    se->runnable_weight = se->my_q->h_nr_running  <-- 'new' value
                                                      (updated in previous
                                                       level, group cfs_rq)
[...]
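
The ordering described above can be sketched as a small user-space model. This is only an illustration, not the kernel code: the structs keep just the fields mentioned in this thread, and toy_update_load_avg(), toy_throttle() and the seen_by_pelt field are hypothetical names made up for the sketch. It assumes the first level (the throttled group's own entity) is dequeued, which is not modelled, and that every higher level stays enqueued.

```c
#include <stddef.h>

/* Toy stand-ins for the kernel structures; only fields from this thread. */
struct cfs_rq {
	unsigned int h_nr_running;
};

struct sched_entity {
	unsigned int runnable_weight;	/* cached copy of my_q->h_nr_running */
	unsigned int seen_by_pelt;	/* hypothetical: value the PELT update saw */
	struct cfs_rq *cfs_rq;		/* queue this entity is enqueued on */
	struct cfs_rq *my_q;		/* queue owned by this group entity */
	struct sched_entity *parent;
};

/* Hypothetical stand-in for update_load_avg(): records the weight it used. */
static void toy_update_load_avg(struct sched_entity *se)
{
	se->seen_by_pelt = se->runnable_weight;
}

/*
 * Walk up from the throttled group's entity. At each level above the
 * first, the PELT update runs with the 'old' cached runnable_weight,
 * then the cache is refreshed from my_q->h_nr_running, which the
 * previous level already reduced by task_delta.
 */
static void toy_throttle(struct sched_entity *group_se, unsigned int task_delta)
{
	struct sched_entity *se;
	int first_level = 1;

	for (se = group_se; se != NULL; se = se->parent) {
		struct cfs_rq *qcfs_rq = se->cfs_rq;

		if (!first_level) {
			toy_update_load_avg(se);		/* uses 'old' value */
			se->runnable_weight = se->my_q->h_nr_running; /* 'new' */
		}
		/* The dequeue at the first level is assumed and omitted. */
		qcfs_rq->h_nr_running -= task_delta;
		first_level = 0;
	}
}
```

With /tg1/tg11 holding 2 tasks, /tg1's cfs_rq at h_nr_running = 3 and the root at 4, calling toy_throttle() on tg11's entity with task_delta = 2 lets the update for /tg1's entity see the old weight 3 and then caches the new weight 1.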