Message-ID: <274ebb9a-9328-e312-f554-34da8b183932@arm.com>
Date:   Thu, 20 Feb 2020 14:38:59 +0100
From:   Dietmar Eggemann <dietmar.eggemann@....com>
To:     Vincent Guittot <vincent.guittot@...aro.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
        linux-kernel <linux-kernel@...r.kernel.org>,
        Phil Auld <pauld@...hat.com>, Parth Shah <parth@...ux.ibm.com>,
        Valentin Schneider <valentin.schneider@....com>,
        Hillf Danton <hdanton@...a.com>
Subject: Re: [PATCH v2 1/5] sched/fair: Reorder enqueue/dequeue_task_fair path

On 19/02/2020 17:26, Vincent Guittot wrote:
> On Wed, 19 Feb 2020 at 12:07, Dietmar Eggemann <dietmar.eggemann@....com> wrote:
>>
>> On 18/02/2020 15:15, Vincent Guittot wrote:
>>> On Tue, 18 Feb 2020 at 14:22, Peter Zijlstra <peterz@...radead.org> wrote:
>>>>
>>>> On Tue, Feb 18, 2020 at 01:37:37PM +0100, Dietmar Eggemann wrote:
>>>>> On 14/02/2020 16:27, Vincent Guittot wrote:
>>>>>> The walk through the cgroup hierarchy during the enqueue/dequeue of a task
>>>>>> is split in 2 distinct parts for throttled cfs_rq without any added value
>>>>>> but making code less readable.
>>>>>>
>>>>>> Change the code ordering such that everything related to a cfs_rq
>>>>>> (throttled or not) will be done in the same loop.
>>>>>>
>>>>>> In addition, the same steps ordering is used when updating a cfs_rq:
>>>>>> - update_load_avg
>>>>>> - update_cfs_group
>>>>>> - update *h_nr_running
>>>>>
>>>>> Is this code change really necessary? You pay with two extra goto's. We
>>>>> still have the two for_each_sched_entity(se)'s because of 'if
>>>>> (se->on_rq); break;'.
>>>>
>>>> IIRC he relies on the presented ordering in patch #5 -- adding the
>>>> running_avg metric.
>>>
>>> Yes, that's the main reason, updating load_avg before h_nr_running
>>
>> My hunch is you refer to the new function:
>>
>> static inline void se_update_runnable(struct sched_entity *se)
>> {
>>         if (!entity_is_task(se))
>>                 se->runnable_weight = se->my_q->h_nr_running;
>> }
>>
>> I don't see the dependency to the 'update_load_avg -> h_nr_running'
>> order since it operates on se->my_q, not cfs_rq = cfs_rq_of(se), i.e.
>> se->cfs_rq.
>>
>> What do I miss here?
> 
> update_load_avg() updates both se and cfs_rq so if you update
> cfs_rq->h_nr_running before calling update_load_avg() like in the 2nd
> for_each_sched_entity, you will update cfs_rq runnable_avg for the
> past time slot with the new h_nr_running value instead of the previous
> value.

Ah, now I see:

update_load_avg()
  update_cfs_rq_load_avg()
    __update_load_avg_cfs_rq()
       ___update_load_sum(..., cfs_rq->h_nr_running, ...)
                               ^^^^^^^^^^^^^^^^^^^^

Not really obvious IMHO, since the code is introduced only in 4/5.

Could you add a comment to this patch header?

I see you mentioned this dependency already in v1 discussion

https://lore.kernel.org/r/CAKfTPtAM=kgF7Fz-JKFY+s_k5KFirs-8Bub3s1Eqtq7P0NMa0w@mail.gmail.com

"... But the following patches make PELT using h_nr_running ...".

IMHO it would be helpful to have this explanation in the 1/5 patch
header so people stop wondering why this is necessary.
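
For illustration only, the ordering dependency can be sketched with a toy
model. This is not kernel code: struct toy_cfs_rq and the toy_* helpers are
hypothetical names, and the accumulation is a plain sum of
h_nr_running * elapsed time, with none of the geometric decay that the real
PELT code applies. It only shows that bumping h_nr_running before the update
charges the already elapsed window with the new count.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a cfs_rq (hypothetical, not the kernel's struct). */
struct toy_cfs_rq {
	uint64_t last_update_time;  /* end of the last accounted window */
	uint64_t runnable_sum;      /* sum of h_nr_running * elapsed time */
	unsigned int h_nr_running;  /* tasks currently counted on this rq */
};

/*
 * Account the window [last_update_time, now) using whatever value
 * h_nr_running holds at the time of the call. No decay is modelled.
 */
static void toy_update_load_avg(struct toy_cfs_rq *rq, uint64_t now)
{
	assert(now >= rq->last_update_time);
	rq->runnable_sum += (now - rq->last_update_time) * rq->h_nr_running;
	rq->last_update_time = now;
}

/* Update first: the past window is charged with the previous count. */
static void toy_enqueue_update_first(struct toy_cfs_rq *rq, uint64_t now)
{
	toy_update_load_avg(rq, now);
	rq->h_nr_running++;
}

/* Count first: the past window is charged with the new count. */
static void toy_enqueue_count_first(struct toy_cfs_rq *rq, uint64_t now)
{
	rq->h_nr_running++;
	toy_update_load_avg(rq, now);
}
```

Starting both variants from one running task at time 0 and enqueueing a
second task at time 10, the update-first variant accounts the past window
with one task and the count-first variant accounts the same window with two.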