Message-ID: <84382429-02c1-12d5-bdf4-23e880246cf3@gmail.com>
Date: Tue, 14 Oct 2025 19:01:15 +0800
From: Hao Jia <jiahao.kernel@...il.com>
To: Aaron Lu <ziqianlu@...edance.com>
Cc: Valentin Schneider <vschneid@...hat.com>, Ben Segall
<bsegall@...gle.com>, K Prateek Nayak <kprateek.nayak@....com>,
Peter Zijlstra <peterz@...radead.org>,
Chengming Zhou <chengming.zhou@...ux.dev>, Josh Don <joshdon@...gle.com>,
Ingo Molnar <mingo@...hat.com>, Vincent Guittot
<vincent.guittot@...aro.org>, Xi Wang <xii@...gle.com>,
linux-kernel@...r.kernel.org, Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Mel Gorman <mgorman@...e.de>,
Chuyi Zhou <zhouchuyi@...edance.com>, Jan Kiszka <jan.kiszka@...mens.com>,
Florian Bezdeka <florian.bezdeka@...mens.com>,
Songtang Liu <liusongtang@...edance.com>, Chen Yu <yu.c.chen@...el.com>,
Matteo Martelli <matteo.martelli@...ethink.co.uk>,
Michal Koutný <mkoutny@...e.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH] sched/fair: Prevent cfs_rq from being unthrottled with
zero runtime_remaining

Hello Aaron,

Thank you for your reply.

On 2025/10/14 17:11, Aaron Lu wrote:
> Hi Hao,
>
> On Tue, Oct 14, 2025 at 03:43:10PM +0800, Hao Jia wrote:
>>
>> Hello Aaron,
>>
>> On 2025/9/29 15:46, Aaron Lu wrote:
>>> When a cfs_rq is to be throttled, its limbo list should be empty and
>>> that's why there is a warn in tg_throttle_down() for non empty
>>> cfs_rq->throttled_limbo_list.
>>>
>>> When running a test with the following hierarchy:
>>>
>>>           root
>>>         /      \
>>>         A*     ...
>>>      /  |  \   ...
>>>         B
>>>        /  \
>>>       C*
>>>
>>> where both A and C have quota settings, that warn on non empty limbo list
>>> is triggered for a cfs_rq of C, let's call it cfs_rq_c(and ignore the cpu
>>> part of the cfs_rq for the sake of simpler representation).
>>>
>>
>> I encountered a similar warning a while ago and fixed it. I have a question
>> I'd like to ask. tg_unthrottle_up(cfs_rq_C) calls enqueue_task_fair(p) to
>> enqueue a task, which requires that the runtime_remaining of task p's entire
>> task_group hierarchy be greater than 0.
>>
>> In addition to the case you fixed above, when bandwidth is running
>> normally, is it possible that there's a corner case where
>> cfs_A->runtime_remaining > 0 but cfs_B->runtime_remaining < 0, which
>> could trigger a similar warning?
>
> Do you mean B also has quota set and cfs_B's runtime_remaining < 0?
> In this case, B should be throttled and C is a descendent of B so should
> also be throttled, i.e. C can't be unthrottled when B is in throttled
> state. Do I understand you correctly?
>

Yes, both A and B have quota set.

Is the following corner case possible?

With asynchronous unthrottling, other running entities may completely
consume cfs_B->runtime_remaining (cfs_B->runtime_remaining < 0) but not
cfs_A->runtime_remaining (cfs_A->runtime_remaining > 0) by the time we
call unthrottle_cfs_rq(cfs_rq_A).

So when we call unthrottle_cfs_rq(cfs_rq_A), cfs_A->runtime_remaining > 0,
but if cfs_B->runtime_remaining < 0 at that point, then
enqueue_task_fair(p)->check_enqueue_throttle(cfs_rq_B)->throttle_cfs_rq(cfs_rq_B)
may trigger a warning.

My core question is:

When we call unthrottle_cfs_rq(cfs_rq_A), we only check
cfs_rq_A->runtime_remaining. However,
enqueue_task_fair(p)->enqueue_entity(C->B->A)->check_enqueue_throttle()
requires runtime_remaining to be greater than 0 at every task_group
level of task p.

Can we guarantee this?
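
The question can be restated as a toy userspace model. This is only a
sketch, not kernel code: struct toy_cfs_rq and both helpers are
hypothetical names, and the model ignores per-CPU state, runtime
distribution and the propagation of throttled state down the hierarchy.
It only contrasts a check on the single level being unthrottled with a
walk over every level on the enqueue path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a per-group cfs_rq; NOT the kernel's struct cfs_rq. */
struct toy_cfs_rq {
	const char *name;
	bool has_quota;              /* is a quota set for this group? */
	long runtime_remaining;      /* > 0 means runtime is still available */
	struct toy_cfs_rq *parent;   /* NULL for the top-most group */
};

/* Models a check that looks only at the level being unthrottled. */
static bool toy_unthrottle_check(const struct toy_cfs_rq *cfs_rq)
{
	return !cfs_rq->has_quota || cfs_rq->runtime_remaining > 0;
}

/*
 * Walk from the task's own group up to the top (C -> B -> A) and return
 * the first level that has a quota but no runtime left, or NULL if every
 * level could accept the task.
 */
static const struct toy_cfs_rq *
toy_first_exhausted(const struct toy_cfs_rq *leaf)
{
	assert(leaf != NULL);

	for (const struct toy_cfs_rq *rq = leaf; rq != NULL; rq = rq->parent) {
		if (rq->has_quota && rq->runtime_remaining <= 0)
			return rq;
	}
	return NULL;
}
```

With A at runtime_remaining = 5, B at -1 and C at 3 (all with quota),
toy_unthrottle_check(&a) passes while toy_first_exhausted(&c) returns B,
which is the state the question is asking about. Whether the real code
can ever reach that state is exactly what is unclear.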
Thanks,
Hao
>>
>> So, I previously tried to fix this issue using the following code, adding
>> the ENQUEUE_THROTTLE flag to ensure that tasks enqueued in
>> tg_unthrottle_up() aren't throttled.
>>
>
> Yeah I think this can also fix the warning.
> I'm not sure if it is a good idea though, because on unthrottle, the
> expectation is, this cfs_rq should have runtime_remaining > 0 and if
> it's not the case, I think it is better to know why.
>
> Thanks.