Message-ID: <20250930123940.GA643@bytedance>
Date: Tue, 30 Sep 2025 20:39:40 +0800
From: Aaron Lu <ziqianlu@...edance.com>
To: K Prateek Nayak <kprateek.nayak@....com>
Cc: Valentin Schneider <vschneid@...hat.com>,
Ben Segall <bsegall@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
Chengming Zhou <chengming.zhou@...ux.dev>,
Josh Don <joshdon@...gle.com>, Ingo Molnar <mingo@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Xi Wang <xii@...gle.com>, linux-kernel@...r.kernel.org,
Juri Lelli <juri.lelli@...hat.com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>, Mel Gorman <mgorman@...e.de>,
Chuyi Zhou <zhouchuyi@...edance.com>,
Jan Kiszka <jan.kiszka@...mens.com>,
Florian Bezdeka <florian.bezdeka@...mens.com>,
Songtang Liu <liusongtang@...edance.com>,
Chen Yu <yu.c.chen@...el.com>,
Matteo Martelli <matteo.martelli@...ethink.co.uk>,
Michal Koutný <mkoutny@...e.com>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH] sched/fair: Prevent cfs_rq from being unthrottled with
zero runtime_remaining
On Tue, Sep 30, 2025 at 07:07:17PM +0800, Aaron Lu wrote:
> On Tue, Sep 30, 2025 at 02:28:16PM +0530, K Prateek Nayak wrote:
> > Hello Aaron,
> >
> > On 9/30/2025 1:26 PM, Aaron Lu wrote:
> > > On Mon, Sep 29, 2025 at 03:04:03PM +0530, K Prateek Nayak wrote:
> > > ... ...
> > >> Can we instead do a check_enqueue_throttle() in enqueue_throttled_task()
> > >> if we find cfs_rq->throttled_limbo_list to be empty?
> > >>
> > >> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > >> index 18a30ae35441..fd2d4dad9c27 100644
> > >> --- a/kernel/sched/fair.c
> > >> +++ b/kernel/sched/fair.c
> > >> @@ -5872,6 +5872,8 @@ static bool enqueue_throttled_task(struct task_struct *p)
> > >> */
> > >> if (throttled_hierarchy(cfs_rq) &&
> > >> !task_current_donor(rq_of(cfs_rq), p)) {
> > > /*
> > > * Make sure to throttle this cfs_rq or it can be unthrottled
> > > * with no runtime_remaining and gets throttled again on its
> > > * unthrottle path.
> > > */
> > >> + if (list_empty(&cfs_rq->throttled_limbo_list))
> > >> + check_enqueue_throttle(cfs_rq);
> > >
> > > BTW, do you think a comment is needed? Something like the above, not
> > > sure if it's too redundant though, feel free to let me know your
> > > thoughts, thanks.
> >
> > Now that I'm looking at it again, I think we should actually do a:
> >
> > for_each_entity(se)
> > check_enqueue_throttle(cfs_rq_of(se));
> >
> > The reason being, we can have:
> >
> > root -> A (throttled) -> B -> C
> >
> > Consider B has runtime_remaining = 0, and subsequently a throttled task
> > is queued onto C. Ideally, we should start the B/W timer for B at that
> > point but we bail out after queuing it on C. Thoughts?
>
> Yes agree the B/W timer should also be considered.

On second thought, do we really need to care about the B/W timer for B?
I mean, when C is unthrottled and gets enqueued on B,
check_enqueue_throttle() will do the right thing for B, so I don't
think we need to do this hierarchical check_enqueue_throttle() here.

I think the only difference between your suggestion and my patch is that,
with your suggestion, it's possible for a runtime_enabled cfs_rq to reach
tg_unthrottle_up() with runtime_remaining equal to 0. But since it
doesn't have any tasks in its limbo list, it will not do any enqueue and
so can't trigger a throttle there, so it's still fine. i.e. I think
your original suggestion works.
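
To make the reasoning above concrete, here is a toy user-space model in
C. It is only a sketch: struct toy_cfs_rq, toy_check_enqueue_throttle()
and toy_unthrottle() are made-up names, not kernel code, and the real
check_enqueue_throttle() also tries to obtain runtime before throttling,
which is left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a cfs_rq; fields mirror the discussion, not the kernel struct. */
struct toy_cfs_rq {
	bool runtime_enabled;	/* bandwidth control active for this group */
	long runtime_remaining;	/* locally available runtime */
	int nr_limbo;		/* tasks parked on the limbo list */
	bool throttled;
};

/*
 * Toy model of the enqueue-time check: throttle a runtime_enabled
 * cfs_rq that is not already throttled and has no runtime left.
 * (Assumption: the attempt to obtain more runtime is omitted.)
 */
static void toy_check_enqueue_throttle(struct toy_cfs_rq *cfs_rq)
{
	assert(cfs_rq != NULL);
	if (cfs_rq->runtime_enabled && !cfs_rq->throttled &&
	    cfs_rq->runtime_remaining <= 0)
		cfs_rq->throttled = true;
}

/*
 * Toy model of the unthrottle path: clear the throttled state, then
 * re-enqueue each limbo task, running the enqueue-time check each time.
 * Returns true if the cfs_rq got throttled again on this path.
 */
static bool toy_unthrottle(struct toy_cfs_rq *cfs_rq)
{
	assert(cfs_rq != NULL);
	cfs_rq->throttled = false;
	while (cfs_rq->nr_limbo > 0) {
		cfs_rq->nr_limbo--;
		toy_check_enqueue_throttle(cfs_rq);
		if (cfs_rq->throttled)
			return true;
	}
	return false;
}
```

In this model, with runtime_remaining == 0, toy_unthrottle() returns
false for a cfs_rq whose limbo list is empty and true for one that has a
task in limbo, which is the distinction drawn above.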