linux-kernel - Re: [PATCH v4 3/5] sched/fair: Switch to task based throttle model

Message-ID: <xm26frcwtgz2.fsf@google.com>
Date: Mon, 08 Sep 2025 20:58:09 -0700
From: Benjamin Segall <bsegall@...gle.com>
To: Aaron Lu <ziqianlu@...edance.com>
Cc: K Prateek Nayak <kprateek.nayak@....com>,  Peter Zijlstra
 <peterz@...radead.org>,  Valentin Schneider <vschneid@...hat.com>,
  Chengming Zhou <chengming.zhou@...ux.dev>,  Josh Don
 <joshdon@...gle.com>,  Ingo Molnar <mingo@...hat.com>,  Vincent Guittot
 <vincent.guittot@...aro.org>,  Xi Wang <xii@...gle.com>,
  linux-kernel@...r.kernel.org,  Juri Lelli <juri.lelli@...hat.com>,
  Dietmar Eggemann <dietmar.eggemann@....com>,  Steven Rostedt
 <rostedt@...dmis.org>,  Mel Gorman <mgorman@...e.de>,  Chuyi Zhou
 <zhouchuyi@...edance.com>,  Jan Kiszka <jan.kiszka@...mens.com>,  Florian
 Bezdeka <florian.bezdeka@...mens.com>,  Songtang Liu
 <liusongtang@...edance.com>,  Chen Yu <yu.c.chen@...el.com>,  Matteo
 Martelli <matteo.martelli@...ethink.co.uk>,  Michal Koutný
 <mkoutny@...e.com>,  Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Subject: Re: [PATCH v4 3/5] sched/fair: Switch to task based throttle model

Aaron Lu <ziqianlu@...edance.com> writes:

> On Thu, Sep 04, 2025 at 03:21:06PM +0530, K Prateek Nayak wrote:
>> Hello Aaron,
>> 
>> On 9/4/2025 1:46 PM, Aaron Lu wrote:
>> > @@ -8722,15 +8730,6 @@ static void check_preempt_wakeup_fair(struct rq *rq, struct task_struct *p, int
>> >  	if (unlikely(se == pse))
>> >  		return;
>> >  
>> > -	/*
>> > -	 * This is possible from callers such as attach_tasks(), in which we
>> > -	 * unconditionally wakeup_preempt() after an enqueue (which may have
>> > -	 * lead to a throttle).  This both saves work and prevents false
>> > -	 * next-buddy nomination below.
>> > -	 */
>> > -	if (unlikely(throttled_hierarchy(cfs_rq_of(pse))))
>> > -		return;
>> 
>> I think we should have a:
>> 
>> 	if (task_is_throttled(p))
>> 		return;
>> 
>> here. I can see at least one possibility via prio_changed_fair()
>
> Ah right. I didn't realize wakeup_preempt() can be called for a throttled
> task; I don't think that is expected. What about forbidding that? :)
> (not tested in any way, just to show the idea and get feedback)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index cb93e74a850e8..f1383aede764f 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -13135,7 +13135,11 @@ static void task_fork_fair(struct task_struct *p)
>  static void
>  prio_changed_fair(struct rq *rq, struct task_struct *p, int oldprio)
>  {
> -	if (!task_on_rq_queued(p))
> +	/*
> +	 * p->on_rq can be set for throttled task but there is no need to
> +	 * check wakeup preempt for throttled task, so use p->se.on_rq instead.
> +	 */
> +	if (!p->se.on_rq)
>  		return;
>  
>  	if (rq->cfs.nr_queued == 1)
>
>> where a throttled task might reach here. Rest looks good. I'll
>> still wait on Ben for the update_cfs_group() bits :)

Yeah, I think I agree with all of these (this patch and the previous
patch). The preempt ones are subjective, but I'd probably default to "no
special case needed for throttle". Removing the check in
update_cfs_group() is correct, I think, unless we want to freeze
everything. (And that seems dangerous in its own way.)