Message-Id: <20251202094322.GA3378032@bytedance.com>
Date: Tue, 2 Dec 2025 17:43:22 +0800
From: "Aaron Lu" <ziqianlu@...edance.com>
To: "Bezdeka,  Florian" <florian.bezdeka@...mens.com>
Cc: "bsegall@...gle.com" <bsegall@...gle.com>, 
	"vschneid@...hat.com" <vschneid@...hat.com>, 
	"xii@...gle.com" <xii@...gle.com>, 
	"chengming.zhou@...ux.dev" <chengming.zhou@...ux.dev>, 
	"mingo@...hat.com" <mingo@...hat.com>, 
	"joshdon@...gle.com" <joshdon@...gle.com>, 
	"vincent.guittot@...aro.org" <vincent.guittot@...aro.org>, 
	"kprateek.nayak@....com" <kprateek.nayak@....com>, 
	"peterz@...radead.org" <peterz@...radead.org>, 
	"bigeasy@...utronix.de" <bigeasy@...utronix.de>, 
	"yu.c.chen@...el.com" <yu.c.chen@...el.com>, 
	"dietmar.eggemann@....com" <dietmar.eggemann@....com>, 
	"rostedt@...dmis.org" <rostedt@...dmis.org>, 
	"juri.lelli@...hat.com" <juri.lelli@...hat.com>, 
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, 
	"mkoutny@...e.com" <mkoutny@...e.com>, 
	"mgorman@...e.de" <mgorman@...e.de>, 
	"zhouchuyi@...edance.com" <zhouchuyi@...edance.com>,  "Kiszka, 
	 Jan" <jan.kiszka@...mens.com>, 
	"liusongtang@...edance.com" <liusongtang@...edance.com>, 
	"matteo.martelli@...ethink.co.uk" <matteo.martelli@...ethink.co.uk>
Subject: Re: [PATCH v4 0/5] Defer throttle when task exits to user

On Tue, Dec 02, 2025 at 08:59:15AM +0000, Bezdeka, Florian wrote:
> On Fri, 2025-08-29 at 16:11 +0800, Aaron Lu wrote:
> > v4:
> > - Add cfs_bandwidth_used() in task_is_throttled() and remove unlikely
> >   for task_is_throttled(), suggested by Valentin Schneider;
> > - Add a warn for non empty throttle_node in enqueue_throttled_task(),
> >   suggested by Valentin Schneider;
> > - Improve comments in enqueue_throttled_task() by Valentin Schneider;
> > - Clear throttled for to-be-unthrottled tasks in tg_unthrottle_up();
> > - Change throttled and pelt_clock_throttled fields in cfs_rq from int to
> >   bool, reported by LKP;
> > - Improve changelog for patch4 by Valentin Schneider.
> > 
> > Thanks a lot for all the reviews and tests, I hope I didn't miss any of
> > them but if I do, please let me know. I've also run Jan's rt reproducer
> > and songtang's stress test and didn't notice any problem.
> > 
> > Apply on top of sched/core, head commit 1b5f1454091e("sched/idle: Remove
> > play_idle()").
> > 
> 
> Hi all,
> 
> as this all has arrived in 6.18 now - thanks for all the work - I would
> like to start a discussion about backporting this series - and some more
> related work, see below - to older stable releases. Especially
> PREEMPT_RT enabled systems are of interest as this series fixes a
> serious system freeze.
> 
> Has someone already looked into the backporting topic?
> 
> I can remember from the previous discussion that everything below 6.12
> is hard, as scheduler internals have changed (EEVDF, vlag). Still, 6.12
> would be valuable.
> 
> I have the following commits on my radar:
> 
> This series:
> 
> 2cd571245b43 ("sched/fair: Add related data structure for task based throttle")
> 7fc2d1439247 ("sched/fair: Implement throttle task work and related helpers")
> e1fad12dcb66 ("sched/fair: Switch to task based throttle model")
> eb962f251fbb ("sched/fair: Task based throttle time accounting")
> 5b726e9bf954 ("sched/fair: Get rid of throttled_lb_pair()")
> 
> Follow up series:
> https://lore.kernel.org/all/20250910095044.278-1-ziqianlu@bytedance.com/
> 
> fe8d238e646e ("sched/fair: Propagate load for throttled cfs_rq")
> fcd394866e3d ("sched/fair: update_cfs_group() for throttled cfs_rqs")
> 253b3f587241 ("sched/fair: Do not special case tasks in throttled hierarchy")
> 0d4eaf8caf8c ("sched/fair: Do not balance task to a throttled cfs_rq")
>

There is one more fix that goes before the next one:
https://lore.kernel.org/all/20251021053522.37583-1-kprateek.nayak@amd.com/

0e4a169d1a2b ("sched/fair: Start a cfs_rq on throttled hierarchy with
PELT clock throttled")

> Another follow up:
> https://lore.kernel.org/all/20250929074645.416-1-ziqianlu@bytedance.com/
> 
> 956dfda6a708 ("sched/fair: Prevent cfs_rq from being unthrottled with zero runtime_remaining")
> 
> 
> That should hopefully be enough, right?
> 

I think so.

> Any concerns, additional thoughts, missing pieces? Please let me know!

1 If the base does not have Josh's async unthrottle:
  8ad075c2eb1f ("sched: Async unthrottling for cfs bandwidth"),
  make sure to backport that too; otherwise the distribute runtime
  timer handler can be time-consuming.

2 If the base uses CFS (i.e. it predates EEVDF), the task's vruntime
  has to be adjusted in dequeue_throttled_task() like below:

static void dequeue_throttled_task(struct task_struct *p, int flags)
{
	WARN_ON_ONCE(p->se.on_rq);
	list_del_init(&p->throttle_node);

	/* task blocked after throttled */
	if (flags & DEQUEUE_SLEEP)
		p->throttled = false;
	else {
		struct sched_entity *se = &p->se;
		struct cfs_rq *cfs_rq;

		/*
		 * We are leaving this cfs_rq but our vruntime is not
		 * normalized yet as that is only done for tasks dequeued
		 * with !DEQUEUE_SLEEP in dequeue_entity(), so we have to:
		 * Fix up our vruntime so that the current sleep doesn't
		 * cause 'unlimited' sleep bonus.
		 */
		cfs_rq = cfs_rq_of(se);
		place_entity(cfs_rq, se, 0);
		se->vruntime -= cfs_rq->min_vruntime;
	}
}
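
As a rough illustration of why the subtraction in the else branch matters,
here is a user-space toy model, not kernel code. The toy_* names are made
up for this sketch and place_entity() is left out. It assumes the
pre-EEVDF convention that an entity leaving a cfs_rq without sleeping
carries a vruntime relative to min_vruntime, and that the destination
cfs_rq adds its own min_vruntime back at enqueue:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct cfs_rq and struct sched_entity. */
struct toy_cfs_rq { uint64_t min_vruntime; };
struct toy_se { uint64_t vruntime; };

/* Leaving a runqueue without sleeping: make vruntime relative. */
static void toy_leave_rq(struct toy_se *se, const struct toy_cfs_rq *rq)
{
	se->vruntime -= rq->min_vruntime;
}

/* Joining a runqueue: turn the relative vruntime back into an absolute one. */
static void toy_join_rq(struct toy_se *se, const struct toy_cfs_rq *rq)
{
	se->vruntime += rq->min_vruntime;
}

/* Moves an entity between two runqueues whose clocks are far apart. */
static uint64_t toy_example(void)
{
	struct toy_cfs_rq src = { .min_vruntime = 1000 };
	struct toy_cfs_rq dst = { .min_vruntime = 50 };
	struct toy_se se = { .vruntime = 1200 };

	toy_leave_rq(&se, &src);	/* 200 ahead of src's min_vruntime */
	toy_join_rq(&se, &dst);		/* still 200 ahead, now of dst's */
	assert(se.vruntime - dst.min_vruntime == 200);
	return se.vruntime;
}
```

If the subtraction is skipped, the add at enqueue is applied to an
absolute value, so the task's position relative to the new min_vruntime
is wrong.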

3 Also in this dequeue_throttled_task() function, if the base doesn't
  have commit e1f078f50478 ("sched/fair: Combine detach into dequeue
  when migrating task"), then it's not necessary to do the following
  because migrate_task_rq_fair() has already dealt with that:
	/*
	 * task is migrating off its old cfs_rq, detach
	 * the task's load from its old cfs_rq.
	 */
	if (task_on_rq_migrating(p))
		detach_task_cfs_rq(p);

That's what I can think of right now.

I did a backport for a 5.15-based kernel. I can probably post it somewhere
if it is useful, just let me know.
