Message-ID: <20250207111141.GD7145@noisy.programming.kicks-ass.net>
Date: Fri, 7 Feb 2025 12:11:41 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Breno Leitao <leitao@...ian.org>
Cc: mingo@...nel.org, vincent.guittot@...aro.org,
	linux-kernel@...r.kernel.org, juri.lelli@...hat.com,
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
	mgorman@...e.de, bristot@...hat.com, corbet@....net,
	qyousef@...alina.io, chris.hyser@...cle.com,
	patrick.bellasi@...bug.net, pjt@...gle.com, pavel@....cz,
	qperret@...gle.com, tim.c.chen@...ux.intel.com, joshdon@...gle.com,
	timj@....org, kprateek.nayak@....com, yu.c.chen@...el.com,
	youssefesmat@...omium.org, joel@...lfernandes.org, efault@....de,
	tglx@...utronix.de
Subject: Re: [PATCH 03/15] sched/fair: Add lag based placement

On Fri, Feb 07, 2025 at 02:07:18AM -0800, Breno Leitao wrote:
> Hello Peter,
> 
> On Wed, May 31, 2023 at 01:58:42PM +0200, Peter Zijlstra wrote:
> >
> >  place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
> >  {
> <snip>
> > -		vruntime -= thresh;
> > +		lag *= load + se->load.weight;
> > +		if (WARN_ON_ONCE(!load))
> 
> I have 6.13 running on some hosts, and in some cases, where the system
> is getting some OOMs, I see the following stack:
> 
>           WARNING: CPU: 29 PID: 593474 at kernel/sched/fair.c:5250 place_entity+0x199/0x1b0
> 
>            Call Trace:
>             <TASK>
>             ? place_entity+0x199/0x1b0
>             reweight_entity+0x188/0x200
>             enqueue_task_fair.llvm.15448040313737105663+0x28c/0x560
>             enqueue_task+0x30/0x120
>             ttwu_do_activate+0x99/0x230
>             try_to_wake_up+0x25a/0x4a0
>             ? hrtimer_dummy_timeout+0x10/0x10
>             hrtimer_wakeup+0x25/0x30
>             __hrtimer_run_queues+0xf1/0x250
>             hrtimer_interrupt+0xfb/0x220
>             __sysvec_apic_timer_interrupt+0x47/0x140
>             sysvec_apic_timer_interrupt+0x35/0x80
>             asm_sysvec_apic_timer_interrupt+0x16/0x20
> 
> I am sorry for not decoding the stack, but I am having a hard time
> decoding it properly. The values I got were misleading, and I am
> working to understand what is happening.
> 
> Anyway, I don't have a reproducer and this problem doesn't happen
> frequently enough. I have 1K hosts with 6.13 and I saw it 5 times in the
> last week.

Weird. Would you mind trying with the below patch on top?

---
Subject: sched/fair: Adhere to place_entity() constraints
From: Peter Zijlstra <peterz@...radead.org>
Date: Tue, 28 Jan 2025 15:39:49 +0100

Mike reports that commit 6d71a9c61604 ("sched/fair: Fix EEVDF entity
placement bug causing scheduling lag") relies on commit 4423af84b297
("sched/fair: optimize the PLACE_LAG when se->vlag is zero") to not
trip a WARN in place_entity().

What happens is that the lag of the very last entity is 0 per
definition -- the average of one element matches the value of that
element. Therefore place_entity() will match the condition skipping
the lag adjustment:

  if (sched_feat(PLACE_LAG) && cfs_rq->nr_queued && se->vlag) {

Without the 'se->vlag' condition, it will attempt to adjust the zero
lag even though we're inserting into an empty tree.

Notably, we should have failed the 'cfs_rq->nr_queued' condition, but
don't because the counter didn't get updated.

Additionally, move update_load_add() after place_entity(), as is
consistent with other place_entity() users -- this change is
non-functional, place_entity() does not use cfs_rq->load.

Fixes: 6d71a9c61604 ("sched/fair: Fix EEVDF entity placement bug causing scheduling lag")
Reported-by: Mike Galbraith <efault@....de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Cc: stable@...r.kernel.org
Link: https://lkml.kernel.org/r/20250128143949.GD7145@noisy.programming.kicks-ass.net
---
 kernel/sched/fair.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
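
The zero-lag claim in the changelog can be checked with a small
userspace sketch. This is not kernel code: struct toy_entity,
toy_avg_vruntime() and toy_lag() are made-up names, and the sketch
assumes lag is simply the weighted average vruntime minus the entity's
own vruntime, computed with plain integer math.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a sched_entity. */
struct toy_entity {
	int64_t weight;
	int64_t vruntime;
};

/* Weighted average vruntime; the caller passes n > 0 and positive weights. */
static int64_t toy_avg_vruntime(const struct toy_entity *e, size_t n)
{
	int64_t sum = 0, load = 0;

	for (size_t i = 0; i < n; i++) {
		sum += e[i].weight * e[i].vruntime;
		load += e[i].weight;
	}
	assert(load > 0);
	return sum / load;
}

/* Assumed lag definition for this sketch: queue average minus own vruntime. */
static int64_t toy_lag(const struct toy_entity *e, size_t n, size_t idx)
{
	return toy_avg_vruntime(e, n) - e[idx].vruntime;
}
```

With a single entity, e.g. { .weight = 1024, .vruntime = 5000 },
toy_lag() returns 0 because the average of one element is that element.
With two equal-weight entities at vruntime 1000 and 3000 it returns
1000 and -1000 respectively.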

--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3781,6 +3781,7 @@ static void reweight_entity(struct cfs_r
 		update_entity_lag(cfs_rq, se);
 		se->deadline -= se->vruntime;
 		se->rel_deadline = 1;
+		cfs_rq->nr_queued--;
 		if (!curr)
 			__dequeue_entity(cfs_rq, se);
 		update_load_sub(&cfs_rq->load, se->load.weight);
@@ -3807,10 +3808,11 @@ static void reweight_entity(struct cfs_r
 
 	enqueue_load_avg(cfs_rq, se);
 	if (se->on_rq) {
-		update_load_add(&cfs_rq->load, se->load.weight);
 		place_entity(cfs_rq, se, 0);
+		update_load_add(&cfs_rq->load, se->load.weight);
 		if (!curr)
 			__enqueue_entity(cfs_rq, se);
+		cfs_rq->nr_queued++;
 
 		/*
 		 * The entity's vruntime has been adjusted, so let's check
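
For illustration only, here is a toy userspace model of why the counter
update matters when the only queued entity is reweighted. All names
(struct toy_rq, toy_would_adjust_lag(), toy_reweight_hits_empty_adjust())
are hypothetical, PLACE_LAG is assumed to be enabled, and the single
'load' field is a stand-in for whatever weight sum the placement code
checks with WARN_ON_ONCE(!load); the real bookkeeping is more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy queue: only what the guard and the WARN look at. */
struct toy_rq {
	unsigned int nr_queued;
	long load;	/* stand-in for the summed weight of queued entities */
};

/* Same shape as the guard quoted in the changelog, PLACE_LAG assumed on. */
static bool toy_would_adjust_lag(const struct toy_rq *rq, long vlag,
				 bool check_vlag)
{
	if (!rq->nr_queued)
		return false;
	if (check_vlag && !vlag)
		return false;
	return true;
}

/*
 * Reweight the only queued entity. Returns true when the lag adjustment
 * would run against an empty queue (load == 0), i.e. the WARN case.
 */
static bool toy_reweight_hits_empty_adjust(bool update_counter, bool check_vlag)
{
	struct toy_rq rq = { .nr_queued = 1, .load = 1024 };
	long vlag = 0;	/* last entity: lag is zero by definition */

	/* "dequeue" half of the reweight */
	if (update_counter)
		rq.nr_queued--;
	rq.load -= 1024;
	assert(rq.load == 0);	/* nothing else is queued in this model */

	/* "place" half of the reweight */
	return toy_would_adjust_lag(&rq, vlag, check_vlag) && rq.load == 0;
}
```

In this model, only the combination of no counter update and no vlag
check reaches the adjustment with zero load. Either the vlag check or
the counter update is enough to skip it, which matches the changelog's
point that the vlag check was masking the stale counter.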
