Message-ID: <20260211084854.GX1282955@noisy.programming.kicks-ass.net>
Date: Wed, 11 Feb 2026 09:48:54 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: K Prateek Nayak <kprateek.nayak@....com>, mingo@...nel.org,
	juri.lelli@...hat.com, dietmar.eggemann@....com,
	rostedt@...dmis.org, bsegall@...gle.com, mgorman@...e.de,
	vschneid@...hat.com, linux-kernel@...r.kernel.org,
	wangtao554@...wei.com, quzicheng@...wei.com,
	wuyun.abel@...edance.com, dsmythies@...us.net
Subject: Re: [PATCH 0/4] sched: Various reweight_entity() fixes

On Tue, Feb 10, 2026 at 09:52:29PM +0100, Vincent Guittot wrote:

> > Subject: sched/fair: Fix zero_vruntime tracking
> > From: Peter Zijlstra <peterz@...radead.org>
> > Date: Mon Feb  9 15:28:16 CET 2026
> >
> > It turns out that zero_vruntime tracking is broken when there is but a single
> > task running. Current update paths are through __{en,de}queue_entity(), and
> > when there is but a single task, pick_next_task() will always return that one
> > task, and put_prev_set_next_task() will end up in neither function.
> >
> > This can cause entity_key() to grow indefinitely large and cause overflows,
> > leading to much pain and suffering.
> >
> > Furthermore, doing update_zero_vruntime() from __{de,en}queue_entity(), which
> > are called from {set_next,put_prev}_entity(), has problems because:
> >
> >  - set_next_entity() calls __dequeue_entity() before it does cfs_rq->curr = se.
> >    This means the avg_vruntime() will see the removal but not current, missing
> >    the entity for accounting.
> >
> >  - put_prev_entity() calls __enqueue_entity() before it does cfs_rq->curr =
> >    NULL. This means the avg_vruntime() will see the addition *and* current,
> >    leading to double accounting.
> >
> > Both cases are incorrect.
> >
> > Noting that avg_vruntime is already called on each {en,de}queue, remove the
> > explicit avg_vruntime() calls (which removes an extra 64bit division for each
> > {en,de}queue) and have avg_vruntime() update zero_vruntime itself.
> >
> > Additionally, have the tick call avg_vruntime() -- discarding the result, but
> > for the side-effect of updating zero_vruntime.
> >
> > While there, optimize avg_vruntime() by noting that the average of one value is
> > rather trivial to compute.
> >
> > Test case:
> >   # taskset -c -p 1 $$
> >   # taskset -c 2 bash -c 'while :; do :; done&'
> >   # cat /sys/kernel/debug/sched/debug  | awk '/^cpu#/ {P=0} /^cpu#2,/ {P=1} {if (P) print $0}' | grep -e zero_vruntime -e "^>"
> >
> > PRE:
> >     .zero_vruntime                 : 31316.407903
> >   >R            bash   487     50787.345112   E       50789.145972           2.800000     50780.298364        16     120         0.000000         0.000000         0.000000        /
> >     .zero_vruntime                 : 382548.253179
> >   >R            bash   487    427275.204288   E      427276.003584           2.800000    427268.157540        23     120         0.000000         0.000000         0.000000        /
> >
> > POST:
> >     .zero_vruntime                 : 17259.709467
> >   >R            bash   526     17259.709467   E       17262.509467           2.800000     16915.031624         9     120         0.000000         0.000000         0.000000        /
> >     .zero_vruntime                 : 18702.723356
> >   >R            bash   526     18702.723356   E       18705.523356           2.800000     18358.045513         9     120         0.000000         0.000000         0.000000        /
> >
> > Fixes: 79f3f9bedd14 ("sched/eevdf: Fix min_vruntime vs avg_vruntime")
> > Reported-by: K Prateek Nayak <kprateek.nayak@....com>
> > Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> 
> Hi Peter,
> 
> This patch w/ the patchset on top of tip/sched/core creates regressions
> for hackbench (tbench doesn't seem to be impacted) on my dragonboard
> rb5

Durr... That was obviously not so expected. Let me go poke.
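
For anyone following the quoted changelog, the two ordering problems it
lists can be illustrated with a toy model. This is only a sketch under
loud assumptions: unit weights, a plain array instead of the rbtree, and
hypothetical toy_* names that do not exist in the kernel. It shows an
average taken between the dequeue and the curr assignment missing the
entity, and an average taken between the enqueue and the curr clear
counting it twice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: unit weights and an array stand in for the kernel's
 * weighted sums and rbtree. All toy_* names are hypothetical. */
struct toy_se {
	int64_t vruntime;
	int queued;		/* 1 while the entity sits in the "tree" */
};

struct toy_rq {
	struct toy_se *curr;	/* running entity, accounted separately */
	struct toy_se *se[4];
	size_t nr;
};

/* Average over queued entities plus curr, if any. */
static int64_t toy_avg(const struct toy_rq *rq)
{
	int64_t sum = 0, n = 0;

	assert(rq->nr <= 4);
	for (size_t i = 0; i < rq->nr; i++) {
		if (rq->se[i]->queued) {
			sum += rq->se[i]->vruntime;
			n++;
		}
	}
	if (rq->curr) {
		sum += rq->curr->vruntime;
		n++;
	}
	return n ? sum / n : 0;
}

/* set_next ordering: dequeue first, assign curr afterwards.
 * Returns the average observed between the two steps. */
static int64_t toy_avg_during_set_next(struct toy_rq *rq, struct toy_se *se)
{
	int64_t seen;

	se->queued = 0;		/* stands in for __dequeue_entity() */
	seen = toy_avg(rq);	/* se is in neither place: missed */
	rq->curr = se;
	return seen;
}

/* put_prev ordering: enqueue first, clear curr afterwards.
 * Returns the average observed between the two steps. */
static int64_t toy_avg_during_put_prev(struct toy_rq *rq, struct toy_se *se)
{
	int64_t seen;

	se->queued = 1;		/* stands in for __enqueue_entity() */
	seen = toy_avg(rq);	/* se is in both places: counted twice */
	rq->curr = NULL;
	return seen;
}
```

With two entities at vruntime 100 and 300 the true average is 200; the
value observed mid-set_next is 300 and the value observed mid-put_prev
is 166 (integer division of 500 by 3), while the average before and
after each transition is 200 again.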
