Message-ID: <bd974780-5d65-41ef-a94a-adc47cc3a23d@amd.com>
Date: Mon, 9 Feb 2026 22:22:50 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Peter Zijlstra <peterz@...radead.org>
CC: <mingo@...nel.org>, <juri.lelli@...hat.com>, <vincent.guittot@...aro.org>,
	<dietmar.eggemann@....com>, <rostedt@...dmis.org>, <bsegall@...gle.com>,
	<mgorman@...e.de>, <vschneid@...hat.com>, <linux-kernel@...r.kernel.org>,
	<wangtao554@...wei.com>, <quzicheng@...wei.com>, <wuyun.abel@...edance.com>,
	<dsmythies@...us.net>
Subject: Re: [PATCH 0/4] sched: Various reweight_entity() fixes

Hello Peter,

On 2/9/2026 9:17 PM, Peter Zijlstra wrote:
> On Wed, Feb 04, 2026 at 03:45:58PM +0530, K Prateek Nayak wrote:
> 
>>        # Overflow on enqueue
>>
>>            <...>-102371  [255] ... : __enqueue_entity: Overflowed cfs_rq:
>>            <...>-102371  [255] ... : dump_h_overflow_cfs_rq: cfs_rq: depth(0) weight(90894772) nr_queued(2) sum_w_vruntime(0) sum_weight(0) zero_vruntime(701164930256050) sum_shift(0) avg_vruntime(701809615900788)
>>            <...>-102371  [255] ... : dump_h_overflow_entity: se: weight(3508) vruntime(701809615900788) slice(2800000) deadline(701810568648095) curr?(1) task?(1)       <-------- cfs_rq->curr
>>            <...>-102371  [255] ... : __enqueue_entity: Overflowed se:
>>            <...>-102371  [255] ... : dump_h_overflow_entity: se: weight(90891264) vruntime(701808975077099) slice(2800000) deadline(701808975109401) curr?(0) task?(0)   <-------- new se
> 
> So I spent quite a while trying to reproduce the splat, but alas.
> 
> That said, I did spot something 'funny' in the above, note that
> zero_vruntime and avg_vruntime/curr->vruntime are significantly apart.
> That is not something that should happen. zero_vruntime is supposed to
> closely track avg_vruntime.
> 
> That led me to hypothesise that there is a problem tracking
> zero_vruntime when there is but a single runnable task, and sure
> enough, I could reproduce that, albeit not at such a scale as to lead to
> such problems (probably too much noise on my machine).
> 
> I ended up with the below; and I've already pushed out a fresh
> queue/sched/core. Could you please test again?
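
For reference, here is a rough back-of-the-envelope check of the numbers in
the trace quoted above. It is only a sketch: it assumes the logged weight is
used as-is as the multiplier and that the weighted key is held in a signed
64-bit value, neither of which is stated in this thread. The helper name
weighted_key() is made up for illustration and is not a kernel function.

```python
# Largest value a signed 64-bit integer can hold.
S64_MAX = 2**63 - 1

# Values copied verbatim from the quoted trace.
zero_vruntime = 701164930256050
avg_vruntime = 701809615900788
se_vruntime = 701808975077099   # the "new se"
se_weight = 90891264            # assumption: used unscaled as the multiplier


def weighted_key(vruntime, reference, weight):
    """Return (key, key * weight) for an entity relative to a reference point."""
    key = vruntime - reference
    return key, key * weight


# Key relative to the stale zero_vruntime seen in the trace.
stale_key, stale_product = weighted_key(se_vruntime, zero_vruntime, se_weight)
stale_overflows = abs(stale_product) > S64_MAX

# Key if the reference had instead been sitting at avg_vruntime.
tight_key, tight_product = weighted_key(se_vruntime, avg_vruntime, se_weight)
tight_overflows = abs(tight_product) > S64_MAX

print(f"stale key: {stale_key} ns, exceeds s64 after weighting: {stale_overflows}")
print(f"tight key: {tight_key} ns, exceeds s64 after weighting: {tight_overflows}")
```

Under those assumptions the gap of several hundred seconds between
zero_vruntime and the entity's vruntime is enough on its own to push the
weighted key past the signed 64-bit range, while the same entity measured
against avg_vruntime stays comfortably inside it.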

Thank you for looking into this. I'll merge this onto tip:sched/urgent
and take it for a spin overnight like last time with PARANOID_AVG and
see if the sum_shifts increment this time around.

Will report back tomorrow on the status.

As for the patch itself, I'll take a look at it tomorrow - limited by my
drowsiness at the moment :-)

-- 
Thanks and Regards,
Prateek

