Message-ID: <bd974780-5d65-41ef-a94a-adc47cc3a23d@amd.com>
Date: Mon, 9 Feb 2026 22:22:50 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Peter Zijlstra <peterz@...radead.org>
CC: <mingo@...nel.org>, <juri.lelli@...hat.com>, <vincent.guittot@...aro.org>,
<dietmar.eggemann@....com>, <rostedt@...dmis.org>, <bsegall@...gle.com>,
<mgorman@...e.de>, <vschneid@...hat.com>, <linux-kernel@...r.kernel.org>,
<wangtao554@...wei.com>, <quzicheng@...wei.com>, <wuyun.abel@...edance.com>,
<dsmythies@...us.net>
Subject: Re: [PATCH 0/4] sched: Various reweight_entity() fixes

Hello Peter,

On 2/9/2026 9:17 PM, Peter Zijlstra wrote:
> On Wed, Feb 04, 2026 at 03:45:58PM +0530, K Prateek Nayak wrote:
>
>> # Overflow on enqueue
>>
>> <...>-102371 [255] ... : __enqueue_entity: Overflowed cfs_rq:
>> <...>-102371 [255] ... : dump_h_overflow_cfs_rq: cfs_rq: depth(0) weight(90894772) nr_queued(2) sum_w_vruntime(0) sum_weight(0) zero_vruntime(701164930256050) sum_shift(0) avg_vruntime(701809615900788)
>> <...>-102371 [255] ... : dump_h_overflow_entity: se: weight(3508) vruntime(701809615900788) slice(2800000) deadline(701810568648095) curr?(1) task?(1) <-------- cfs_rq->curr
>> <...>-102371 [255] ... : __enqueue_entity: Overflowed se:
>> <...>-102371 [255] ... : dump_h_overflow_entity: se: weight(90891264) vruntime(701808975077099) slice(2800000) deadline(701808975109401) curr?(0) task?(0) <-------- new se
>
> So I spent quite some time trying to reproduce the splat, but alas.
>
> That said, I did spot something 'funny' in the above: note that
> zero_vruntime and avg_vruntime/curr->vruntime are significantly apart.
> That is not something that should happen. zero_vruntime is supposed to
> closely track avg_vruntime.
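
The numbers in the dump seem to bear that out: avg_vruntime -
zero_vruntime works out to ~645s worth of drift. Assuming the
bookkeeping sums weight * (se->vruntime - zero_vruntime) per entity,
like mainline's avg_vruntime_add() does (modulo scale_load_down(),
which I'm hand-waving here), a quick standalone check with the values
of the new se from the dump:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
            /* values lifted from the dump above */
            int64_t zero_vruntime = 701164930256050LL;
            int64_t se_vruntime   = 701808975077099LL; /* new se */
            int64_t se_weight     = 90891264LL;        /* new se */

            /* per-entity key: ~644s worth of drift */
            int64_t key = se_vruntime - zero_vruntime;

            /* widen to 128 bits to see the true product */
            __int128 contrib = (__int128)se_weight * key;

            printf("key = %lld, weight * key ~= %.2e (S64_MAX ~= %.2e)\n",
                   (long long)key, (double)contrib, (double)INT64_MAX);
            return 0;
    }

That prints ~5.9e19 against an S64_MAX of ~9.2e18, so this one
entity's contribution alone is enough to overflow the sum on enqueue.
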
>
> That led me to hypothesise that there is a problem tracking
> zero_vruntime when there is but a single runnable task, and sure
> enough, I could reproduce that, albeit not at such a scale as to lead to
> such problems (probably too much noise on my machine).
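
FWIW, that hypothesis would also explain the magnitude above: with a
lone runnable task there is no enqueue/dequeue churn to re-centre
anything. A toy model of how I understand the failure mode
(hypothetical, not the actual tip code):

    #include <stdio.h>

    int main(void)
    {
            /* toy: zero_vruntime only gets re-centred on tree
             * add/remove; a single task just keeps running */
            long long zero_vruntime = 0, vruntime = 0;

            for (int tick = 0; tick < 600000; tick++)
                    vruntime += 1000000; /* 1ms of service per tick */

            /* no churn, so zero_vruntime never moved */
            printf("drift = %llds\n",
                   (vruntime - zero_vruntime) / 1000000000LL);
            return 0;
    }

That leaves ~600s of drift, the same order as the dump, and once a
second (heavy) entity finally shows up, its key against the stale
zero_vruntime is huge.
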
>
> I ended up with the below, and I've already pushed out a fresh
> queue/sched/core. Could you please test again?

Thank you for looking into this. I'll merge this onto tip:sched/urgent
and take it for a spin overnight with PARANOID_AVG like last time, and
see if the sum_shifts increment this time around.

Will report back tomorrow on the status.

As for the patch itself, I'll take a look at it tomorrow - limited by my
drowsiness at the moment :-)

--
Thanks and Regards,
Prateek