Message-ID: <20260203111134.GL1282955@noisy.programming.kicks-ass.net>
Date: Tue, 3 Feb 2026 12:11:34 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: K Prateek Nayak <kprateek.nayak@....com>
Cc: mingo@...nel.org, juri.lelli@...hat.com, vincent.guittot@...aro.org,
	dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
	mgorman@...e.de, vschneid@...hat.com, linux-kernel@...r.kernel.org,
	wangtao554@...wei.com, quzicheng@...wei.com,
	wuyun.abel@...edance.com, dsmythies@...us.net
Subject: Re: [PATCH 0/4] sched: Various reweight_entity() fixes

On Tue, Feb 03, 2026 at 12:15:56PM +0530, K Prateek Nayak wrote:
> Hello Peter,
> 
> On 1/30/2026 3:04 PM, Peter Zijlstra wrote:
> > Two issues related to reweight_entity() were raised; poking at all that got me
> > these patches.
> > 
> > They're in queue.git/sched/core and I spent most of yesterday staring at traces
> > trying to find anything wrong. So far, so good.
> > 
> > Please test.
> 
> I put this on top of tip:sched/urgent + tip:sched/core which contains Ingo's
> cleanup of removing the union and at some point in the benchmark run I hit:
> 
>     BUG: kernel NULL pointer dereference, address: 0000000000000051

:-(

> 
> so something went sideways with the avg_vruntime calculation I presume.
> I'm rerunning with the PARANOID_AVG feat now.
> 
> Just re-running the particular schbench variant hasn't crashed the kernel
> in the half hour it has been running so I've re-triggered the same set of
> benchmarks to see if flipping PARANOID_AVG makes any difference.

If you run with PARANOID_AVG, the condition ends up visible as:

  grep shift /debug/sched/debug

If any of the fields are !0, you tripped an overflow.

Once it's !0, you can't get it back to 0 other than by rebooting (except
perhaps if it's cgroup things, in which case you can destroy and
re-create the cgroups, I suppose).
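
For a longer benchmark run, the grep-and-eyeball check above could be scripted roughly as follows. This is only a sketch: the helper name nonzero_shift_fields is hypothetical, and it assumes the shift fields appear as "  .name : value" rows like the other entries in the sched debug file, which this thread does not confirm.

```python
import re

# Assumption: shift fields are printed as "  .some_shift_name : <integer>".
# Neither the exact field name nor the row layout is confirmed by the thread.
FIELD_RE = re.compile(r"^\s*\.?(?P<name>\S*shift\S*)\s*:\s*(?P<value>-?\d+)\s*$")


def nonzero_shift_fields(text):
    """Return (line_number, name, value) for every shift field that is not 0."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = FIELD_RE.match(line)
        if m is None:
            continue  # not a shift row, or not in the assumed layout
        value = int(m.group("value"))
        if value != 0:
            hits.append((lineno, m.group("name"), value))
    return hits
```

Feeding it the contents of /debug/sched/debug (or wherever debugfs is mounted) and getting an empty list back would correspond to "no overflow tripped"; any entry in the list points at the line to look at.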

Anyway, if you can reproduce without PARANOID_AVG (or indeed have
tripped overflow) could you share the specific schbench invocation you
used?

I'm not sure I have valuable tracing patches, I just stick random
trace_printk()s in.
