lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20241002084932.GN5594@noisy.programming.kicks-ass.net>
Date: Wed, 2 Oct 2024 10:49:32 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: Mike Galbraith <efault@....de>
Cc: Vishal Chourasia <vishalc@...ux.ibm.com>, linux-kernel@...r.kernel.org,
	Ingo Molnar <mingo@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>, luis.machado@....com
Subject: Re: sched/fair: Kernel panics in pick_next_entity

On Tue, Oct 01, 2024 at 10:30:26AM +0200, Mike Galbraith wrote:
> On Tue, 2024-10-01 at 00:45 +0530, Vishal Chourasia wrote:
> > >
> > for sanity, I ran the workload (kernel compilation) on the base commit
> > where the kernel panic was initially observed, which resulted in a
> > kernel panic, along with it couple of warnings where also printed on the
> > console, and a circular locking dependency warning with it.
> >
> > Kernel 6.11.0-kp-base-10547-g684a64bf32b6 on an ppc64le
> >
> > ------------[ cut here ]------------
> >
> > ======================================================
> > WARNING: possible circular locking dependency detected
> > 6.11.0-kp-base-10547-g684a64bf32b6 #69 Not tainted
> > ------------------------------------------------------
> 
> ...
> 
> > --- interrupt: 900
> > se->sched_delayed
> > WARNING: CPU: 1 PID: 27867 at kernel/sched/fair.c:6062 unthrottle_cfs_rq+0x644/0x660
> 
> ...that warning also spells eventual doom for the box, here it does
> anyway, running LTPs cfs_bandwidth01 testcase and hackbench together,
> box grinds to a halt in pretty short order.
> 

Right, I've picked up your patch for sched/urgent. But this does make me
question Vishal's setup.

He said all he does is compile a kernel, but afaik no regular setup uses
CFS bandwidth by default. So something is 'special' at his end that he's
not been telling us about.

Vishal, could you expand upon your configuration? How come you're using
CFS bandwidth, what else is special?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ