Message-ID: <Zv2PbgNZPgV-MbB2@linux.ibm.com>
Date: Wed, 2 Oct 2024 23:52:38 +0530
From: Vishal Chourasia <vishalc@...ux.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Mike Galbraith <efault@....de>, linux-kernel@...r.kernel.org,
        Ingo Molnar <mingo@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Juri Lelli <juri.lelli@...hat.com>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>,
        Mel Gorman <mgorman@...e.de>, Valentin Schneider <vschneid@...hat.com>,
        luis.machado@....com
Subject: Re: sched/fair: Kernel panics in pick_next_entity

On Wed, Oct 02, 2024 at 10:49:32AM +0200, Peter Zijlstra wrote:
> On Tue, Oct 01, 2024 at 10:30:26AM +0200, Mike Galbraith wrote:
> > On Tue, 2024-10-01 at 00:45 +0530, Vishal Chourasia wrote:
> > > >
> > > for sanity, I ran the workload (kernel compilation) on the base commit
> > > where the kernel panic was initially observed, which resulted in a
> > > kernel panic, along with it couple of warnings where also printed on the
> > > console, and a circular locking dependency warning with it.
> > >
> > > Kernel 6.11.0-kp-base-10547-g684a64bf32b6 on an ppc64le
> > >
> > > ------------[ cut here ]------------
> > >
> > > ======================================================
> > > WARNING: possible circular locking dependency detected
> > > 6.11.0-kp-base-10547-g684a64bf32b6 #69 Not tainted
> > > ------------------------------------------------------
> > 
> > ...
> > 
> > > --- interrupt: 900
> > > se->sched_delayed
> > > WARNING: CPU: 1 PID: 27867 at kernel/sched/fair.c:6062 unthrottle_cfs_rq+0x644/0x660
> > 
> > ...that warning also spells eventual doom for the box, here it does
> > anyway, running LTPs cfs_bandwidth01 testcase and hackbench together,
> > box grinds to a halt in pretty short order.
> > 
> 
> Right, I've picked up your patch for sched/urgent. But this does make me
> question Vishal's setup.
> 
> He said all he does is compile a kernel, but afaik no regular setup uses
> CFS bandwidth by default. So something is 'special' at his end that he's
> not been telling us about.
Yes Peter, I'm compiling the kernel from source. While I'm not running the 
compilation within a cgroup that has bandwidth limits set, there are some 
system services running in the background that do have bandwidth 
limitations applied.


# find . -name cpu.max -exec cat {} +
max 100000
max 100000
max 100000
max 100000
max 100000
max 100000
5000 100000
34000 100000
10000 100000
31000 100000
max 100000
max 100000
max 100000
max 100000
max 100000
max 100000
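
The find output above drops the paths, so it does not show which cgroups carry the four limits. Each cpu.max line is "<quota> <period>" in microseconds, with "max" meaning no limit, so "5000 100000" allows 5 ms of CPU time per 100 ms period. A minimal sketch that pairs each limit with its cgroup, assuming a cgroup v2 hierarchy mounted at /sys/fs/cgroup (the function names are illustrative, not from any existing tool):

```python
import os


def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content: '<quota|max> <period>' in microseconds.

    Returns (quota, period); quota is None when the file says 'max'.
    """
    fields = text.split()
    quota = None if fields[0] == "max" else int(fields[0])
    return quota, int(fields[1])


def limited_cgroups(root="/sys/fs/cgroup"):
    """Return sorted (relative_path, quota, period) for cgroups with a finite quota."""
    results = []
    for dirpath, _dirs, files in os.walk(root):
        if "cpu.max" not in files:
            continue
        try:
            with open(os.path.join(dirpath, "cpu.max")) as f:
                quota, period = parse_cpu_max(f.read())
        except OSError:
            # The cgroup may have been removed while walking; skip it.
            continue
        if quota is not None:
            results.append((os.path.relpath(dirpath, root), quota, period))
    return sorted(results)
```

Calling limited_cgroups() and printing each tuple lists only the bandwidth-limited groups, which is the information needed to tell which services can reach the throttle/unthrottle path.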

> 
> Vishal, could you expand upon your configuration? How come you're using
> CFS bandwidth, what else is special?
CONFIG_CFS_BANDWIDTH is enabled by default in both the
pseries_le_defconfig and the distro kernel config I'm using for the
compilation.
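
For anyone who wants to confirm this on their own tree, here is a minimal sketch that looks up a symbol in the text of a kernel .config file. The helper name is illustrative, and where the config file lives (for example a .config in the build directory) depends on the setup and is an assumption here:

```python
import re


def config_value(config_text, symbol):
    """Return the value assigned to `symbol` in kernel .config text, or None.

    Disabled options appear as '# CONFIG_FOO is not set' and therefore
    yield None, the same as options that are absent.
    """
    pattern = re.compile(rf"^{re.escape(symbol)}=(.*)$", re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None
```

With the file contents read into a string, config_value(text, "CONFIG_CFS_BANDWIDTH") returning "y" means bandwidth control is built in.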

Let me know if you need any more info. I hope I have answered your
queries.

Thanks!
