Date:   Sat, 10 Sep 2016 03:19:38 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Rich Felker <dalias@...c.org>
Cc:     linux-kernel@...r.kernel.org, john.stultz@...aro.org,
        tglx@...utronix.de
Subject: Re: rcu_sched stalls in idle task introduced in pre-4.8?

On Thu, Sep 08, 2016 at 06:16:53PM -0400, Rich Felker wrote:
> On Wed, Aug 03, 2016 at 09:16:31AM -0700, Paul E. McKenney wrote:
> > On Tue, Aug 02, 2016 at 01:45:04PM -0700, Paul E. McKenney wrote:
> > > On Tue, Aug 02, 2016 at 04:32:17PM -0400, Rich Felker wrote:
> > > > On Tue, Aug 02, 2016 at 12:48:02PM -0700, Paul E. McKenney wrote:
> > 
> > [ . . . ]
> > 
> > > > > Does the problem reproduce easily?
> > > > 
> > > > Yes, it happens right after boot and repeats every 30-90 seconds or
> > > > so.
> > > 
> > > Well, that at least makes it easier to test any patches!
> > > 
> > > > > A bisection might be very helpful.
> > > > 
> > > > Bisection would require some manual work to set up because the whole
> > > > reason I was rebasing on Linus's tree was to adapt the drivers to
> > > > upstream infrastructure changes (the new cpuhp stuff replacing
> > > > notifier for cpu starting). The unfortunate way it was done, each
> > > > driver adds an enum to linux/cpuhotplug.h so all the patches have
> > > > gratuitous conflicts. In addition, for older revisions in Linus's
> > > > tree, there's at least one show-stopping (hang during boot) bug that
> > > > needs a cherry-pick to fix. There may be other small issues too. I
> > > > don't think they're at all insurmountable but it requires an annoying
> > > > amount of scripting.
> > > 
> > > I had to ask!  Might eventually be necessary, but let's see what we
> > > can learn from what you currently have.
> > 
> > And at first glance, my overnight run looks uglier than I would expect.
> > I am now running tests at v4.7, and will run other tests to see if
> > there really is a statistically significant degradation.  If there is,
> > then I might be able to bisect, though with nine-hour runs this could
> > take quite some time.
> 
> Any more thoughts on this? I'm testing v4.8-rc5 (plus jcore drivers
> not yet upstream) and it's still happening.

Not seeing it, but please do send me a recent splat from your dmesg and
your .config.

Because I am not seeing it, I also suggest inspecting your jcore drivers
with the information in Documentation/RCU/stallwarn.txt in mind.

								Thanx, Paul
