linux-kernel - Re: [patch 61/66] timers: Convert to hotplug state machine

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20160725153543.GB10939@linutronix.de>
Date:	Mon, 25 Jul 2016 17:35:43 +0200
From:	rcochran@...utronix.de
To:	Jon Hunter <jonathanh@...dia.com>
Cc:	Anna-Maria Gleixner <anna-maria@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...nel.org>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>
Subject: Re: [patch 61/66] timers: Convert to hotplug state machine

On Mon, Jul 25, 2016 at 03:56:48PM +0100, Jon Hunter wrote:
> > There is a hidden dependency between:
> >
> > - timers
> > - Block multiqueue
> > - rcutree
> >
> > If timers_dead_cpu() comes later than blk_mq_queue_reinit_notify()
> > that latter function causes a RCU stall.
> 
> After this change is applied I am seeing RCU stalls during suspend
> on Tegra. I guess I am hitting the case mentioned above? How should
> this be avoided?

The problem that I had found was a hidden dependency.  When I
initially placed the timers callback into the new HP state list, that
caused a stall because the dependency was broken.  The old code worked
by luck, based on the order of the notifier registrations.  The new
code makes the old implicit ordering explicit, so it should work just
as well as before (famous last words).

> Interestingly I am only seeing the above when using the ARM
> multi_v7_defconfig kernel configuration and not with the tegra_defconfig.
> One key difference between these is that the multi_v7_defconfig does not
> have CONFIG_PREEMPT enabled. Initial testing shows enabling CONFIG_PREEMPT
> for multi_v7_defconfig makes the problem go away.

Just to be sure, this problem didn't exist before the HP rework, that
is, suspend worked fine with and without CONFIG_PREEMPT, right?

I see if I can find a tegra system to test with...

Thanks,
Richard