Message-ID: <20201201191441.GW3040@hirez.programming.kicks-ass.net>
Date: Tue, 1 Dec 2020 20:14:41 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Mark Rutland <mark.rutland@....com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>,
Christian Borntraeger <borntraeger@...ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Sven Schnelle <svens@...ux.ibm.com>,
Thomas Gleixner <tglx@...utronix.de>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>
Subject: Re: [GIT pull] locking/urgent for v5.10-rc6

On Tue, Dec 01, 2020 at 06:57:37PM +0000, Mark Rutland wrote:
> On Tue, Dec 01, 2020 at 07:15:06PM +0100, Peter Zijlstra wrote:
> > On Tue, Dec 01, 2020 at 03:55:19PM +0100, Peter Zijlstra wrote:
> > > On Tue, Dec 01, 2020 at 06:46:44AM -0800, Paul E. McKenney wrote:
> > >
> > > > > So after having talked to Sven a bit, the thing that is happening, is
> > > > > that this is the one place where we take interrupts with RCU being
> > > > > disabled. Normally RCU is watching and all is well, except during idle.
> > > >
> > > > Isn't interrupt entry supposed to invoke rcu_irq_enter() at some point?
> > > > Or did this fall victim to recent optimizations?
> > >
> > > It does, but the problem is that s390 is still using
> >
> > I might've been too quick there, I can't actually seem to find where
> > s390 does rcu_irq_enter()/exit().
> >
> > Also, I'm thinking the below might just about solve the current problem.
> > The next problem would then be it calling TRACE_IRQS_ON after it did
> > rcu_irq_exit()... :/
>
> I gave this patch a go under QEMU TCG atop v5.10-rc6 s390 defconfig with
> PROVE_LOCKING and DEBUG_ATOMIC_SLEEP. It significantly reduces the
> number of lockdep splats, but IIUC we need to handle the io_int_handler
> path in addition to the ext_int_handler path, and there's a remaining
> lockdep splat (below).

I'm amazed it didn't actually make things worse, given how I failed to
spot that do_IRQ() was arch code, etc.

> If this ends up looking like we'll need more point-fixes, I wonder if we
> should conditionalise the new behaviour of the core idle code under a
> new CONFIG symbol for now, and opt-in x86 and arm64, then transition the
> rest once they've had a chance to test. They'll still be broken in the
> meantime, but no more so than they previously were.

We can do that I suppose... :/