[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20201202075427.GA5838@osiris>
Date: Wed, 2 Dec 2020 08:54:27 +0100
From: Heiko Carstens <hca@...ux.ibm.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Sven Schnelle <svens@...ux.ibm.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
"Paul E. McKenney" <paulmck@...nel.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
the arch/x86 maintainers <x86@...nel.org>
Subject: Re: [GIT pull] locking/urgent for v5.10-rc6
> > But but but...
> >
> > do_idle() # IRQs on
> > local_irq_disable(); # IRQs off
> > defaul_idle_call() # IRQs off
> lockdep_hardirqs_on(); # IRQs off, but lockdep things they're on
> > arch_cpu_idle() # IRQs off
> > enabled_wait() # IRQs off
> > raw_local_save() # still off
> > psw_idle() # very much off
> > ext_int_handler # get an interrupt ?!?!
> rcu_irq_enter() # lockdep thinks IRQs are on <- FAIL
>
> I can't much read s390 assembler, but ext_int_handler() has a
> TRACE_IRQS_OFF, which would be sufficient to re-align the lockdep state
> with the actual state, but there's some condition before it, what's that
> test and is that right?
After digging a bit into our asm code: no, it is not right, and only
for psw_idle() it is wrong.
What happens with the current code:
- default_idle_call() calls lockdep_hardirqs_on() before calling into
arch_cpu_idle()
- our arch_cpu_idle() calls psw_idle() which enables irqs. the irq
handler will call/use the SWITCH_ASYNC macro which clears the
interrupt enabled bits in the old program status word (_only_ for
psw_idle)
- this again causes the interrupt handler to _not_ call TRACE_IRQS_OFF
and therefore lockdep thinks interrupts are enabled within the
interrupt handler
So I guess my patch which I sent yesterday evening should fix all that
mess - plus an explicit trace_hardirqs_off() call in our udelay
implementation is required now.
I'll send a proper patch later.
Powered by blists - more mailing lists