Message-ID: <20201201110724.GL3092@hirez.programming.kicks-ass.net>
Date:   Tue, 1 Dec 2020 12:07:24 +0100
From:   Peter Zijlstra <peterz@...radead.org>
To:     Christian Borntraeger <borntraeger@...ibm.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Sven Schnelle <svens@...ux.ibm.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        "Paul E. McKenney" <paulmck@...nel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        the arch/x86 maintainers <x86@...nel.org>
Subject: Re: [GIT pull] locking/urgent for v5.10-rc6

On Tue, Dec 01, 2020 at 09:07:34AM +0100, Peter Zijlstra wrote:
> On Mon, Nov 30, 2020 at 08:31:32PM +0100, Christian Borntraeger wrote:
> > On 30.11.20 19:04, Linus Torvalds wrote:
> > > On Mon, Nov 30, 2020 at 5:03 AM Peter Zijlstra <peterz@...radead.org> wrote:
> > >>
> > >>> But but but...
> > >>>
> > >>>   do_idle()                   # IRQs on
> > >>>     local_irq_disable();      # IRQs off
> > >>>     default_idle_call()       # IRQs off
> > >>         lockdep_hardirqs_on();  # IRQs off, but lockdep thinks they're on
> > >>>       arch_cpu_idle()         # IRQs off
> > >>>         enabled_wait()        # IRQs off
> > >>>         raw_local_save()      # still off
> > >>>         psw_idle()            # very much off
> > >>>           ext_int_handler     # get an interrupt ?!?!
> > >>               rcu_irq_enter()   # lockdep thinks IRQs are on <- FAIL
> > >>
> > >> I can't much read s390 assembler, but ext_int_handler() has a
> > >> TRACE_IRQS_OFF, which would be sufficient to re-align the lockdep state
> > >> with the actual state, but there's some condition before it, what's that
> > >> test and is that right?
> > > 
> > > I think that "psw_idle()" enables interrupts, exactly like x86 does.
> 
> (like ye olde x86, modern x86 idles with interrupts disabled)
> 
> > Yes, by definition.  Otherwise it would be a software error state.
> > The interesting part is the lpswe instruction at the end (load PSW) 
> > which loads the full PSW, which contains interrupt enablement, wait bit,
> > condition code, paging enablement, machine check enablement, the address,
> > and others. The idle psw is enabled for interrupts and has the wait bit
> > set. If the wait bit is set and interrupts are off this is called "disabled
> > wait" and is used for panic, shutdown etc. 
> 
> OK, but at that point, hardware interrupt state is on, lockdep thinks
> it's on. And we take an interrupt, just like any old regular interrupt
> enabled region.
> 
> But then the exception handler (ext_int_handler), which I'm assuming is
> run by the hardware with hardware interrupts disabled again, should be
> calling into lockdep to tell it interrupts were disabled. IOW that
> TRACE_IRQS_OFF bit in there.
> 
> But that doesn't seem to be working right. Why? Because afaict this is
> then the exact normal flow of things, but it's only going sideways
> during this idle thing.
> 
> What's going 'funny' ?
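
To make the state mismatch in the quoted trace easier to follow, here is a
toy model in Python (not kernel code) that tracks two flags side by side: the
hardware interrupt state and what lockdep believes. The names IrqModel,
idle_trace and handler_informs_lockdep are hypothetical, and the flag is only
a stand-in for "whether the TRACE_IRQS_OFF in ext_int_handler takes effect";
it makes no claim about why it might not.

```python
from dataclasses import dataclass, field


@dataclass
class IrqModel:
    """Toy model: real hardware IRQ state vs. the state lockdep believes."""
    hw_irqs_on: bool = True
    lockdep_irqs_on: bool = True
    log: list = field(default_factory=list)

    def step(self, name, hw=None, lockdep=None):
        # None means "this step does not change that flag".
        if hw is not None:
            self.hw_irqs_on = hw
        if lockdep is not None:
            self.lockdep_irqs_on = lockdep
        self.log.append((name, self.hw_irqs_on, self.lockdep_irqs_on))


def idle_trace(handler_informs_lockdep):
    """Replay the idle path from the quoted trace (assumed ordering)."""
    m = IrqModel()
    m.step("do_idle")                                    # IRQs on
    m.step("local_irq_disable", hw=False, lockdep=False)  # IRQs off
    m.step("lockdep_hardirqs_on", lockdep=True)           # announced early
    m.step("psw_idle", hw=True)        # idle PSW is enabled for interrupts
    # Interrupt entry: hardware IRQs go off again. Whether lockdep is told
    # is controlled by the hypothetical flag.
    m.step("ext_int_handler", hw=False,
           lockdep=False if handler_informs_lockdep else None)
    m.step("rcu_irq_enter")
    return m


if __name__ == "__main__":
    for name, hw, ld in idle_trace(False).log:
        print(f"{name:22s} hw={hw!s:5s} lockdep={ld!s:5s}")
```

With handler_informs_lockdep=False the last entry has hardware IRQs off while
lockdep still believes they are on, which is the "FAIL" line in the trace;
with True the two flags agree again by the time rcu_irq_enter is reached.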

So after having talked to Sven a bit, what is happening is that this is
the one place where we take interrupts with RCU disabled (not watching).
Normally RCU is watching and all is well, except during idle.
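
As a rough illustration of that last point, the sketch below (Python, a toy
model rather than anything from the kernel) records for each interrupt
whether RCU was watching when it arrived. The function name and the event
names "idle_enter", "idle_exit" and "irq" are made up for this example.

```python
def interrupt_contexts(events):
    """For each 'irq' event, report whether RCU was watching at that time.

    Assumption of this toy model: RCU stops watching on 'idle_enter' and
    starts watching again on 'idle_exit'.
    """
    rcu_watching = True
    seen = []
    for ev in events:
        if ev == "idle_enter":
            rcu_watching = False
        elif ev == "idle_exit":
            rcu_watching = True
        elif ev == "irq":
            seen.append(rcu_watching)
        else:
            raise ValueError(f"unknown event: {ev!r}")
    return seen


if __name__ == "__main__":
    print(interrupt_contexts(["irq", "idle_enter", "irq", "idle_exit", "irq"]))
```

Only the interrupt that lands between idle_enter and idle_exit is taken with
RCU not watching; the ones before and after are the normal case.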
