linux-kernel - Re: [PATCH 1/1] ppc/crash: Skip spinlocks during crash

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20200330110231.GG20696@hirez.programming.kicks-ass.net>
Date:   Mon, 30 Mar 2020 13:02:31 +0200
From:   Peter Zijlstra <peterz@...radead.org>
To:     Christophe Leroy <christophe.leroy@....fr>
Cc:     Leonardo Bras <leonardo@...ux.ibm.com>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Michael Ellerman <mpe@...erman.id.au>,
        Enrico Weigelt <info@...ux.net>,
        Allison Randal <allison@...utok.net>,
        Thomas Gleixner <tglx@...utronix.de>,
        linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH 1/1] ppc/crash: Skip spinlocks during crash

On Fri, Mar 27, 2020 at 07:50:20AM +0100, Christophe Leroy wrote:
> 
> 
> Le 26/03/2020 à 23:28, Leonardo Bras a écrit :
> > During a crash, there is chance that the cpus that handle the NMI IPI
> > are holding a spin_lock. If this spin_lock is needed by crashing_cpu it
> > will cause a deadlock. (rtas_lock and printk logbuf_log as of today)
> > 
> > This is a problem if the system has kdump set up, given if it crashes
> > for any reason kdump may not be saved for crash analysis.
> > 
> > Skip spinlocks after NMI IPI is sent to all other cpus.
> > 
> > Signed-off-by: Leonardo Bras <leonardo@...ux.ibm.com>
> > ---
> >   arch/powerpc/include/asm/spinlock.h | 6 ++++++
> >   arch/powerpc/kexec/crash.c          | 3 +++
> >   2 files changed, 9 insertions(+)
> > 
> > diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
> > index 860228e917dc..a6381d110795 100644
> > --- a/arch/powerpc/include/asm/spinlock.h
> > +++ b/arch/powerpc/include/asm/spinlock.h
> > @@ -111,6 +111,8 @@ static inline void splpar_spin_yield(arch_spinlock_t *lock) {};
> >   static inline void splpar_rw_yield(arch_rwlock_t *lock) {};
> >   #endif
> > +extern bool crash_skip_spinlock __read_mostly;
> > +
> >   static inline bool is_shared_processor(void)
> >   {
> >   #ifdef CONFIG_PPC_SPLPAR
> > @@ -142,6 +144,8 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
> >   		if (likely(__arch_spin_trylock(lock) == 0))
> >   			break;
> >   		do {
> > +			if (unlikely(crash_skip_spinlock))
> > +				return;
> 
> You are adding a test that reads a global var in the middle of a so hot path
> ? That must kill performance. Can we do different ?

This; adding code to a super hot patch like this for an exceptional case
like the crash handling seems like a very very bad trade to me.

One possible solution is to simply write 0 to the affected spinlocks
after sending the NMI IPI thing, no?