Message-ID: <f4447980747f4633ae720825128663bda0101188.camel@linux.ibm.com>
Date:   Fri, 27 Mar 2020 15:15:12 -0300
From:   Leonardo Bras <leonardo@...ux.ibm.com>
To:     Michael Ellerman <mpe@...erman.id.au>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Benjamin Herrenschmidt <benh@...nel.crashing.org>,
        Paul Mackerras <paulus@...ba.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Alexios Zavras <alexios.zavras@...el.com>,
        Christophe Leroy <christophe.leroy@....fr>
Cc:     linuxppc-dev@...ts.ozlabs.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] ppc/crash: Skip spinlocks during crash

Hello Michael,

On Fri, 2020-03-27 at 14:50 +1100, Michael Ellerman wrote:
> Hi Leonardo,
> 
> Leonardo Bras <leonardo@...ux.ibm.com> writes:
> > During a crash, there is a chance that the CPUs that handle the NMI IPI
> > are holding a spin_lock. If this spin_lock is needed by crashing_cpu, it
> > will cause a deadlock. (rtas_lock and printk's logbuf_lock as of today)
> 
> Please give us more detail on how those locks are causing you trouble; a
> stack trace would be good if you have it.

Sure, I have hit it in printk and rtas_call, as said before.

After crash_send_ipi(), the code checks how many CPUs are accounted in
cpus_in_crash, and once they all are, it prints "IPI complete". That
printk call itself already got stuck on a spinlock, for example.
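
Roughly, the flow looks like this (a paraphrase from memory, not a
verbatim copy of the powerpc crash code; crash_kexec_prepare_cpus_sketch
is just an illustrative name):

	static void crash_kexec_prepare_cpus_sketch(void)
	{
		unsigned int ncpus = num_online_cpus() - 1;	/* everyone but us */

		crash_send_ipi(crash_ipi_callback);

		/* wait until every other CPU has accounted itself */
		while (atomic_read(&cpus_in_crash) < ncpus)
			mdelay(1);	/* the real code also has a timeout */

		printk(KERN_EMERG "IPI complete\n");	/* may spin on the printk lock */
	}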

Here are the stack traces:

#0  arch_spin_lock 
#1  do_raw_spin_lock 
#2  __raw_spin_lock 
#3  _raw_spin_lock 
#4  vprintk_emit 
#5  vprintk_func
#7  crash_kexec_prepare_cpus 
#8  default_machine_crash_shutdown
#9  machine_crash_shutdown 
#10 __crash_kexec
#11 crash_kexec
#12 oops_end

#0 arch_spin_lock
#1  lock_rtas () 
#2  rtas_call (token=8204, nargs=1, nret=1, outputs=0x0)
#3  ics_rtas_mask_real_irq (hw_irq=4100) 
#4  machine_kexec_mask_interrupts
#5  default_machine_crash_shutdown
#6  machine_crash_shutdown 
#7  __crash_kexec
#8  crash_kexec
#9  oops_end

> > This is a problem if the system has kdump set up, given that if it
> > crashes for any reason, the kdump may not be saved for crash analysis.
> > 
> > Skip spinlocks after NMI IPI is sent to all other cpus.
> 
> We don't want to add overhead to all spinlocks for the life of the
> system, just to handle this one case.

I understand.
Other than this patch, I would propose something uglier, like forcing
the affected locks into the unlocked state once cpus_in_crash hits its
maximum value, before printing "IPI complete" (sketched below).
Creating parallel versions of these functions that don't take the lock,
just for this case, looks like overkill to me.
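
A minimal sketch of what I mean, assuming we can get at the locks (the
lock names and crash_force_unlock() are stand-ins for illustration, not
real kernel symbols, and this is untested):

	#include <linux/spinlock.h>

	/* stand-ins for printk's lock and rtas.lock, not the real symbols */
	static DEFINE_RAW_SPINLOCK(fake_printk_lock);
	static arch_spinlock_t fake_rtas_lock = __ARCH_SPIN_LOCK_UNLOCKED;

	/*
	 * Once every other CPU has entered crash_ipi_callback() it can
	 * never release a lock it was holding, so the crashing CPU could
	 * force the locks it still needs back to the unlocked state
	 * instead of spinning on them forever.
	 */
	static void crash_force_unlock(void)
	{
		raw_spin_lock_init(&fake_printk_lock);
		fake_rtas_lock = (arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
	}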

Do you have any other suggestion?

> There's already a flag that is set when the system is crashing,
> "oops_in_progress", maybe we need to use that somewhere to skip a lock
> or do an early return.

I think that would not work, because oops_in_progress should be 0 here:
oops_end() calls bust_spinlocks(0) before calling crash_kexec(), and
bust_spinlocks(0) will decrement oops_in_progress.
(just verified, it's 0 before printing "IPI complete").
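
For reference, lib/bust_spinlocks.c does roughly this (paraphrased from
memory, not a verbatim copy), which is why the flag is already back to 0
by the time crash_kexec() runs:

	void bust_spinlocks(int yes)
	{
		if (yes) {
			++oops_in_progress;
		} else {
			console_unblank();
			/* oops_end() takes this path before crash_kexec() */
			if (--oops_in_progress == 0)
				wake_up_klogd();
		}
	}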

Thank you for the feedback. :)

Best regards,
Leonardo

