netdev - Re: [PATCH 04/30] firmware: google: Convert regular spinlock into trylock on panic path

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <9581851d-6c61-a2ef-a3c4-6e2ce05eab12@igalia.com>
Date:   Wed, 4 May 2022 09:45:31 -0300
From:   "Guilherme G. Piccoli" <gpiccoli@...lia.com>
To:     Evan Green <evgreen@...omium.org>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, bhe@...hat.com,
        pmladek@...e.com, kexec@...ts.infradead.org,
        LKML <linux-kernel@...r.kernel.org>,
        bcm-kernel-feedback-list@...adcom.com,
        linuxppc-dev@...ts.ozlabs.org, linux-alpha@...r.kernel.org,
        linux-edac@...r.kernel.org, linux-hyperv@...r.kernel.org,
        linux-leds@...r.kernel.org, linux-mips@...r.kernel.org,
        linux-parisc@...r.kernel.org, Linux PM <linux-pm@...r.kernel.org>,
        linux-remoteproc@...r.kernel.org, linux-s390@...r.kernel.org,
        linux-tegra@...r.kernel.org, linux-um@...ts.infradead.org,
        linux-xtensa@...ux-xtensa.org, netdev@...r.kernel.org,
        openipmi-developer@...ts.sourceforge.net, rcu@...r.kernel.org,
        sparclinux@...r.kernel.org, xen-devel@...ts.xenproject.org,
        x86@...nel.org, kernel-dev@...lia.com, kernel@...ccoli.net,
        halves@...onical.com, fabiomirmar@...il.com,
        alejandro.j.jimenez@...cle.com,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Arnd Bergmann <arnd@...db.de>, Borislav Petkov <bp@...en8.de>,
        Jonathan Corbet <corbet@....net>, d.hatayama@...fujitsu.com,
        dave.hansen@...ux.intel.com, dyoung@...hat.com,
        feng.tang@...el.com,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        mikelley@...rosoft.com, hidehiro.kawai.ez@...achi.com,
        jgross@...e.com, john.ogness@...utronix.de,
        Kees Cook <keescook@...omium.org>, luto@...nel.org,
        mhiramat@...nel.org, mingo@...hat.com, paulmck@...nel.org,
        peterz@...radead.org, rostedt@...dmis.org,
        senozhatsky@...omium.org, Alan Stern <stern@...land.harvard.edu>,
        Thomas Gleixner <tglx@...utronix.de>, vgoyal@...hat.com,
        vkuznets@...hat.com, Will Deacon <will@...nel.org>,
        Ard Biesheuvel <ardb@...nel.org>,
        David Gow <davidgow@...gle.com>,
        Julius Werner <jwerner@...omium.org>
Subject: Re: [PATCH 04/30] firmware: google: Convert regular spinlock into
 trylock on panic path

On 03/05/2022 18:56, Evan Green wrote:
> Hi Guilherme,
> [...] 
>> Do you agree with that, or prefer really a parameter in
>> gsmi_shutdown_reason() ? I'll follow your choice =)
> 
> I'm fine with either, thanks for the link. Mostly I want to make sure
> other paths to gsmi_shutdown_reason() aren't also converted to a try.

Hi Evan, thanks for the prompt response! So, I'll proceed like I did in
s390, for consistency.

> [...]
>> Reasoning: the problem with your example is that, by default, secondary
>> CPUs are disabled in the panic path, through an IPI mechanism. IPIs take
>> precedence and interrupt the work in these CPUs, effectively
>> interrupting the "polite work" with the lock held heh
> 
> The IPI can only interrupt a CPU with irqs disabled if the IPI is an
> NMI. I haven't looked before to see if we use NMI IPIs to corral the
> other CPUs on panic. On x86, I grepped my way down to
> native_stop_other_cpus(), which looks like it does a normal IPI, waits
> 1 second, then does an NMI IPI. So, if a secondary CPU has the lock
> held, on x86 it has roughly 1s to finish what it's doing and re-enable
> interrupts before smp_send_stop() brings the NMI hammer down. I think
> this should be more than enough time for the secondary CPU to get out
> and release the lock.
> 
> So then it makes sense to me that you're fixing cases where we
> panicked with the lock held, or hung with the lock held. Given the 1
> second grace period x86 gives us, I'm on board, as that helps mitigate
> the risk that we bailed out early with the try and should have spun a
> bit longer instead. Thanks.
> 
> -Evan

Well, in the old path without "crash_kexec_post_notifiers", we indeed
end-up relying on native_stop_other_cpus() for x86 as you said, and the
"1s rule" makes sense. But after this series (or even before, if the
kernel parameter "crash_kexec_post_notifiers" was used) the function
used to stop CPUs in the panic path is crash_smp_send_stop(), and the
call chain is like:

Main CPU:
crash_smp_send_stop()
--kdump_nmi_shootdown_cpus()
----nmi_shootdown_cpus()

Then, in each CPU (except the main one, running panic() path),
we execute kdump_nmi_callback() in NMI context.

So, we seem to indeed interrupt any context (even with IRQs disabled),
increasing the likelihood of the potential lockups due to stopped CPUs
holding the locks heheh

Thanks again for the good discussion, let me know if anything I'm saying
doesn't make sense - this crash path is a bit convoluted, specially in
x86, I might have understood something wrongly =)
Cheers,


Guilherme