Message-ID: <alpine.DEB.2.21.1908201321200.2223@nanos.tec.linutronix.de>
Date:   Tue, 20 Aug 2019 13:24:03 +0200 (CEST)
From:   Thomas Gleixner <tglx@...utronix.de>
To:     Hsin-Yi Wang <hsinyi@...omium.org>
cc:     linux-arm-kernel@...ts.infradead.org,
        Russell King <linux@...linux.org.uk>,
        Catalin Marinas <catalin.marinas@....com>,
        Will Deacon <will@...nel.org>, Ingo Molnar <mingo@...hat.com>,
        Borislav Petkov <bp@...en8.de>,
        "H . Peter Anvin )" <hpa@...or.com>,
        "Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
        Kate Stewart <kstewart@...uxfoundation.org>,
        "David S . Miller" <davem@...emloft.net>,
        Viresh Kumar <viresh.kumar@...aro.org>,
        Marek Szyprowski <m.szyprowski@...sung.com>,
        Arnd Bergmann <arnd@...db.de>, Marc Zyngier <maz@...nel.org>,
        Julien Thierry <julien.thierry.kdev@...il.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Wei Li <liwei391@...wei.com>,
        Anders Roxell <anders.roxell@...aro.org>,
        Rob Herring <robh@...nel.org>,
        Aaro Koskinen <aaro.koskinen@...ia.com>,
        Daniel Thompson <daniel.thompson@...aro.org>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Rik van Riel <riel@...riel.com>,
        Waiman Long <longman@...hat.com>,
        Marcelo Tosatti <mtosatti@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Armijn Hemel <armijn@...ldur.nl>,
        Grzegorz Halat <ghalat@...hat.com>,
        Len Brown <len.brown@...el.com>,
        Shaokun Zhang <zhangshaokun@...ilicon.com>,
        Mike Rapoport <rppt@...ux.vnet.ibm.com>,
        Kees Cook <keescook@...omium.org>,
        Stephen Boyd <swboyd@...omium.org>,
        Guenter Roeck <groeck@...omium.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Alexey Dobriyan <adobriyan@...il.com>,
        Yury Norov <ynorov@...vell.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jiri Kosina <jkosina@...e.cz>,
        Mukesh Ojha <mojha@...eaurora.org>,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] smp: Add cpu unstopped mask for
 smp_send_stop/stop_other_cpus

On Tue, 20 Aug 2019, Hsin-Yi Wang wrote:

> In arm/arm64/x86, reboot IPI function uses CPU online mask to let
> primary CPU know how many secondary CPUs it has to wait for in
> smp_send_stop()/native_stop_other_cpus().
> 
> However, sometimes this would trigger unnecessary warnings, since
> interrupts and tasks might fall on a CPU that has already executed
> the reboot ipi function. This is fine since CPU is already in spinloop.
> But warnings are generated since it finds that the CPU is marked as
> offline. The warnings are supposed to catch failures in normal hotplug
> offline CPUs, and reboot isn't a regular hotplug. So instead of reusing
> online masks, we should use a new mask in reboot IPI functions to do the
> work.
> 
> Take tick broadcast for example. If broadcast and smp_send_stop()
> happen together, most of the time, the CPU getting earliest broadcast
> is already in spinloop and thus won't do anything. If the first
> broadcast arrives to CPU that hasn't already executed reboot ipi, it
> would try to IPI another CPU, but the CPU is already marked as offline,
> and warning comes out:
> 
> [   22.481523] reboot: Restarting system
> [   22.481608] WARNING: CPU: 4 PID: 0 at ...

That is really the completely wrong approach. There is no valid reason that a
regular reboot needs to use a shortcut homebrewed variant of stopping CPUs.

The proper solution is to restrict this mechanism to emergency reboots and
let the normal reboot go through the regular CPU hotplug mechanism. That
avoids all that duct tape which is just bound to break tomorrow again.

In case of an emergency reboot, we really do not care about any extra stuff
triggered. The machine is hosed already.
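
To illustrate the split described above, here is a minimal userspace sketch
in C. It is not kernel code: every name in it (model_reboot(),
model_hotplug_offline(), model_stop_ipi(), struct cpu_model) is hypothetical
and only models the idea. The regular path moves pending work off a CPU
before marking it offline; the emergency path parks the CPU immediately and
does not care what is left behind.

```c
/* Userspace model only: all names are hypothetical, none is a kernel API. */
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

enum reboot_kind { REBOOT_REGULAR, REBOOT_EMERGENCY };

struct cpu_model {
	bool online;       /* still visible in the modelled online mask */
	bool spinning;     /* parked by the stop shortcut */
	int  pending_work; /* timers/tasks still targeted at this CPU */
};

static struct cpu_model cpus[NR_CPUS];

/* Bring every modelled CPU online with one unit of pending work. */
static void model_init(void)
{
	for (int i = 0; i < NR_CPUS; i++)
		cpus[i] = (struct cpu_model){ .online = true, .pending_work = 1 };
}

/* Orderly path: migrate work to a CPU that stays up, then go offline. */
static void model_hotplug_offline(int cpu, int target)
{
	assert(cpus[target].online);
	cpus[target].pending_work += cpus[cpu].pending_work;
	cpus[cpu].pending_work = 0;
	cpus[cpu].online = false;
}

/* Shortcut path: park the CPU right away, whatever is still queued on it. */
static void model_stop_ipi(int cpu)
{
	cpus[cpu].online = false;
	cpus[cpu].spinning = true;
}

/*
 * Take down all CPUs except boot_cpu. Returns how many CPUs went offline
 * with work still pending, i.e. the situation that provokes warnings.
 */
static int model_reboot(enum reboot_kind kind, int boot_cpu)
{
	int stranded = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (cpu == boot_cpu)
			continue;
		if (kind == REBOOT_REGULAR)
			model_hotplug_offline(cpu, boot_cpu);
		else
			model_stop_ipi(cpu);
		if (cpus[cpu].pending_work > 0)
			stranded++;
	}
	return stranded;
}
```

In this model, calling model_init() and then model_reboot(REBOOT_REGULAR, 0)
leaves no CPU offline with pending work, so there is nothing for a warning to
catch. model_reboot(REBOOT_EMERGENCY, 0) strands work on every secondary CPU,
which is acceptable only because the machine is going down anyway.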

Thanks

	tglx