Message-ID: <87h80vwta7.fsf@nanos.tec.linutronix.de>
Date: Thu, 16 Jan 2020 12:10:56 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Hsin-Yi Wang <hsinyi@...omium.org>
Cc: Josh Poimboeuf <jpoimboe@...hat.com>,
Ingo Molnar <mingo@...nel.org>,
Peter Zijlstra <peterz@...radead.org>,
Jiri Kosina <jkosina@...e.cz>,
Pavankumar Kondeti <pkondeti@...eaurora.org>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Aaro Koskinen <aaro.koskinen@...ia.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Will Deacon <will@...nel.org>,
Fenghua Yu <fenghua.yu@...el.com>,
James Morse <james.morse@....com>,
Mark Rutland <mark.rutland@....com>,
Heiko Carstens <heiko.carstens@...ibm.com>,
Guenter Roeck <groeck@...omium.org>,
Stephen Boyd <swboyd@...omium.org>,
lkml <linux-kernel@...r.kernel.org>,
"moderated list\:ARM\/FREESCALE IMX \/ MXC ARM ARCHITECTURE"
<linux-arm-kernel@...ts.infradead.org>, linux-csky@...r.kernel.org,
linux-ia64@...r.kernel.org, linux-mips@...r.kernel.org,
linux-parisc@...r.kernel.org,
linuxppc-dev <linuxppc-dev@...ts.ozlabs.org>,
linux-s390@...r.kernel.org,
Linux-sh list <linux-sh@...r.kernel.org>,
sparclinux@...r.kernel.org, linux-xtensa@...ux-xtensa.org,
Linux PM <linux-pm@...r.kernel.org>
Subject: Re: [PATCH v5] reboot: support offline CPUs before reboot

Hsin-Yi Wang <hsinyi@...omium.org> writes:
> On Thu, Jan 16, 2020 at 8:30 AM Thomas Gleixner <tglx@...utronix.de> wrote:
> We saw this issue on regular reboot (not panic) on arm64: if the tick
> broadcast and smp_send_stop() happen together and the first broadcast
> arrives at an idle CPU that has not yet executed the reboot IPI and
> entered the spin loop, that CPU tries to broadcast to another CPU. But
> the target CPU has already been marked offline by set_cpu_online() in
> the reboot IPI, so a warning is triggered, because
> tick_handle_oneshot_broadcast() checks whether it is broadcasting to
> offline CPUs. Most of the time the CPU getting the broadcast interrupt
> is already in the spin loop and thus isn't going to receive interrupts
> from the broadcast timer.

The timer broadcasting is obviously broken by the existing reboot unplug
mechanism as the outgoing CPU should remove itself from the broadcast.
Just addressing the broadcast issue is not sufficient as there are tons
of other places which rely on consistency of the various cpu masks.
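
To illustrate the kind of mask inconsistency meant here, a toy userspace
model may help. This is not kernel code: the two masks are plain integers
standing in for cpu_online_mask and the tick broadcast mask, and every
function name below is made up for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for kernel cpumasks: bit N set means CPU N is in the mask. */
static uint32_t online_mask;    /* loosely models cpu_online_mask */
static uint32_t broadcast_mask; /* loosely models the tick broadcast mask */

/* Hypothetical reboot-IPI style stop: only the online bit is cleared. */
static void stop_cpu_online_only(int cpu)
{
	assert(cpu >= 0 && cpu < 32);
	online_mask &= ~(UINT32_C(1) << cpu);
}

/* Hypothetical orderly stop: the outgoing CPU also leaves the broadcast. */
static void stop_cpu_orderly(int cpu)
{
	assert(cpu >= 0 && cpu < 32);
	broadcast_mask &= ~(UINT32_C(1) << cpu);
	online_mask &= ~(UINT32_C(1) << cpu);
}

/* Consistency check: every broadcast target must still be online. */
static bool broadcast_targets_consistent(void)
{
	return (broadcast_mask & ~online_mask) == 0;
}
```

With both masks set to 0xF, stop_cpu_online_only(2) leaves CPU 2 in the
broadcast mask while it is no longer online, so the check fails. With
stop_cpu_orderly(2) the two masks stay consistent. The broadcast mask is
only one of many such masks, which is the point above.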

> If the system supports hotplug, _cpu_down() properly handles teardown
> tasks, such as removing the CPU from timer broadcasting via
> tick_offline_cpu() etc., as well as some interrupt handling.

Well, emphasis on 'if system supports hotplug'. If not, then you are
back to square one. On ARM64 hotplug is selectable by a config option.
So either we mandate HOTPLUG_CPU for SMP and get rid of all the
ifdeffery or we need to have a mechanism which works on !HOTPLUG_CPU as
well.

That whole reboot/shutdown stuff is an impenetrable mess of notifiers
and architecture hackery, so something generic and understandable is
really required.

Thanks,
tglx