Message-ID: <20181031170136.s3ids6tm4rxxlpma@holly.lan>
Date: Wed, 31 Oct 2018 17:01:36 +0000
From: Daniel Thompson <daniel.thompson@...aro.org>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Douglas Anderson <dianders@...omium.org>,
Jason Wessel <jason.wessel@...driver.com>,
kgdb-bugreport@...ts.sourceforge.net, linux-mips@...ux-mips.org,
linux-sh@...r.kernel.org,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Catalin Marinas <catalin.marinas@....com>,
James Hogan <jhogan@...nel.org>, linux-hexagon@...r.kernel.org,
Vineet Gupta <vgupta@...opsys.com>,
Thomas Gleixner <tglx@...utronix.de>,
Philippe Ombredanne <pombredanne@...b.com>,
Kate Stewart <kstewart@...uxfoundation.org>,
Rich Felker <dalias@...c.org>,
Ralf Baechle <ralf@...ux-mips.org>,
linux-snps-arc@...ts.infradead.org,
Yoshinori Sato <ysato@...rs.sourceforge.jp>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
Will Deacon <will.deacon@....com>,
Paul Mackerras <paulus@...ba.org>,
Russell King <linux@...linux.org.uk>,
linux-arm-kernel@...ts.infradead.org,
Christophe Leroy <christophe.leroy@....fr>,
Michael Ellerman <mpe@...erman.id.au>,
Paul Burton <paul.burton@...s.com>,
linux-kernel@...r.kernel.org, Richard Kuo <rkuo@...eaurora.org>,
linuxppc-dev@...ts.ozlabs.org
Subject: Re: [PATCH v2 2/2] kgdb: Fix kgdb_roundup_cpus() for arches who used
smp_call_function()
On Wed, Oct 31, 2018 at 02:49:26PM +0100, Peter Zijlstra wrote:
> On Tue, Oct 30, 2018 at 03:18:43PM -0700, Douglas Anderson wrote:
> > Looking closely at it, it seems like a really bad idea to be calling
> > local_irq_enable() in kgdb_roundup_cpus(). If nothing else that seems
> > like it could violate spinlock semantics and cause a deadlock.
> >
> > Instead, let's use a private csd alongside
> > smp_call_function_single_async() to round up the other CPUs. Using
> > smp_call_function_single_async() doesn't require interrupts to be
> > enabled so we can remove the offending bit of code.
>
> You might want to mention that the only reason this isn't a deadlock
> itself is because there is a timeout on waiting for the slaves to
> check-in.
dbg_master_lock must be owned to call kgdb_roundup_cpus() so
the calls to smp_call_function_single_async() should never deadlock the
calling CPU unless there has been a previous failure to round up (e.g.
cores that cannot react to the round up signal).

If a round up does fail then, when we resume, there is a window (before
whatever locks prevented the IPI from being serviced are released) during
which the system will deadlock if the debugger is re-entered.

I don't think there is any point trying to round up a CPU that did not
previously respond... it should still have an IPI pending. The deadlock
can be eliminated by getting the round up code to avoid CPUs whose
csd->flags are non-zero, either by checking the flag in the kgdb code or
by adding something like smp_trycall_function_single_async().
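
To illustrate the "skip CPUs that still have a round up pending" idea, here
is a rough userspace model. It is not kernel code: struct model_csd,
MODEL_CSD_PENDING, NR_MODEL_CPUS and all the model_*() helpers are made up
for the sketch, and the pending bit merely stands in for a non-zero
csd->flags. A real version would have to decide where the check lives (kgdb
or a new smp helper) and how it is ordered against the IPI handler.

```c
#include <assert.h>

#define NR_MODEL_CPUS     4     /* hypothetical CPU count for the model */
#define MODEL_CSD_PENDING 0x01  /* hypothetical stand-in for non-zero csd->flags */

/* Hypothetical per-CPU private csd; only the flags word is modelled. */
struct model_csd {
	unsigned int flags;
};

static struct model_csd model_roundup_csd[NR_MODEL_CPUS];

/* Model of queueing the round up IPI: the csd stays busy until serviced. */
static void model_send_roundup(int cpu)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);
	model_roundup_csd[cpu].flags |= MODEL_CSD_PENDING;
}

/* Model of the target CPU servicing the IPI and releasing its csd. */
static void model_service_roundup(int cpu)
{
	assert(cpu >= 0 && cpu < NR_MODEL_CPUS);
	model_roundup_csd[cpu].flags = 0;
}

/*
 * Round up every CPU except the caller, skipping any CPU whose csd is
 * still busy from an earlier round up that was never serviced.
 * Returns the number of CPUs that were signalled this time.
 */
static int model_roundup_cpus(int this_cpu)
{
	int cpu, sent = 0;

	for (cpu = 0; cpu < NR_MODEL_CPUS; cpu++) {
		if (cpu == this_cpu)
			continue;
		/* Still pending: do not wait on it, just skip this CPU. */
		if (model_roundup_csd[cpu].flags)
			continue;
		model_send_roundup(cpu);
		sent++;
	}
	return sent;
}
```

In this model a CPU that never services its round up is simply left out of
later attempts, so re-entering the debugger never has to wait for that csd
to become free.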
Daniel.