Message-ID: <20171227161801.GE1410@arch-chirva.localdomain>
Date: Wed, 27 Dec 2017 11:18:01 -0500
From: Alexandru Chirvasitu <achirvasub@...il.com>
To: Dou Liyang <douly.fnst@...fujitsu.com>
Cc: Thomas Gleixner <tglx@...utronix.de>,
Dexuan Cui <decui@...rosoft.com>, Pavel Machek <pavel@....cz>,
kernel list <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
"Maciej W. Rozycki" <macro@...ux-mips.org>,
Mikael Pettersson <mikpelinux@...il.com>,
Josh Poulson <jopoulso@...rosoft.com>,
"Mihai Costache (Cloudbase Solutions SRL)" <v-micos@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
Marc Zyngier <marc.zyngier@....com>,
"linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
Haiyang Zhang <haiyangz@...rosoft.com>,
Simon Xiao <sixiao@...rosoft.com>,
Saeed Mahameed <saeedm@...lanox.com>,
Jork Loeser <Jork.Loeser@...rosoft.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
"devel@...uxdriverproject.org" <devel@...uxdriverproject.org>,
KY Srinivasan <kys@...rosoft.com>
Subject: Re: PROBLEM: 4.15.0-rc3 APIC causes lockups on Core 2 Duo laptop
As per instructions, I did the following:
(1)
Checked out
464e1d5 Linux 4.15-rc5
(after getting my copy up to date, fetching, pulling, etc.) and
compiled it as-is. Config attached (the one labeled 'np' for 'no
patch').
Result:
Boot with no extra parameters locks up after login, as before;
apic=debug does not panic, but locks up after login, as before;
noapic logs me in fine, but disables my wired connection (this is also
behaviour I noted previously). It sees the Ethernet card and brings it up,
but dhclient will not connect.
(2)
Applied the patch you sent below to 464e1d5; again config attached,
labeled 'p' for 'patch'. I applied it manually because git apply was
giving me errors and I didn't want to hold us back while I debug (or
rather I should say 'learn to use git apply'; first time doing it).
In any case though, the changes were as you indicated. The diff I get
with 'git show' was
---------------------------------------------------------------
irq/matrix: Remove the overused BUGON() in irq_matrix_assign_system()
diff --git a/kernel/irq/matrix.c b/kernel/irq/matrix.c
index 0ba0dd8..9292d79 100644
--- a/kernel/irq/matrix.c
+++ b/kernel/irq/matrix.c
@@ -143,8 +143,8 @@ void irq_matrix_assign_system(struct irq_matrix *m, unsigned int bit,
 	BUG_ON(m->online_maps > 1 || (m->online_maps && !replace));
 	set_bit(bit, m->system_map);
-	if (replace) {
-		BUG_ON(!test_and_clear_bit(bit, cm->alloc_map));
+
+	if (replace && test_and_clear_bit(bit, cm->alloc_map)){
 		cm->allocated--;
 		m->total_allocated--;
 	}
---------------------------------------------------------------
I deleted / inserted those lines myself, but I do believe they
precisely match what you sent. Compiled and installed the modified kernel.
Result:
*Exactly* as above, on all three attempts (no parameters, 'apic=debug'
and 'noapic').
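
For reference, the difference the patch in (2) makes can be sketched in
user space. This is only a rough model, not kernel code: the struct and
function names (cpumap_model, matrix_model, model_assign_system, etc.)
are made up for the illustration, the bitmaps are collapsed to a single
unsigned long, and the bit operations are not atomic. It shows that the
patched branch only decrements the counters when the bit was actually
set in alloc_map, which is the case where the unpatched code would have
hit BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's struct cpumap / struct irq_matrix. */
struct cpumap_model {
	unsigned long alloc_map;	/* one bit per vector (assumption: < 32 bits used) */
	unsigned int allocated;
};

struct matrix_model {
	unsigned long system_map;
	unsigned int total_allocated;
};

/* Non-atomic model of test_and_clear_bit(): returns the old bit, then clears it. */
static bool model_test_and_clear_bit(unsigned int bit, unsigned long *map)
{
	unsigned long mask = 1UL << bit;
	bool was_set = (*map & mask) != 0;

	*map &= ~mask;
	return was_set;
}

/*
 * Models the patched logic. Returns false in exactly the situation where
 * the unpatched code would have triggered BUG_ON(), i.e. replace was
 * requested but the bit was not set in alloc_map.
 */
static bool model_assign_system(struct matrix_model *m, struct cpumap_model *cm,
				unsigned int bit, bool replace)
{
	bool old_would_bug = false;

	m->system_map |= 1UL << bit;

	if (replace) {
		if (model_test_and_clear_bit(bit, &cm->alloc_map)) {
			cm->allocated--;
			m->total_allocated--;
		} else {
			old_would_bug = true;	/* patched code silently skips */
		}
	}
	return !old_would_bug;
}

/* Counts how many of two sample calls would have hit the old BUG_ON(). */
static int model_demo(void)
{
	struct matrix_model m = { .system_map = 0, .total_allocated = 1 };
	struct cpumap_model cm = { .alloc_map = 1UL << 2, .allocated = 1 };
	int would_have_bugged = 0;

	/* Bit 2 was preallocated: counters drop, no BUG in either version. */
	if (!model_assign_system(&m, &cm, 2, true))
		would_have_bugged++;
	/* Bit 5 was never allocated: old code BUGs, patched code carries on. */
	if (!model_assign_system(&m, &cm, 5, true))
		would_have_bugged++;

	assert(cm.allocated == 0 && m.total_allocated == 0);
	return would_have_bugged;
}
```

Calling model_demo() from a main() exercises both paths. The point is
just that the patch turns the "bit not set" case from a panic into a
no-op for the counters; it says nothing about the lockup after login.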
---
So the patch doesn't seem to have an effect, and the panicking is no
longer happening in 4.15-rc5 anyway (with 'apic=debug' and no patch).
Perhaps if I go back to the original bad commit in that bisect I did
and apply the patch to *that*... I'll try, but cannot at this precise
moment. I'll get back in a bit.
On Wed, Dec 27, 2017 at 04:14:23PM +0800, Dou Liyang wrote:
> Hi Alexandru,
>
> At 12/24/2017 04:01 AM, Alexandru Chirvasitu wrote:
> > On Sat, Dec 23, 2017 at 02:32:52PM +0100, Thomas Gleixner wrote:
> > > On Sat, 23 Dec 2017, Dexuan Cui wrote:
> > >
> > > > > From: Alexandru Chirvasitu [mailto:achirvasub@...il.com]
> > > > > Sent: Friday, December 22, 2017 14:29
> > > > >
> > > > > The output of that precise command run just now on a freshly-compiled
> > > > > copy of that commit is attached.
> > > > >
> > > > > On Fri, Dec 22, 2017 at 09:31:28PM +0000, Dexuan Cui wrote:
> > > > > > > From: Alexandru Chirvasitu [mailto:achirvasub@...il.com]
> > > > > > > Sent: Friday, December 22, 2017 06:21
> > > > > > >
> > > > > > > In the absence of logs, the best I can do at the moment is attach a
> > > > > > > picture of the screen I am presented with on the boot
> > > > > > > attempt.
> > > > > > > Alex
> > > > > >
> > > > > > The panic happens in irq_matrix_assign_system+0x4e/0xd0 in your picture.
> > > > > > IMO we should find which line of code causes the panic. I suppose
> > > > > > "objdump -D kernel/irq/matrix.o" can help to do that.
> > > > > >
> > > > > > Thanks,
> > > > > > -- Dexuan
> > > >
> > > > The BUG_ON panic happens at line 147:
> > > > BUG_ON(!test_and_clear_bit(bit, cm->alloc_map));
> > > >
>
> There are 2 bugs in your laptop:
>
> 1. Hard lockups on both CPUs after login
> 2. panic with "apic=debug"
>
> For the 2nd bug, please try the following patch (needs Thomas' confirmation
> :) ) on Linux 4.15-rc5. I think it can fix the panic.
>
> If the 2nd bug is fixed, let's get back to the 1st bug:
>
> Is Linus' current head 4.15-rc5 bad as well?
>
> If yes, please use "apic=debug" and give the dmesg log.
>
> Thanks,
> dou.
>
> ------------------------8<-------------------------------------------
>
> irq/matrix: Remove the overused BUGON() in irq_matrix_assign_system()
>
> Currently, x86 marks the preallocated legacy interrupts when initializing
> IRQ(native_init_IRQ), but will clear them if they are not activated in
> vector_configure_legacy().
>
> So, in irq_matrix_assign_system(), replacing an legacy vector which may
> not allocated in a cpumap->alloc_map[] with a system vector will trigger
> the BUGON();
>
> Remove the BUGON().
>
> Signed-off-by: Dou Liyang <douly.fnst@...fujitsu.com>
> ---
> kernel/irq/matrix.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/irq/matrix.c b/kernel/irq/matrix.c
> index 0ba0dd8863a7..876cbeab9ca2 100644
> --- a/kernel/irq/matrix.c
> +++ b/kernel/irq/matrix.c
> @@ -143,11 +143,12 @@ void irq_matrix_assign_system(struct irq_matrix *m,
> unsigned int bit,
> BUG_ON(m->online_maps > 1 || (m->online_maps && !replace));
>
> set_bit(bit, m->system_map);
> - if (replace) {
> - BUG_ON(!test_and_clear_bit(bit, cm->alloc_map));
> +
> + if (replace && test_and_clear_bit(bit, cm->alloc_map)){
> cm->allocated--;
> m->total_allocated--;
> }
> +
> if (bit >= m->alloc_start && bit < m->alloc_end)
> m->systembits_inalloc++;
>
> --
>
>
View attachment "config-np" of type "text/plain" (194513 bytes)
View attachment "config-p" of type "text/plain" (194513 bytes)