Message-ID: <2653.69.2.248.210.1218471388.squirrel@webmail.wolfmountaingroup.com>
Date: Mon, 11 Aug 2008 10:16:28 -0600 (MDT)
From: jmerkey@...fmountaingroup.com
To: "Andi Kleen" <andi@...stfloor.org>
Cc: jmerkey@...fmountaingroup.com, "Vivek Goyal" <vgoyal@...hat.com>,
"Andi Kleen" <andi@...stfloor.org>,
"Keith Owens" <kaos@....com.au>, "Jay Lan" <jlan@....com>,
"Christoph Lameter" <cl@...ux-foundation.org>,
"Stefan Richter" <stefanr@...6.in-berlin.de>,
"Nick Piggin" <nickpiggin@...oo.com.au>,
"Geert Uytterhoeven" <geert@...ux-m68k.org>,
"Josh Boyer" <jwboyer@...il.com>, linux-kernel@...r.kernel.org,
"Takenori Nagano" <t-nagano@...jp.nec.com>,
"Bernhard Walle" <bwalle@...e.de>
Subject: Re: [ANNOUNCE] Merkey's Kernel Debugger
> On Mon, Aug 11, 2008 at 07:11:42AM -0600, jmerkey@...fmountaingroup.com
> wrote:
>> I found a problem with APIC NMI support which seems to affect all the
>> debuggers, but appears machine specific -- at least I can reproduce it
>> with all of the MDB, KDB, and KGDB modules on my Acer 9410 dual
>
> A couple of laptop BIOS (e.g. some thinkpads) are unfortunately
> not NMI safe. There is no known workaround other than not using NMIs
> on these systems.
>
> There's unfortunately no global blacklist for these systems, although
> having one would be useful for a couple of subsystems.
>
> -Andi
>
>
I seem to have nailed down the "voodoo" sequence for reproducing it and
the sequence of failure on the Acer 9410.
Processors 0 and 1:
First set a global breakpoint (schedule) and load registers DR6/DR7.
0 -> triggers int1 breakpoint
1 -> triggers int1 breakpoint
0 -> gets debugger lock
1 -> spins at debugger lock
0 -> sends NMI to all processors but self
1 -> gets NMI while spinning at debugger lock
1 -> enters NMI code loop and spins
0 -> enters debugger console
0 -> leaves debugger console
0 -> releases spinning processors
1 -> leaves NMI code, issues IRETD (returns to debugger spinlock and spins)
0 -> releases debugger lock
1 -> gets debugger lock
1 -> sends NMI to all processors but self
...hard hang in send_IPI_allbutself(APIC_DM_NMI)....
If a delay is placed in the code that calls send_IPI_allbutself(), so that
it waits until processor 0 has left the int1 exception handler and issued
an IRETD, then the hang does not occur. This appears to be a workaround for
the problem.
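
As a rough illustration of that delay, here is a minimal user-space sketch
in C11. It is not the MDB code. The per-CPU flag in_int1_handler[], the
hooks int1_enter()/int1_exit(), and the bounded spin count are all
assumptions made for the example; only the idea of waiting for the other
processor to leave the int1 handler before sending the NMI comes from the
description above.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_CPUS 2  /* assumption: dual-core machine, as in the report */

/* Hypothetical per-CPU flag: true while that CPU is still inside the
 * int1 (debug exception) handler, i.e. it has not yet issued IRETD. */
static atomic_bool in_int1_handler[MAX_CPUS];

/* Hypothetical hooks: call on handler entry and just before IRETD. */
static void int1_enter(int cpu) { atomic_store(&in_int1_handler[cpu], true); }
static void int1_exit(int cpu)  { atomic_store(&in_int1_handler[cpu], false); }

/*
 * Poll until every CPU other than 'self' has left its int1 handler.
 * Gives up after max_spins additional polls so the caller cannot hang
 * here forever. Returns true if it looks safe to send the NMI.
 */
static bool wait_for_int1_exit(int self, unsigned long max_spins)
{
    for (unsigned long spins = 0; ; spins++) {
        bool busy = false;

        for (int cpu = 0; cpu < MAX_CPUS; cpu++) {
            if (cpu != self && atomic_load(&in_int1_handler[cpu]))
                busy = true;
        }
        if (!busy)
            return true;   /* all other CPUs are out of the handler */
        if (spins >= max_spins)
            return false;  /* timed out; caller decides what to do */
    }
}
```

In the debugger, a check like this would run immediately before the call
to send_IPI_allbutself(APIC_DM_NMI). The timeout is a design choice of the
sketch, so that a processor stuck in the handler cannot turn the
workaround into a second hang.
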
This problem seems specific to my Acer 9410 laptop and, as you described,
appears to be hardware related, though I am going to attempt to implement a
workaround for it anyway.
Jeff