linux-kernel - Re: [PATCH 10/10] Use safe_apic_wait_icr_idle in __send_IPI_dest_field

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <1177505556.19745.16.camel@sebastian.intellilink.co.jp>
Date:	Wed, 25 Apr 2007 21:52:36 +0900
From:	Fernando Luis Vázquez Cao 
	<fernando@....ntt.co.jp>
To:	Andi Kleen <ak@...e.de>
Cc:	"Eric W. Biederman" <ebiederm@...ssion.com>, horms@...ge.net.au,
	kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
	vgoyal@...ibm.com, mbligh@...gle.com,
	Keith Owens <kaos@....com.au>, akpm@...ux-foundation.org
Subject: Re: [PATCH 10/10] Use safe_apic_wait_icr_idle in
	__send_IPI_dest_field - x86_64

On Wed, 2007-04-25 at 14:33 +0200, Andi Kleen wrote:
> On Wednesday 25 April 2007 13:51:12 Fernando Luis Vázquez Cao wrote:
> > Use safe_apic_wait_icr_idle to check ICR idle bit if the vector is
> > NMI_VECTOR to avoid potential hangups in the event of crash when kdump
> > tries to stop the other CPUs.
> 
> But what happens then when this fails? Won't this give another hang?
> Have you tested this?
In kdump the crashing CPU (i.e. the CPU that called crash_kexec) is the
one in charge of rebooting into and executing the dump capture kernel.
But before doing this it attempts to stop the other CPUs sending a IPI
using NMI_VECTOR as the vector. The problem is that sometimes delivery
seems to fail and the crashing CPU gets stuck waiting for the ICR status
bit to be cleared, which will never happen. 

With this patch, when safe_apic_wait_icr_idle times out the CPU will
continue executing and try to hand over control to the dump capture
kernel as usual. After applying this patch I have not seen hangs in the
reboot path to second kernel showing the symptoms mentioned before, but
perhaps I am just being lucky and there is something else going on.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/