linux-kernel - Re: [PATCH v4 0/3] x86, apic, kexec: Add disable_cpu

Message-ID: <527C64E2.4000501@jp.fujitsu.com>
Date:	Fri, 08 Nov 2013 13:13:22 +0900
From:	HATAYAMA Daisuke <d.hatayama@...fujitsu.com>
To:	Baoquan He <bhe@...hat.com>
CC:	hpa@...ux.intel.com, ebiederm@...ssion.com, vgoyal@...hat.com,
	kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
	bp@...en8.de, akpm@...ux-foundation.org, fengguang.wu@...el.com,
	jingbai.ma@...com
Subject: Re: [PATCH v4 0/3] x86, apic, kexec: Add disable_cpu_apic kernel
 parameter

(2013/11/08 12:30), Baoquan He wrote:
> Hi,
>
> Recently people reported that kexec didn't work correctly. After
> checking, it's a regression. It is related to a code block which
> migrates the current thread to cpu0 when executing "kexec -e", and it
> can be reproduced by setting affinity to CPUn (n != 0). You can find
> the patch at this link:
> https://lkml.org/lkml/2013/11/5/88
>
> Then I wondered why we don't do this in kdump. I tried migrating the
> current thread to cpu0 when a crash happens, and it works very well.
> With affinity set so that the crash happens on CPUn (n != 0), all cpus
> can be brought up and the dump is successful. I pasted the patch below.
>
> Only one thing worries me: whether the context related to the crashing
> cpu will be different, and whether we care which cpu crashed. If we
> don't need to care, or there is no difference, that will be great.
> Multiple CPUs can be supported easily in this simpler way. Meanwhile,
> this patch just tries to migrate; if that fails, we can avoid bringing
> up the bsp.
>
> What do you think about it?
>

We have already discussed this idea. It was the idea behind my first
patch, and it was nacked. See the following URL. (Sorry, I removed the
explanation of the development history from the patch description in v4,
but I plan to describe which ideas do not work well in the documentation
of this work.)

https://lkml.org/lkml/2012/4/15/181

The key reason why we cannot do that is that the environment we are
running in must be considered broken. Interrupts or the scheduler may no
longer work. Interrupt tables can be corrupted. The cpus other than the
crashing cpu are no longer guaranteed to be running sanely. Migrating
the current thread from the crashing cpu to cpu0 therefore reduces the
reliability of kdump.

-- 
Thanks.
HATAYAMA, Daisuke
