Message-ID: <20131018173625.GE2277@redhat.com>
Date: Fri, 18 Oct 2013 13:36:25 -0400
From: Vivek Goyal <vgoyal@...hat.com>
To: HATAYAMA Daisuke <d.hatayama@...fujitsu.com>
Cc: hpa@...ux.intel.com, ebiederm@...ssion.com,
kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
bp@...en8.de, akpm@...ux-foundation.org, fengguang.wu@...el.com,
jingbai.ma@...com
Subject: Re: [PATCH v2 2/2] x86, apic: Disable BSP if boot cpu is AP

On Wed, Oct 16, 2013 at 10:26:44AM +0900, HATAYAMA Daisuke wrote:
[..]
> >I am wondering if there is any attribute of cpu which we can pass to
> >second kernel on command line. And tell second kernel not to bring up
> >that specific cpu. (Say exclude_cpu=<cpu_attr>)? If this works, then
> >if ACPI or other mechanism don't report BSP, we could possibly assume
> >that cpu 0 is BSP and ask second kernel to not try to boot it.
> >
>
> I've come up with similar idea. If there's such kernel option, rest of
> the processing can be implemented in user-land, i.e., get apicid of
> BSP from /proc/cpuid and set it in kernel command line of 2nd kernel.
> What kexec-tools should do on fedora/RHEL? Also, this idea covers SFI
> and device tree.
>
> The reason why I didn't choose such idea was first passing the value
> via command-line seems rather ad-hoc.

We already do many things via the command line, so telling the kernel not
to boot certain cpus seems ok to me.

> The second reason is that in any
> case it's compromised design. Rigorously, we cannot get correct mapping
> of apicid to {BSP, APIC} at the 1st kernel. That is, there's a class of
> the bugs that affect BSP flag of each processor. For example, on
> catastrophic state, all the cpus can have BSP flag on the 2nd kernel due
> to wrmsr instructions generated by the bug causing crash. In this sense,
> current implementation is less reliable than max_cpus=1 case.
>
> If addressing this rigorously, for example, we need to check status of
> BSP flag between 1st kernel and 2nd kernel to keep processor with BSP
> flag unique, exclude cpus in catastrophic state that are not checked,
> and to tell the 2nd kernel which cpu can be wake up.

Ok, for the time being let us not do any comparison with maxcpus=1 or
nr_cpus=1, because we know that's the most robust thing to do.

For the case where we want to bring up more than one cpu in the second
kernel, there seem to be two problems.

- ACPI tables or other tables might not report which cpu is the BSP. In
  that case we might try to bring up the BSP and crash the system.

- Due to a malicious wrmsr, more than one cpu might claim to be the BSP.
  In that case the cpu we are crashing on will think it is the BSP and
  that it can safely bring up other cpus.

If we start sending a mask of cpus which should not be brought up in the
second kernel, then it would not matter whether the BSP flag in the MSR is
set or not, would it? And that will solve the second issue.
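
To make the point concrete, here is a rough sketch of that decision, in
Python only for brevity (the real check would sit in the second kernel's
cpu bring-up path). The function name and the idea of passing plain lists
of APIC IDs are made up for illustration; nothing like this exists today.

```python
def cpus_to_boot(present_apicids, excluded_apicids, boot_apicid):
    """Return the APIC IDs the second kernel would try to wake up.

    Hypothetical helper: the decision uses only the exclusion list handed
    over by the first kernel / user space. The BSP flag in the MSR is never
    consulted, so a corrupted flag cannot change the result.
    """
    excluded = set(excluded_apicids)
    return [
        apicid
        for apicid in present_apicids
        # The cpu we are already running on does not need to be woken up.
        if apicid != boot_apicid and apicid not in excluded
    ]


if __name__ == "__main__":
    # Four cpus present, APIC ID 0 excluded, crash happened on APIC ID 2.
    print(cpus_to_boot([0, 1, 2, 3], [0], 2))
```

Whatever the BSP flags look like after the crash, the excluded cpu is
simply never touched.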

And if ACPI tables don't report which one is the BSP, user space can first
try to look at the BSP flags of the processors (maybe this can be reported
in /proc/cpuinfo?) and if no cpu has the BSP flag set, then assume cpu 0
is the BSP.
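
The user space side could look roughly like the sketch below. Big
assumptions here: /proc/cpuinfo does not report a BSP flag today, so the
"bsp" field is hypothetical, and exclude_cpu= is only the name floated
earlier in this thread, not an existing kernel parameter. The
comma-separated value format is also just a guess. The "processor" and
"apicid" fields are the ones x86 already exports.

```python
def parse_cpuinfo(text):
    """Split /proc/cpuinfo-style text into one dict of fields per cpu."""
    cpus = []
    for block in text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" in fields:
            cpus.append(fields)
    return cpus


def bsp_apicids(cpus, bsp_field="bsp"):
    """Return the APIC IDs that should not be brought up.

    bsp_field is hypothetical. If no cpu carries it, fall back to
    assuming that cpu 0 is the BSP. Listing every flagged cpu when
    several are flagged is this sketch's own choice.
    """
    flagged = [c for c in cpus if c.get(bsp_field) == "yes"]
    if not flagged:
        flagged = [c for c in cpus if c["processor"] == "0"]
    return [c["apicid"] for c in flagged if "apicid" in c]


def exclude_param(cpus):
    """Build the (hypothetical) exclude_cpu= option for the second kernel."""
    return "exclude_cpu=" + ",".join(bsp_apicids(cpus))


if __name__ == "__main__":
    with open("/proc/cpuinfo") as f:
        print(exclude_param(parse_cpuinfo(f.read())))
```

kexec-tools (or the distribution scripts around it) would then append the
resulting string to the command line of the second kernel.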

So to me it looks like passing the list of cpus not to bring up to the
second kernel is the more resilient approach. Isn't it?

Thanks
Vivek