[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <57F3D254.20806@redhat.com>
Date: Tue, 04 Oct 2016 12:01:24 -0400
From: Prarit Bhargava <prarit@...hat.com>
To: Thomas Gleixner <tglx@...utronix.de>
CC: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
Peter Zijlstra <peterz@...radead.org>,
Len Brown <len.brown@...el.com>, Borislav Petkov <bp@...e.de>,
Andi Kleen <ak@...ux.intel.com>, Jiri Olsa <jolsa@...hat.com>,
Juergen Gross <jgross@...e.com>, dyoung@...hat.com,
Eric Biederman <ebiederm@...ssion.com>,
kexec@...ts.infradead.org
Subject: Re: [PATCH] arch/x86: Fix kdump on x86 with physically hotadded CPUs
On 10/04/2016 10:38 AM, Thomas Gleixner wrote:
> On Tue, 4 Oct 2016, Prarit Bhargava wrote:
>> On 10/04/2016 06:58 AM, Thomas Gleixner wrote:
>>> While it is the right thing to initialize the package map in that case, it
>>> still papers over a robustness issue in the uncore code, which needs to be
>>> fixed first.
>>
>> I will include a separate patch with an error check for pkg == 0xffff in the
>> uncore code.
>
> 0xffff? That won't help. The id returned is -1 if the entry is not
> initialized. And aside of that just patching that particular place is not
> helping as the uncore code and also rapl is relying on the package map
> being populated.
Yes, I noticed that after I started digging into it this morning. Not only what
you pointed out but there's init that occurs in the uncore code that would have
to be undone.
There is a similar issue in the rapl code, but that code inadvertently protects
itself with for loops that end up never running (and that's why the rapl code
doesn't panic).
>
> So we need a sanity check in the initialization code which prevents any of
> this being executed.
Ok, should this be done only for logical_proc_id or for logical_proc_id,
phys_proc_id, and cpu_core_id? What do you think of adding that to the end of
smp_init_package_map() or smp_store_cpu_info()?
>
>>>> + if (!num_processors) {
>>>> + pr_warn("CPU 0 not enumerated in mptable or ACPI MADT\n");
>>>> + num_processors = 1;
>>>
>>> And in this case we end up with the same problem, right?
>>
>> It occurs to me that I over thought this: I was thinking that there might exist
>> a pre-ACPI (or at least a system without an MADT) x86 system that wold boot such
>> that num_processors = 0. But in that case, the cpu should be listed in the
>> mptables so the above should not happen. I'll change that to a BUG().
>
> No. That's the wrong thing to do. Think SMP kernel on UP machines ...
>
Sorry Thomas, but my history with real UP hardware is limited. I think you
might be saying I should call generic_processor_info(0, apic_version[0]) to
populate cpu 0 but I'm not sure.
P.
Powered by blists - more mailing lists