[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <24513679-d92d-ead1-2c1d-98db6a9bbdac@oracle.com>
Date: Tue, 31 May 2022 17:25:32 -0500
From: Eric DeVolder <eric.devolder@...cle.com>
To: David Hildenbrand <david@...hat.com>, linux-kernel@...r.kernel.org,
x86@...nel.org, kexec@...ts.infradead.org, ebiederm@...ssion.com,
dyoung@...hat.com, bhe@...hat.com, vgoyal@...hat.com
Cc: tglx@...utronix.de, mingo@...hat.com, bp@...en8.de,
dave.hansen@...ux.intel.com, hpa@...or.com,
nramas@...ux.microsoft.com, thomas.lendacky@....com,
robh@...nel.org, efault@....de, rppt@...nel.org,
konrad.wilk@...cle.com, boris.ostrovsky@...cle.com
Subject: Re: [PATCH v8 3/7] crash: add generic infrastructure for crash
hotplug support
On 5/31/22 08:15, David Hildenbrand wrote:
> On 12.05.22 18:10, Eric DeVolder wrote:
>> David,
>> Great questions! See inline responses below.
>> eric
>
> Sorry for the late reply, travel and vacation ...
No problem, greatly appreciate the feedback!
eric
>
>>>
>>>> +
>>>> +#if defined(CONFIG_HOTPLUG_CPU) || defined(CONFIG_MEMORY_HOTPLUG)
>>>> +void __weak arch_crash_handle_hotplug_event(struct kimage *image,
>>>> + unsigned int hp_action, unsigned int cpu)
>>>> +{
>>>> + WARN(1, "crash hotplug handler not implemented");
>>>
>>>
>>> Won't that trigger on any arch that has CONFIG_HOTPLUG_CPU and CONFIG_MEMORY_HOTPLUG?
>>> I mean, you only implement it for x86 later in this series. Or what else stops this WARN from
>>> triggering?
>>>
>> You're correct. What about: printk_once(KERN_DEBUG "...") ?
>
> Why even bother about printing anything? If the feature is not
> supported, there should be some way for user space to figure out that it
> sill has to reload on hot(un)plug manually, no?
I've changed this to WARN_ONCE(). If that isn't agreeable, I'll remove it.
>
>
> [...]
>
>>
>>>
>>> 2. Why can't the unprotect+reprotect not be done inside
>>> arch_crash_handle_hotplug_event() ? It's all arch specific either way.
>>>
>>> IMHO, this code here should be as simple as
>>>
>>> if (kexec_crash_image)
>>> arch_crash_handle_hotplug_event(kexec_crash_image, hp_action, cpu);
>>>
>>
>> The intent of this code was to be generic infrastructure. Just invoking the
>> arch_crash_handle_hotplug_event() would certainly be as generic as it gets.
>> But there were a series of steps that seemed to be common, so those I hoisted
>> into this bit of code.
>
> But most common parts are actually arch_* calls already? :)
>
> Anyhow, no strong opinion.
For the time being, I'd like to leave as is. Let's see if the detection of the elfcorehdr
segment gets moved to this code too... (discussion thread on "x86/crash: Add x86 crash hotplug
support for kexec_load".
>
>>
>>> 3. Why do we have to forward the CPU for CPU onlining/offlining but not the
>>> memory block id (or similar) when onlining/offlining a memory block?
>> From patch "kexec: exclude hot remove cpu from elfcorehdr notes" commit message:
>>
>> Due to use of CPUHP_AP_ONLINE_DYN, upon CPU unplug, the CPU is
>> still in the for_each_present_cpu() list when within the
>> handle_hotplug_event(). Thus the CPU must be explicitly excluded
>> when building the new list of CPUs.
>>
>> This change identifies in handle_hotplug_event() the CPU to be
>> excluded, and the check for excluding the CPU in
>> crash_prepare_elf64_headers().
>>
>> If there is a better CPUHP_ to use than _DYN, I'd be all for that!
>
> Ah okay, thanks.
>
Powered by blists - more mailing lists