[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <7b25449b-72e9-3eca-73a9-592d7400b746@oracle.com>
Date: Tue, 18 Apr 2023 08:55:51 -0500
From: Eric DeVolder <eric.devolder@...cle.com>
To: Baoquan He <bhe@...hat.com>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
kexec@...ts.infradead.org, ebiederm@...ssion.com,
dyoung@...hat.com, vgoyal@...hat.com, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
hpa@...or.com, nramas@...ux.microsoft.com, thomas.lendacky@....com,
robh@...nel.org, efault@....de, rppt@...nel.org, david@...hat.com,
sourabhjain@...ux.ibm.com, konrad.wilk@...cle.com,
boris.ostrovsky@...cle.com
Subject: Re: [PATCH v21 2/7] crash: add generic infrastructure for crash
hotplug support
On 4/6/23 18:58, Baoquan He wrote:
> On 04/06/23 at 11:10am, Eric DeVolder wrote:
>>
>>
>> On 4/6/23 06:04, Baoquan He wrote:
>>> On 04/04/23 at 02:03pm, Eric DeVolder wrote:
>>> ......
>>>> +static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
>>>> +{
>>>> + struct kimage *image;
>>>> +
>>>> + /* Obtain lock while changing crash information */
>>>> + if (!kexec_trylock()) {
>>>> + pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
>>>> + return;
>>>> + }
>>>> +
>>>> + /* Check kdump is not loaded */
>>>> + if (!kexec_crash_image)
>>>> + goto out;
>>>> +
>>>> + image = kexec_crash_image;
>>>> +
>>>> + if (hp_action == KEXEC_CRASH_HP_ADD_CPU ||
>>>> + hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
>>>> + pr_debug("hp_action %u, cpu %u\n", hp_action, cpu);
>>>> + else
>>>> + pr_debug("hp_action %u\n", hp_action);
>>>
>>> Seems we passed in the cpu number just for printing here. Wondering why
>>> we don't print out hot added/removed memory ranges. Is the cpu number
>>> printing necessary?
>>>
>> Baoquan,
>>
>> Ah, actually until recently it was used to track the 'offlinecpu' in this
>> function, but tglx pointed out that was un-necessary. That resulted in
>> dropping the code in this function dealing with offlinecpu, leaving this as
>> its only use in this function.
>>
>> The printing of cpu number is not necessary, but helpful; I use it for debugging.
>
> OK, I see. I am not requesting memory range printing, just try to prove
> cpu number printing is not so justified. If it's helpful, I am OK with
> it. Let's see if other people have concern about this.
>
I do not plan on adding the memory range printing.
>>
>> The printing of memory range is also not necessary, but in order to do that,
>> should we choose to do so, requires passing in the memory range to this
>> function. This patch series did do this early on, and by v7 I dropped it at
>> your urging (https://lore.kernel.org/lkml/20220401183040.1624-1-eric.devolder@oracle.com/).
>> At the time, I provided it since I considered this generic infrastructure,
>> but I could not defend it since x86 didn't need it. However, PPC now needs
>> this, and is now carrying this as part of PPC support of CRASH_HOTPLUG (https://lore.kernel.org/linuxppc-dev/20230312181154.278900-6-sourabhjain@linux.ibm.com/T/#u).
>>
>> If you'd rather I pickup the memory range handling again, I can do that. I
>> think I'd likely change this function to be:
>>
>> void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu,
>> struct memory_notify *mhp);
>>
>> where on a CPU op the 'cpu' parameter would be valid and 'mhp' NULL, and on a memory op,
>> the 'mhp' would be valid and 'cpu' parameter invalid(0).
>>
>> I'd likely then stuff these two parameters into struct kimage so that it can
>> be utilized by arch-specific handler, if needed.
>>
>> And of course, would print out the memory range for debug purposes.
>>
>> Let me know what you think.
>
I do not plan on adding the memory range handling; I'll let Sourabh do that as he has a use case for it.
As such, I don't see any other request for changes.
Thanks!
eric
Powered by blists - more mailing lists