linux-kernel - Re: [PATCH v21 5/7] x86/crash: add x86 crash hotplug support

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <fff8c92a-7e82-e917-8925-584664543b90@linux.ibm.com>
Date:   Tue, 2 May 2023 15:06:29 +0530
From:   Hari Bathini <hbathini@...ux.ibm.com>
To:     Eric DeVolder <eric.devolder@...cle.com>,
        Baoquan He <bhe@...hat.com>
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        kexec@...ts.infradead.org, ebiederm@...ssion.com,
        dyoung@...hat.com, vgoyal@...hat.com, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        hpa@...or.com, nramas@...ux.microsoft.com, thomas.lendacky@....com,
        robh@...nel.org, efault@....de, rppt@...nel.org, david@...hat.com,
        sourabhjain@...ux.ibm.com, konrad.wilk@...cle.com,
        boris.ostrovsky@...cle.com
Subject: Re: [PATCH v21 5/7] x86/crash: add x86 crash hotplug support



On 02/05/23 12:03 am, Eric DeVolder wrote:
> 
> 
> On 4/28/23 13:31, Hari Bathini wrote:
>>
>> On 28/04/23 2:55 pm, Baoquan He wrote:
>>> On 04/27/23 at 10:26pm, Hari Bathini wrote:
>>>> On 27/04/23 2:19 pm, Baoquan He wrote:
>>>>> On 04/27/23 at 12:39pm, Hari Bathini wrote:
>>>>>> Hi Eric,
>>>>>>
>>>>>> On 04/04/23 11:33 pm, Eric DeVolder wrote:
>>>>>>> When CPU or memory is hot un/plugged, or off/onlined, the crash
>>>>>>> elfcorehdr, which describes the CPUs and memory in the system,
>>>>>>> must also be updated.
>>>>>>>
>>>>>>> The segment containing the elfcorehdr is identified at run-time
>>>>>>> in crash_core:crash_handle_hotplug_event(), which works for both
>>>>>>> the kexec_load() and kexec_file_load() syscalls. A new elfcorehdr
>>>>>>> is generated from the available CPUs and memory into a buffer,
>>>>>>> and then installed over the top of the existing elfcorehdr.
>>>>>>>
>>>>>>> In the patch 'kexec: exclude elfcorehdr from the segment digest'
>>>>>>> the need to update purgatory due to the change in elfcorehdr was
>>>>>>> eliminated.  As a result, no changes to purgatory or boot_params
>>>>>>> (as the elfcorehdr= kernel command line parameter pointer
>>>>>>> remains unchanged and correct) are needed, just elfcorehdr.
>>>>>>>
>>>>>>> To accommodate a growing number of resources via hotplug, the
>>>>>>> elfcorehdr segment must be sufficiently large enough to accommodate
>>>>>>> changes, see the CRASH_MAX_MEMORY_RANGES description. This is used
>>>>>>> only on the kexec_file_load() syscall; for kexec_load() userspace
>>>>>>> will need to size the segment similarly.
>>>>>>>
>>>>>>> To accommodate kexec_load() syscall in the absence of
>>>>>>
>>>>>> Firstly, thanks! This series is a nice improvement to kdump support
>>>>>> in hotplug environment.
> Thank you!
> 
>>>>>>
>>>>>> One concern though is that this change assumes corresponding support
>>>>>> in kexec-tools. Without that support kexec_load would fail to boot
>>>>>> with digest verification failure, iiuc.
> 
> Yes, you've correctly identified that if a hotplug change occurs 
> following kexec_load
> (made with kexec-tools unaltered for hotplug), then a subsequent panic 
> would in fact
> fail the purgatory digest verification, and  kdump would not happen.
> 
>>>>>
>>>>> Eric has posted patchset to modify kexec_tools to support that, please
>>>>> see the link Eric pasted in the cover letter.
>>>>>
>>>>> http://lists.infradead.org/pipermail/kexec/2022-October/026032.html
>>>>
>>>> Right, Baoquan.
>>>>
>>>> I did see that and if I read the code correctly, without that patchset
>>>> kexec_load would fail. Not with an explicit error that hotplug support
>>>> is missing or such but it would simply fail to boot into capture kernel
>>>> with digest verification failure.
> This is correct.
> 
>>>>
>>>> My suggestion was to avoid that userspace tool breakage for older
>>>> kexec-tools version by introducing a new kexec flag that can tell
>>>> kernel that kexec-tools is ready to use this in-kernel update support.
>>>> So, if kexec_load happens without the flag, avoid doing an in-kernel
>>>> update on hotplug. I hope that clears the confusion.
>>>
>>> Yeah, sounds like a good idea. It may be extended in later patch.
>>
>> Fixing it in this series itself would be a cleaner way, I guess.
> 
> You're suggestion of using a flag makes alot of sense; it is an indication
> to the kernel that it is valid/okay to modify the kexec_load elfcorehdr.
> Only kexec-tools that understands this (meaning the elfcorehdr buffer is
> appropriately sized *and* excludes the elfcorehdr from the purgatory check)
> would set that flag.
> 
> The roll-out of this feature needs to be coordinated, no doubt. There 
> are three
> pieces to this puzzle: this kernel series, the udev rule changes, and 
> the changes
> to kexec-tools for kexec_load.
> 
> I consider the udev rule changes critical to making this feature work 
> efficiently.
> I also think that deploying the udev rules immediately is doable since 
> nothing
> references them, yet; they would be NOPs. And they would be in place 
> when the
> kernel and/or kexec-tool changes deploy.
> 
> However, your point about supporting kexec_load with and without this 
> new flag
> means the sysfs nodes upon which the udev rule change rely need to be a bit
> smarter now. (I'm assuming these udev rules will be generally accepted 
> as-is,
> as they are simple and efficient.)
> 

> The sysfs crash_hotplug nodes need to take into account kexec_file_load vs
> (kexec_load && new_flag). Generally speaking these crash_hotplug sysfs 
> nodes we
> want to be 1 going forward, but where kexec_load/kexec-tools is older 
> and/or no new_flag,
> it needs to be 0. In this way the udev rules can remain as proposed and 
> work properly
> for kexec_file_load and both flavors of kexec_load.

Right. That is the tricky part. kdump scripts and kexec-tools have to
be in sync if udev rules have to just rely on crash_hotplug.

Thanks
Hari