[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <DM4PR11MB6020E10A34536D3A3216248CA0182@DM4PR11MB6020.namprd11.prod.outlook.com>
Date: Tue, 14 Jan 2025 03:26:19 +0000
From: "Chen, Yu C" <yu.c.chen@...el.com>
To: 'Dave Young' <dyoung@...hat.com>
CC: "x86@...nel.org" <x86@...nel.org>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "kexec@...ts.infradead.org"
<kexec@...ts.infradead.org>, Ingo Molnar <mingo@...hat.com>, Borislav Petkov
<bp@...en8.de>, "Rafael J . Wysocki" <rafael@...nel.org>, Pavel Machek
<pavel@....cz>
Subject: RE: [PATCH] x86/e820: update code comment about e820_table_kexec
Hi Dave,
Thanks for bringing this up,
> -----Original Message-----
> From: Dave Young <dyoung@...hat.com>
> Sent: Friday, January 10, 2025 2:30 PM
> To: Chen, Yu C <yu.c.chen@...el.com>
> Cc: x86@...nel.org; linux-kernel@...r.kernel.org; kexec@...ts.infradead.org;
> Ingo Molnar <mingo@...hat.com>; Borislav Petkov <bp@...en8.de>
> Subject: Re: [PATCH] x86/e820: update code comment about
> e820_table_kexec
>
> > BTW, the conversation below drived me to read the e820 code:
> > https://lore.kernel.org/all/CAMj1kXG1hbiafKRyC5qM1Vj5X7x-
> dmLndqqo2AYnH
> > MRxDz-80w@...l.gmail.com/T/#u
> >
> > It could be better to clean up the e820 tables, anyway the comment fix
> > in this patch itself is good for now.
> >
> > Basically e820_table_firmware is used by kexec-tools kexec_load
> > implementation, e820_table_kexec is used by kexec_file_load code to
> > pass to the 2nd kernel in boot params.
> >
> > The e820_table_firmware is said to not be modified by the kernel in
> > the code comment, but this is not true, at least the sev code updates
> > the table. The hibernate code generates crc32 checksum and verifies
> > it, but since AFAIK e820 table update only happens in boot phase, it
> > will be stable on runtime. So we can just use e820_table_firmware for
> > kexec use and drop the e820_table_kexec. With the change the
> > kexec_file_load and kexec_load see the same memory ranges.
> >
> > Otherwise I thought we can use just one e820 table, dropping
> > e820_table_firmware and e820_table_kexec, but then there will be
> > fragments and memory waste due to the setup_data ranges are reserved
> > and updated in e820_table, so the e820_table_firmware being still be
> > used for kexec makes sense.
> >
> > Anyway I need to think more about it, please let me know if you have
> > any concerns.
>
> Reread the old commit 12df216c61c89e31e27e74146115a9728880ca6f
> x86/boot/e820: Introduce the bootloader provided e820_table_firmware[]
> table
>
> It seems the e820_table_firmware was changed to be exported in sysfs
> instead of e820_table_kexec, but I suspect this is wrong.
> kexec-tools (kexec_load) use the exported sysfs memmap info, but
> e820_table_firmware was created by above commit to be used by hibernation
> and the table should not be changed, the fact is there are changes happen
> from time to time.
>
This is not expected from the original introducing of e820_table_firmware, it
should not be changed by OS. I suppose the changes to e820_table_firmware
are because of the kexec requirements for /sys/firmware/memmap?
Another question is that, why does kexec_load() get the memory layout
from /sys/firmware/memmap, but kexec_file_load() relies on the in-kernel
e820_table_kexec?
> Question is does hibernation use the sysfs entries from its userspace tools?
Hibernation does not use this sysfs entries in userspace(or uswsusp )as far
as I know.
> If yes, then we should have both kexec table and firmware table exported
> because they are for different purposes and one may change, another not.
>
> If hibernation only uses the table within the kernel then it makes no sense to
> export it to sysfs, we should only export the kexec table for kexec-tools use.
> In this way both kexec_load (userspace) and kexec_file_load (kernel load) can
> use same e820 table, it will reduce the bugs and be easier to maintain.
I'm OK with not exposing e820_table_firmware to /sys/firmware/memmap, if
kexec is the only user of /sys/firmware/memmap.
Thanks,
Chenyu
Powered by blists - more mailing lists