[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZxcYGIHjFhppy+oA@yzhao56-desk.sh.intel.com>
Date: Tue, 22 Oct 2024 11:12:24 +0800
From: Yan Zhao <yan.y.zhao@...el.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>
CC: <kexec@...ts.infradead.org>, <linux-kernel@...r.kernel.org>,
<linux-coco@...ts.linux.dev>, <x86@...nel.org>, <rick.p.edgecombe@...el.com>,
<kirill.shutemov@...ux.intel.com>
Subject: Re: [PATCH] kexec_core: Accept unaccepted kexec destination addresses
On Mon, Oct 21, 2024 at 09:33:17AM -0500, Eric W. Biederman wrote:
> Yan Zhao <yan.y.zhao@...el.com> writes:
>
> > The kexec destination addresses (incluing those for purgatory, the new
> > kernel, boot params/cmdline, and initrd) are searched from the free area of
> > memblock or RAM resources. Since they are not allocated by the currently
> > running kernel, it is not guaranteed that they are accepted before
> > relocating the new kernel.
> >
> > Accept the destination addresses for the new kernel, as the new kernel may
> > not be able to or may not accept them by itself.
> >
> > Place the "accept" code immediately after the destination addresses pass
> > sanity checks, so the code can be shared by both users of the kexec_load
> > and kexec_file_load system calls.
>
> I am not at all certain this is sufficient, and I am a bit flummoxed
> about the need to ever ``accept'' memory lazily.
>
> In a past life I wrote bootup firmware, and as part of that was the code
> to initialize the contents of memory. When properly tuned and setup it
> would never take more than a second to just blast initial values into
> memory. That is because the ratio of memory per memory controller to
> memory bandwidth stayed roughly constant while I was paying attention.
> I expect that ratio to continue staying roughly constant or systems
> will quickly start developing unacceptable boot times.
>
> As I recall Intel TDX is where the contents of memory are encrypted per
> virtual machine. Which implies that you have the same challenge as
> bootup initializing memory, and that is what ``accepting'' memory is.
Yes, the kernel actually will accept initial memory used by itself in
extract_kernel(), as in arch/x86/boot/compressed/misc.c.
But the target kernel may not be able to accept memory for purgatory.
And it's currently does not accept memory for boot params/cmdline,
and initrd .
>
> I am concerned that an unfiltered accept_memory may result in memory
> that has already been ``accepted'' being accepted again. This has
> the potential to be wasteful in the best case, and the potential to
> cause memory that is in use to be reinitialized losing the values
> that are currently stored there.
accept_memory() will not accept memory that has already been accepted.
An unaccepted->bitmap is maintained and queried before accepting.
(this is at least the implementation in
drivers/firmware/efi/unaccepted_memory.c)
If it's still a concern to you, is it better to add a check like this?
if (range_contains_unaccepted_memory(mstart, size))
accept_memory(mstart, size);
>
> I am concerned that the target kernel won't know about about accepting
> memory, or might not perform the work early enough and try to use memory
> without accepting it first.
The target kernel does accept memory before use it. But not including those
in kexec segments for purgatory, boot params/cmdline, and initrd.
> I would much prefer if getting into kexec_load would force the memory
> acceptance out of lazy mode (or possibly not even work in lazy mode).
> That keeps things simple for now.
>
> Once enough people have machines requiring the use of accept_memory
> we can worry about optimizing things and pushing the accept_memory call
> down into kexec_load.
>
>
> Ugh. I just noticed another issue. Unless the memory we are talking
> about is the memory reserved for kexec on panic kernels the memory needs
> struct pages and everything setup so it can be allocated from anyway.
>
> Which is to say I think this is has the potential to conflict with
> the accounting in try_to_accept_memory.
Then could we put the accept into machine_kexec(), given that accept_memory()
will not fail?
>
> Please just make memory acceptance ``eager'' non-lazy when using kexec.
> Unless someone has messed their implementation badly it won't be a
> significant amount of time in human terms, and it makes the code
> so much easier to understand and think about.
>
yes, it's also an approach if the above cannot convince you.
>
>
> > Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> > Reviewed-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> > Signed-off-by: Yan Zhao <yan.y.zhao@...el.com>
> > ---
> > kernel/kexec_core.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> > index c0caa14880c3..d97376eafc1a 100644
> > --- a/kernel/kexec_core.c
> > +++ b/kernel/kexec_core.c
> > @@ -210,6 +210,16 @@ int sanity_check_segment_list(struct kimage *image)
> > }
> > #endif
> >
> > + /*
> > + * The destination addresses are searched from free memory ranges rather
> > + * than being allocated from the current kernel, so they are not
> > + * guaranteed to be accepted by the current kernel.
> > + * Accept those initial pages for the new kernel since it may not be
> > + * able to accept them by itself.
> > + */
> > + for (i = 0; i < nr_segments; i++)
> > + accept_memory(image->segment[i].mem, image->segment[i].memsz);
> > +
> > return 0;
> > }
> >
> > base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b
Powered by blists - more mailing lists