[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <87frop8r0y.fsf@email.froward.int.ebiederm.org>
Date: Mon, 21 Oct 2024 09:33:17 -0500
From: "Eric W. Biederman" <ebiederm@...ssion.com>
To: Yan Zhao <yan.y.zhao@...el.com>
Cc: kexec@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-coco@...ts.linux.dev, x86@...nel.org, rick.p.edgecombe@...el.com,
kirill.shutemov@...ux.intel.com
Subject: Re: [PATCH] kexec_core: Accept unaccepted kexec destination addresses
Yan Zhao <yan.y.zhao@...el.com> writes:
> The kexec destination addresses (incluing those for purgatory, the new
> kernel, boot params/cmdline, and initrd) are searched from the free area of
> memblock or RAM resources. Since they are not allocated by the currently
> running kernel, it is not guaranteed that they are accepted before
> relocating the new kernel.
>
> Accept the destination addresses for the new kernel, as the new kernel may
> not be able to or may not accept them by itself.
>
> Place the "accept" code immediately after the destination addresses pass
> sanity checks, so the code can be shared by both users of the kexec_load
> and kexec_file_load system calls.
I am not at all certain this is sufficient, and I am a bit flummoxed
about the need to ever ``accept'' memory lazily.
In a past life I wrote bootup firmware, and as part of that was the code
to initialize the contents of memory. When properly tuned and setup it
would never take more than a second to just blast initial values into
memory. That is because the ratio of memory per memory controller to
memory bandwidth stayed roughly constant while I was paying attention.
I expect that ratio to continue staying roughly constant or systems
will quickly start developing unacceptable boot times.
As I recall Intel TDX is where the contents of memory are encrypted per
virtual machine. Which implies that you have the same challenge as
bootup initializing memory, and that is what ``accepting'' memory is.
I am concerned that an unfiltered accept_memory may result in memory
that has already been ``accepted'' being accepted again. This has
the potential to be wasteful in the best case, and the potential to
cause memory that is in use to be reinitialized losing the values
that are currently stored there.
I am concerned that the target kernel won't know about about accepting
memory, or might not perform the work early enough and try to use memory
without accepting it first.
I would much prefer if getting into kexec_load would force the memory
acceptance out of lazy mode (or possibly not even work in lazy mode).
That keeps things simple for now.
Once enough people have machines requiring the use of accept_memory
we can worry about optimizing things and pushing the accept_memory call
down into kexec_load.
Ugh. I just noticed another issue. Unless the memory we are talking
about is the memory reserved for kexec on panic kernels the memory needs
struct pages and everything setup so it can be allocated from anyway.
Which is to say I think this is has the potential to conflict with
the accounting in try_to_accept_memory.
Please just make memory acceptance ``eager'' non-lazy when using kexec.
Unless someone has messed their implementation badly it won't be a
significant amount of time in human terms, and it makes the code
so much easier to understand and think about.
Eric
> Cc: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
> Signed-off-by: Yan Zhao <yan.y.zhao@...el.com>
> ---
> kernel/kexec_core.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index c0caa14880c3..d97376eafc1a 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -210,6 +210,16 @@ int sanity_check_segment_list(struct kimage *image)
> }
> #endif
>
> + /*
> + * The destination addresses are searched from free memory ranges rather
> + * than being allocated from the current kernel, so they are not
> + * guaranteed to be accepted by the current kernel.
> + * Accept those initial pages for the new kernel since it may not be
> + * able to accept them by itself.
> + */
> + for (i = 0; i < nr_segments; i++)
> + accept_memory(image->segment[i].mem, image->segment[i].memsz);
> +
> return 0;
> }
>
> base-commit: 8cf0b93919e13d1e8d4466eb4080a4c4d9d66d7b
Powered by blists - more mailing lists