Message-ID: <Z-MB0Cj4tM6QgOAg@kernel.org>
Date: Tue, 25 Mar 2025 15:19:44 -0400
From: Mike Rapoport <rppt@...nel.org>
To: Frank van der Linden <fvdl@...gle.com>
Cc: Changyuan Lyu <changyuanl@...gle.com>, linux-kernel@...r.kernel.org,
graf@...zon.com, akpm@...ux-foundation.org, luto@...nel.org,
anthony.yznaga@...cle.com, arnd@...db.de, ashish.kalra@....com,
benh@...nel.crashing.org, bp@...en8.de, catalin.marinas@....com,
dave.hansen@...ux.intel.com, dwmw2@...radead.org,
ebiederm@...ssion.com, mingo@...hat.com, jgowans@...zon.com,
corbet@....net, krzk@...nel.org, mark.rutland@....com,
pbonzini@...hat.com, pasha.tatashin@...een.com, hpa@...or.com,
peterz@...radead.org, ptyadav@...zon.de, robh+dt@...nel.org,
robh@...nel.org, saravanak@...gle.com,
skinsburskii@...ux.microsoft.com, rostedt@...dmis.org,
tglx@...utronix.de, thomas.lendacky@....com,
usama.arif@...edance.com, will@...nel.org,
devicetree@...r.kernel.org, kexec@...ts.infradead.org,
linux-arm-kernel@...ts.infradead.org, linux-doc@...r.kernel.org,
linux-mm@...ck.org, x86@...nel.org
Subject: Re: [PATCH v5 07/16] kexec: add Kexec HandOver (KHO) generation helpers

On Mon, Mar 24, 2025 at 11:40:43AM -0700, Frank van der Linden wrote:
> On Wed, Mar 19, 2025 at 6:56 PM Changyuan Lyu <changyuanl@...gle.com> wrote:
> >
> > From: Alexander Graf <graf@...zon.com>
> >
> > Add the core infrastructure to generate Kexec HandOver metadata. Kexec
> > HandOver is a mechanism that allows Linux to preserve state - arbitrary
> > properties as well as memory locations - across kexec.
> >
> > It does so using 2 concepts:
> >
> > 1) State Tree - Every KHO kexec carries a state tree that describes the
> > state of the system. The state tree is represented as hash-tables.
> > Device drivers can add/remove their data into/from the state tree at
> > system runtime. On kexec, the tree is converted to FDT (flattened
> > device tree).
> >
> > 2) Scratch Regions - CMA regions that we allocate in the first kernel.
> > CMA gives us the guarantee that no handover pages land in those
> > regions, because handover pages must be at a static physical memory
> > location. We use these regions as the place to load future kexec
> > images so that they won't collide with any handover data.
> >
> > Signed-off-by: Alexander Graf <graf@...zon.com>
> > Co-developed-by: Pratyush Yadav <ptyadav@...zon.de>
> > Signed-off-by: Pratyush Yadav <ptyadav@...zon.de>
> > Co-developed-by: Mike Rapoport (Microsoft) <rppt@...nel.org>
> > Signed-off-by: Mike Rapoport (Microsoft) <rppt@...nel.org>
> > Co-developed-by: Changyuan Lyu <changyuanl@...gle.com>
> > Signed-off-by: Changyuan Lyu <changyuanl@...gle.com>
> > ---
> > MAINTAINERS | 2 +-
> > include/linux/kexec_handover.h | 109 +++++
> > kernel/Makefile | 1 +
> > kernel/kexec_handover.c | 865 +++++++++++++++++++++++++++++++++
> > mm/mm_init.c | 8 +
> > 5 files changed, 984 insertions(+), 1 deletion(-)
> > create mode 100644 include/linux/kexec_handover.h
> > create mode 100644 kernel/kexec_handover.c
> [...]
> > diff --git a/mm/mm_init.c b/mm/mm_init.c
> > index 04441c258b05..757659b7a26b 100644
> > --- a/mm/mm_init.c
> > +++ b/mm/mm_init.c
> > @@ -30,6 +30,7 @@
> > #include <linux/crash_dump.h>
> > #include <linux/execmem.h>
> > #include <linux/vmstat.h>
> > +#include <linux/kexec_handover.h>
> > #include "internal.h"
> > #include "slab.h"
> > #include "shuffle.h"
> > @@ -2661,6 +2662,13 @@ void __init mm_core_init(void)
> > report_meminit();
> > kmsan_init_shadow();
> > stack_depot_early_init();
> > +
> > + /*
> > + * KHO memory setup must happen while memblock is still active, but
> > + * as close as possible to buddy initialization
> > + */
> > + kho_memory_init();
> > +
> > mem_init();
> > kmem_cache_init();
> > /*
>
>
> Thanks for the work on this.
>
> Obviously it needs to happen while memblock is still active - but why
> as close as possible to buddy initialization?

One reason is to have all memblock allocations done by then, so that the
scratch area can be autoscaled. Another reason is to keep the memblock
structures small for as long as possible, since memblock_reserve()ing the
preserved memory would inflate them considerably.

And it's overall simpler if memblock only allocates from scratch, rather
than doing some of the early allocations from scratch and some elsewhere
while still making sure they avoid the preserved ranges.

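To illustrate the point about reservations inflating the memblock
structures, here is a small userspace toy model. It is only a sketch and not
the kernel's memblock code: every toy_* name, the fixed array size and the
4 KiB page size are made up for the example. The only property it borrows
from memblock is that reserved ranges live in an array of regions and that
a range adjacent to an existing region is merged into it, so contiguous
reservations stay cheap while scattered ones cost one entry each.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: NOT the kernel's memblock implementation. */
#define TOY_MAX_REGIONS 128
#define TOY_PAGE_SIZE   4096ULL

struct toy_region {
	uint64_t base;
	uint64_t size;
};

struct toy_reserved {
	size_t cnt;
	struct toy_region regions[TOY_MAX_REGIONS];
};

/*
 * Reserve [base, base + size). To keep the sketch short, callers must pass
 * non-overlapping ranges in ascending order, so only the last region has to
 * be checked for a possible merge.
 */
void toy_reserve(struct toy_reserved *r, uint64_t base, uint64_t size)
{
	assert(size > 0);

	if (r->cnt > 0) {
		struct toy_region *last = &r->regions[r->cnt - 1];

		assert(base >= last->base + last->size);

		/* Adjacent to the previous region: grow it, no new entry. */
		if (last->base + last->size == base) {
			last->size += size;
			return;
		}
	}

	/* Not adjacent: the array grows by one entry. */
	assert(r->cnt < TOY_MAX_REGIONS);
	r->regions[r->cnt].base = base;
	r->regions[r->cnt].size = size;
	r->cnt++;
}

/*
 * Reserve n single pages separated by one-page holes, roughly what
 * reserving scattered preserved pages would look like. Returns the
 * resulting number of regions.
 */
size_t toy_reserve_scattered(struct toy_reserved *r, uint64_t start, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_reserve(r, start + (uint64_t)i * 2 * TOY_PAGE_SIZE,
			    TOY_PAGE_SIZE);

	return r->cnt;
}
```

In this model, two back-to-back toy_reserve() calls on adjacent pages leave
a single region, whereas toy_reserve_scattered() with 64 pages leaves 64
regions. The later the preserved ranges are reserved, the shorter the time
any allocator has to carry and walk such a list.
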
> Ordering is always a sticky issue when it comes to doing things during
> boot, of course. In this case, I can see scenarios where code that
> runs a little earlier may want to use some preserved memory. The
Can you elaborate on such scenarios?
> current requirement in the patch set seems to be "after sparse/page
> init", but I'm not sure why it needs to be as close as possible to
> buddy init.
Why would you say that sparse/page init would be a requirement here?
> - Frank
--
Sincerely yours,
Mike.