Message-ID: <CA+CK2bA2qfLF1Mbyvnat+L9+5KAw6LnhYETXVoYcMGJxwTGahg@mail.gmail.com>
Date: Mon, 6 Oct 2025 13:21:20 -0400
From: Pasha Tatashin <pasha.tatashin@...een.com>
To: Pratyush Yadav <pratyush@...nel.org>
Cc: jasonmiu@...gle.com, graf@...zon.com, changyuanl@...gle.com, 
	rppt@...nel.org, dmatlack@...gle.com, rientjes@...gle.com, corbet@....net, 
	rdunlap@...radead.org, ilpo.jarvinen@...ux.intel.com, kanie@...ux.alibaba.com, 
	ojeda@...nel.org, aliceryhl@...gle.com, masahiroy@...nel.org, 
	akpm@...ux-foundation.org, tj@...nel.org, yoann.congal@...le.fr, 
	mmaurer@...gle.com, roman.gushchin@...ux.dev, chenridong@...wei.com, 
	axboe@...nel.dk, mark.rutland@....com, jannh@...gle.com, 
	vincent.guittot@...aro.org, hannes@...xchg.org, dan.j.williams@...el.com, 
	david@...hat.com, joel.granados@...nel.org, rostedt@...dmis.org, 
	anna.schumaker@...cle.com, song@...nel.org, zhangguopeng@...inos.cn, 
	linux@...ssschuh.net, linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org, 
	linux-mm@...ck.org, gregkh@...uxfoundation.org, tglx@...utronix.de, 
	mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org, 
	hpa@...or.com, rafael@...nel.org, dakr@...nel.org, 
	bartosz.golaszewski@...aro.org, cw00.choi@...sung.com, 
	myungjoo.ham@...sung.com, yesanishhere@...il.com, Jonathan.Cameron@...wei.com, 
	quic_zijuhu@...cinc.com, aleksander.lobakin@...el.com, ira.weiny@...el.com, 
	andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de, 
	bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com, 
	stuart.w.hayes@...il.com, lennart@...ttering.net, brauner@...nel.org, 
	linux-api@...r.kernel.org, linux-fsdevel@...r.kernel.org, saeedm@...dia.com, 
	ajayachandra@...dia.com, jgg@...dia.com, parav@...dia.com, leonro@...dia.com, 
	witu@...dia.com, hughd@...gle.com, skhawaja@...gle.com, chrisl@...nel.org, 
	steven.sistare@...cle.com
Subject: Re: [PATCH v4 03/30] kho: drop notifiers

On Mon, Oct 6, 2025 at 1:01 PM Pratyush Yadav <pratyush@...nel.org> wrote:
>
> On Mon, Sep 29 2025, Pasha Tatashin wrote:
>
> > From: "Mike Rapoport (Microsoft)" <rppt@...nel.org>
> >
> > The KHO framework uses a notifier chain as the mechanism for clients to
> > participate in the finalization process. While this works for a single,
> > central state machine, it is too restrictive for kernel-internal
> > components like pstore/reserve_mem or IMA. These components need a
> > simpler, direct way to register their state for preservation (e.g.,
> > during their initcall) without being part of a complex,
> > shutdown-time notifier sequence. The notifier model forces all
> > participants into a single finalization flow and makes direct
> > preservation from an arbitrary context difficult.
> > This patch refactors the client participation model by removing the
> > notifier chain and introducing a direct API for managing FDT subtrees.
> >
> > The core kho_finalize() and kho_abort() state machine remains, but
> > clients now register their data with KHO beforehand.
> >
> > Signed-off-by: Mike Rapoport (Microsoft) <rppt@...nel.org>
> > Signed-off-by: Pasha Tatashin <pasha.tatashin@...een.com>
> [...]
> > diff --git a/mm/memblock.c b/mm/memblock.c
> > index e23e16618e9b..c4b2d4e4c715 100644
> > --- a/mm/memblock.c
> > +++ b/mm/memblock.c
> > @@ -2444,53 +2444,18 @@ int reserve_mem_release_by_name(const char *name)
> >  #define MEMBLOCK_KHO_FDT "memblock"
> >  #define MEMBLOCK_KHO_NODE_COMPATIBLE "memblock-v1"
> >  #define RESERVE_MEM_KHO_NODE_COMPATIBLE "reserve-mem-v1"
> > -static struct page *kho_fdt;
> > -
> > -static int reserve_mem_kho_finalize(struct kho_serialization *ser)
> > -{
> > -     int err = 0, i;
> > -
> > -     for (i = 0; i < reserved_mem_count; i++) {
> > -             struct reserve_mem_table *map = &reserved_mem_table[i];
> > -             struct page *page = phys_to_page(map->start);
> > -             unsigned int nr_pages = map->size >> PAGE_SHIFT;
> > -
> > -             err |= kho_preserve_pages(page, nr_pages);
> > -     }
> > -
> > -     err |= kho_preserve_folio(page_folio(kho_fdt));
> > -     err |= kho_add_subtree(ser, MEMBLOCK_KHO_FDT, page_to_virt(kho_fdt));
> > -
> > -     return notifier_from_errno(err);
> > -}
> > -
> > -static int reserve_mem_kho_notifier(struct notifier_block *self,
> > -                                 unsigned long cmd, void *v)
> > -{
> > -     switch (cmd) {
> > -     case KEXEC_KHO_FINALIZE:
> > -             return reserve_mem_kho_finalize((struct kho_serialization *)v);
> > -     case KEXEC_KHO_ABORT:
> > -             return NOTIFY_DONE;
> > -     default:
> > -             return NOTIFY_BAD;
> > -     }
> > -}
> > -
> > -static struct notifier_block reserve_mem_kho_nb = {
> > -     .notifier_call = reserve_mem_kho_notifier,
> > -};
> >
> >  static int __init prepare_kho_fdt(void)
> >  {
> >       int err = 0, i;
> > +     struct page *fdt_page;
> >       void *fdt;
> >
> > -     kho_fdt = alloc_page(GFP_KERNEL);
> > -     if (!kho_fdt)
> > +     fdt_page = alloc_page(GFP_KERNEL);
> > +     if (!fdt_page)
> >               return -ENOMEM;
> >
> > -     fdt = page_to_virt(kho_fdt);
> > +     fdt = page_to_virt(fdt_page);
> >
> >       err |= fdt_create(fdt, PAGE_SIZE);
> >       err |= fdt_finish_reservemap(fdt);
> > @@ -2499,7 +2464,10 @@ static int __init prepare_kho_fdt(void)
> >       err |= fdt_property_string(fdt, "compatible", MEMBLOCK_KHO_NODE_COMPATIBLE);
> >       for (i = 0; i < reserved_mem_count; i++) {
> >               struct reserve_mem_table *map = &reserved_mem_table[i];
> > +             struct page *page = phys_to_page(map->start);
> > +             unsigned int nr_pages = map->size >> PAGE_SHIFT;
> >
> > +             err |= kho_preserve_pages(page, nr_pages);
> >               err |= fdt_begin_node(fdt, map->name);
> >               err |= fdt_property_string(fdt, "compatible", RESERVE_MEM_KHO_NODE_COMPATIBLE);
> >               err |= fdt_property(fdt, "start", &map->start, sizeof(map->start));
> > @@ -2507,13 +2475,14 @@ static int __init prepare_kho_fdt(void)
> >               err |= fdt_end_node(fdt);
> >       }
> >       err |= fdt_end_node(fdt);
> > -
> >       err |= fdt_finish(fdt);
> >
> > +     err |= kho_preserve_folio(page_folio(fdt_page));
> > +     err |= kho_add_subtree(MEMBLOCK_KHO_FDT, fdt);
> > +
> >       if (err) {
> >               pr_err("failed to prepare memblock FDT for KHO: %d\n", err);
> > -             put_page(kho_fdt);
> > -             kho_fdt = NULL;
> > +             put_page(fdt_page);
>
> This adds subtree to KHO even if the FDT might be invalid. And then
> leaves a dangling reference in KHO to the FDT in case of an error. I
> think you should either do this check after
> kho_preserve_folio(page_folio(fdt_page)) and do a clean error check for
> kho_add_subtree(), or call kho_remove_subtree() in the error block.

I agree. I do not like this err |= pattern; we should check each error
cleanly and do proper clean-ups.
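
To illustrate the ordering discussed here, below is a minimal userspace sketch. It is not kernel code and not the follow-up patch. Every mock_* helper, the fault-injection flags, and prepare_fdt_sketch() are hypothetical stand-ins for the FDT build, kho_preserve_folio() and kho_add_subtree() steps. It only shows each step checked separately, with registration done last so that no failure path leaves a registered subtree behind.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are the real KHO calls. */
static bool fail_build, fail_preserve, fail_add; /* fault-injection knobs */
static int subtrees_added;                       /* subtrees currently registered */
static int pages_live;                           /* pages allocated, not yet freed */

static void *mock_alloc_page(void)
{
	void *p = malloc(4096);

	if (p)
		pages_live++;
	return p;
}

static void mock_put_page(void *p)
{
	free(p);
	pages_live--;
}

static int mock_build_fdt(void *fdt) { (void)fdt; return fail_build ? -EINVAL : 0; }
static int mock_preserve(void *fdt)  { (void)fdt; return fail_preserve ? -ENOMEM : 0; }

static int mock_add_subtree(const char *name, void *fdt)
{
	(void)name;
	(void)fdt;
	if (fail_add)
		return -EEXIST;
	subtrees_added++;
	return 0;
}

/* Each step is checked on its own; registration comes last. */
static int prepare_fdt_sketch(void)
{
	void *fdt = mock_alloc_page();
	int err;

	if (!fdt)
		return -ENOMEM;

	err = mock_build_fdt(fdt);
	if (err)
		goto err_free;

	err = mock_preserve(fdt);
	if (err)
		goto err_free;

	/* Only a fully built, preserved FDT is ever registered. */
	err = mock_add_subtree("memblock", fdt);
	if (err)
		goto err_free; /* nothing was added, so nothing to remove */

	return 0; /* page stays allocated: the registry now refers to it */

err_free:
	/* Undoing the preserve step is left out of this sketch. */
	mock_put_page(fdt);
	return err;
}
```

With separate checks the caller also gets back the first real error code, rather than a bitwise OR of several codes that may match none of them.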

> I prefer the former since if kho_add_subtree() is the one that fails,
> there is little sense in removing a subtree that was never added.
>
> >       }
> >
> >       return err;
> > @@ -2529,13 +2498,6 @@ static int __init reserve_mem_init(void)
> >       err = prepare_kho_fdt();
> >       if (err)
> >               return err;
> > -
> > -     err = register_kho_notifier(&reserve_mem_kho_nb);
> > -     if (err) {
> > -             put_page(kho_fdt);
> > -             kho_fdt = NULL;
> > -     }
> > -
> >       return err;
> >  }
> >  late_initcall(reserve_mem_init);
>
> --
> Regards,
> Pratyush Yadav