linux-kernel - Re: [RFC v2 05/16] luo: luo

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CA+CK2bA7eAB4PvF0RXtt2DJ+FQ4DVV3x1OZrVo4q3EvgowhvJg@mail.gmail.com>
Date: Sat, 7 Jun 2025 19:30:50 -0400
From: Pasha Tatashin <pasha.tatashin@...een.com>
To: Pratyush Yadav <pratyush@...nel.org>
Cc: jasonmiu@...gle.com, graf@...zon.com, changyuanl@...gle.com, 
	rppt@...nel.org, dmatlack@...gle.com, rientjes@...gle.com, corbet@....net, 
	rdunlap@...radead.org, ilpo.jarvinen@...ux.intel.com, kanie@...ux.alibaba.com, 
	ojeda@...nel.org, aliceryhl@...gle.com, masahiroy@...nel.org, 
	akpm@...ux-foundation.org, tj@...nel.org, yoann.congal@...le.fr, 
	mmaurer@...gle.com, roman.gushchin@...ux.dev, chenridong@...wei.com, 
	axboe@...nel.dk, mark.rutland@....com, jannh@...gle.com, 
	vincent.guittot@...aro.org, hannes@...xchg.org, dan.j.williams@...el.com, 
	david@...hat.com, joel.granados@...nel.org, rostedt@...dmis.org, 
	anna.schumaker@...cle.com, song@...nel.org, zhangguopeng@...inos.cn, 
	linux@...ssschuh.net, linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org, 
	linux-mm@...ck.org, gregkh@...uxfoundation.org, tglx@...utronix.de, 
	mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org, 
	hpa@...or.com, rafael@...nel.org, dakr@...nel.org, 
	bartosz.golaszewski@...aro.org, cw00.choi@...sung.com, 
	myungjoo.ham@...sung.com, yesanishhere@...il.com, Jonathan.Cameron@...wei.com, 
	quic_zijuhu@...cinc.com, aleksander.lobakin@...el.com, ira.weiny@...el.com, 
	andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de, 
	bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com, 
	stuart.w.hayes@...il.com
Subject: Re: [RFC v2 05/16] luo: luo_core: integrate with KHO

> > +     fdt_out = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
> > +                                        get_order(LUO_FDT_SIZE));
>
> Why not alloc_folio()? KHO already deals with folios so it seems
> simpler. The kho_{un,}preserve_folio() functions exist exactly for these
> kinds of allocations, so LUO also ends up being a first user. You also
> won't end up needing kho_unpreserve_phys() and all the __pa() calls.

I prefer phys here, because this way, we are not bound for size and
alignment to be of a specific order, it can be n-pages instead.

> > +     if (!fdt_out) {
> > +             pr_err("failed to allocate FDT memory\n");
> > +             return -ENOMEM;
> > +     }
> > +
> > +     ret = fdt_create_empty_tree(fdt_out, LUO_FDT_SIZE);
>
> You are using FDT read/write functions throughout the series to create
> new FDTs. The sequential write functions are generally more efficient
> since they are meant for creating new FDT blobs. The read/write
> functions are better for modifying an existing FDT blob.
>
> Is there a particular reason you do this?
>
> When using FDT SW functions, the creation of the tree would be something
> like:
>
>         fdt_create()
>         fdt_finish_reservemap()
>         fdt_begin_node()
>
>         // Add stuff to FDT
>
>         fdt_end_node()
>         fdt_finish()
>
> In this patch, the FDT does not change much after creation so it doesn't
> look like it matters much, but in later patches, the FDT is passed to
> luo_files_fdt_setup() and luo_subsystems_fdt_setup() which probably
> modify the FDT a fair bit.

The number of changes to empty tree FDT is small, and this is done
only once, so I do think the extra cost is substantial.  This could be
a future optimization. Also, we could use a hybird approach where
luo_files/luo_subsystems do the SW updates, while here we do
Read/Write updates.

> > +     if (ret)
> > +             goto exit_free;
> > +
> > +     ret = fdt_setprop(fdt_out, 0, "compatible", LUO_COMPATIBLE,
> > +                       strlen(LUO_COMPATIBLE) + 1);
>
> fdt_setprop_string() instead? Or if you change to FDT SW,

Updated, thanks!

> fdt_property_string().
>
> > +     if (ret)
> > +             goto exit_free;
> > +
> > +     ret = kho_preserve_phys(__pa(fdt_out), LUO_FDT_SIZE);
> > +     if (ret)
> > +             goto exit_free;
> > +
> > +     ret = kho_add_subtree(ser, LUO_KHO_ENTRY_NAME, fdt_out);
> > +     if (ret)
> > +             goto exit_unpreserve;
> > +     luo_fdt_out = fdt_out;
> > +
> > +     return 0;
> > +
> > +exit_unpreserve:
> > +     kho_unpreserve_phys(__pa(fdt_out), LUO_FDT_SIZE);
> > +exit_free:
> > +     free_pages((unsigned long)fdt_out, get_order(LUO_FDT_SIZE));
> > +     pr_err("failed to prepare LUO FDT: %d\n", ret);
> > +
> > +     return ret;
> > +}
> > +
> > +static void luo_fdt_destroy(void)
> > +{
> > +     kho_unpreserve_phys(__pa(luo_fdt_out), LUO_FDT_SIZE);
> > +     free_pages((unsigned long)luo_fdt_out, get_order(LUO_FDT_SIZE));
> > +     luo_fdt_out = NULL;
> > +}
> > +
> > +static int luo_do_prepare_calls(void)
> > +{
> > +     return 0;
> > +}
> > +
> >  static int luo_do_freeze_calls(void)
> >  {
> >       return 0;
> > @@ -88,11 +151,111 @@ static void luo_do_finish_calls(void)
> >  {
> >  }
> >
> > -int luo_prepare(void)
> > +static void luo_do_cancel_calls(void)
> > +{
> > +}
> > +
> > +static int __luo_prepare(struct kho_serialization *ser)
> >  {
> > +     int ret;
> > +
> > +     if (down_write_killable(&luo_state_rwsem)) {
> > +             pr_warn("[prepare] event canceled by user\n");
> > +             return -EAGAIN;
> > +     }
> > +
> > +     if (!is_current_luo_state(LIVEUPDATE_STATE_NORMAL)) {
> > +             pr_warn("Can't switch to [%s] from [%s] state\n",
> > +                     luo_state_str[LIVEUPDATE_STATE_PREPARED],
> > +                     LUO_STATE_STR);
> > +             ret = -EINVAL;
> > +             goto exit_unlock;
> > +     }
> > +
> > +     ret = luo_fdt_setup(ser);
> > +     if (ret)
> > +             goto exit_unlock;
> > +
> > +     ret = luo_do_prepare_calls();
> > +     if (ret)
> > +             goto exit_unlock;
>
> With subsystems/filesystems support in place, this can fail. But since
> luo_fdt_setup() called kho_add_subtree(), the debugfs file stays around,
> and later calls to __luo_prepare() fail because the next
> kho_add_subtree() tries to create a debugfs file that already exists. So
> you would see an error like below:
>
>     [  767.339920] debugfs: File 'LUO' in directory 'sub_fdts' already present!
>     [  767.340613] luo_core: failed to prepare LUO FDT: -17
>     [  767.341071] KHO: Failed to convert KHO state tree: -17
>     [  767.341593] luo_core: Can't switch to [normal] from [normal] state
>     [  767.342175] KHO: Failed to abort KHO finalization: -22
>
> You probably need a kho_remove_subtree() that can be called from the
> error paths.
> Note that __luo_cancel() is called because failure in a KHO finalize
> notifier calls the abort notifiers.
>
> This is also something to fix, since if prepare fails, all other KHO
> users who are already serialized won't even get to abort.

Thank you for reporting this. This should not be happening, because if
__luo_prepare() fails, the kho_abort should follow, however, KHO does
not do kho_out_update_debugfs_fdt() when kho_finalize() fails, so I
added this callback and it fixes this problem. I also added a selftest
case for this.

>
> This weirdness happens because luo_prepare() and luo_cancel() control
> the KHO state machine, but then also get controlled by it via the
> notifier callbacks. So the relationship between then is not clear.
> __luo_prepare() at least needs access to struct kho_serialization, so it
> needs to come from the callback. So I don't have a clear way to clean
> this all up off the top of my head.

On production machine, without KHO_DEBUGFS, only LUO can control KHO
state, but if debugfs is enabled, KHO can be finalized manually, and
in this case LUO transitions to prepared state. In both cases, the
path is identical. The KHO debugfs path is only for
developers/debugging purposes.

> >  static int __init luo_startup(void)
> >  {
> > -     __luo_set_state(LIVEUPDATE_STATE_NORMAL);
> > +     phys_addr_t fdt_phys;
> > +     int ret;
> > +
> > +     if (!kho_is_enabled()) {
> > +             if (luo_enabled)
> > +                     pr_warn("Disabling liveupdate because KHO is disabled\n");
> > +             luo_enabled = false;
> > +             return 0;
> > +     }
> > +
> > +     ret = register_kho_notifier(&luo_kho_notifier_nb);
> > +     if (ret) {
> > +             luo_enabled = false;
>
> You set luo_enabled to false here, but none of the LUO entry points like
> luo_prepare() or luo_freeze() actually check it. So LUO will appear work
> just fine even when it hasn't initialized properly.

luo_enabled check was missing from luo_ioctl.c, as we should not
create a device if LUO is not enabled. This is fixed.

>
> > +             pr_warn("Failed to register with KHO [%d]\n", ret);
>
> I guess you don't return here so a previous liveupdate can still be
> recovered, even though we won't be able to make the next one. If so, a
> comment would be nice to point this out.

This is correct, but this is not going to work. Because, with the
current change I am disabling "/dev/liveupdate" iff luo_enable ==
false. Let's just return here, failing to register with KHO should not
really happen, it usually means that there is another notifier with
the same name has already registered.