Message-ID: <CA+CK2bAgsPQNCDnsQV9RR7gYo+Vdye9oDkrGJwrgmSZm9vbwUQ@mail.gmail.com>
Date: Sun, 15 Jun 2025 14:02:28 -0400
From: Pasha Tatashin <pasha.tatashin@...een.com>
To: Pratyush Yadav <pratyush@...nel.org>
Cc: jasonmiu@...gle.com, graf@...zon.com, changyuanl@...gle.com,
rppt@...nel.org, dmatlack@...gle.com, rientjes@...gle.com, corbet@....net,
rdunlap@...radead.org, ilpo.jarvinen@...ux.intel.com, kanie@...ux.alibaba.com,
ojeda@...nel.org, aliceryhl@...gle.com, masahiroy@...nel.org,
akpm@...ux-foundation.org, tj@...nel.org, yoann.congal@...le.fr,
mmaurer@...gle.com, roman.gushchin@...ux.dev, chenridong@...wei.com,
axboe@...nel.dk, mark.rutland@....com, jannh@...gle.com,
vincent.guittot@...aro.org, hannes@...xchg.org, dan.j.williams@...el.com,
david@...hat.com, joel.granados@...nel.org, rostedt@...dmis.org,
anna.schumaker@...cle.com, song@...nel.org, zhangguopeng@...inos.cn,
linux@...ssschuh.net, linux-kernel@...r.kernel.org, linux-doc@...r.kernel.org,
linux-mm@...ck.org, gregkh@...uxfoundation.org, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com, x86@...nel.org,
hpa@...or.com, rafael@...nel.org, dakr@...nel.org,
bartosz.golaszewski@...aro.org, cw00.choi@...sung.com,
myungjoo.ham@...sung.com, yesanishhere@...il.com, Jonathan.Cameron@...wei.com,
quic_zijuhu@...cinc.com, aleksander.lobakin@...el.com, ira.weiny@...el.com,
andriy.shevchenko@...ux.intel.com, leon@...nel.org, lukas@...ner.de,
bhelgaas@...gle.com, wagi@...nel.org, djeffery@...hat.com,
stuart.w.hayes@...il.com
Subject: Re: [RFC v2 08/16] luo: luo_files: add infrastructure for FDs
> > This is not safe: the memory might be a DMA target or owned by a
> > sensitive process, and if we proceed with a liveupdate reboot without
> > properly handling that memory, we can get corruption and memory leaks.
> > Therefore, if there are exceptions during liveupdate boot, we should
> > panic.
>
> I don't get how it would result in memory leaks or corruptions, since
> KHO would have marked that memory as preserved, and the new kernel won't
> touch it until someone restores it.
>
> So it can at most lead to loss of data, and in that case, userspace can
> very well decide if it can live with that loss or not.
>
> Or are you assuming here that even data in KHO is broken? In that case,
> it would probably be a good idea to panic early.

A broken LUO format is a catastrophic failure. It's unclear at this
point in boot whether the problem lies with KHO, LUO itself, or
mismatched interface assumptions between kernel versions. Regardless,
falling back to a cold reboot is the safest course of action, rather
than attempting to boot into a potentially broken environment. Since
the VMs and any other preserved userspace state won't survive anyway,
the additional delay of a full reboot should not significantly worsen
the impact.
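
To make this concrete, the kind of early check I have in mind looks
roughly like the sketch below (the helper name and the compatible
string are placeholders for illustration, not actual LUO code): if the
incoming state does not match what this kernel expects, give up
immediately rather than continue booting.

  /* needs <linux/libfdt.h> for fdt_node_check_compatible() */
  static void __init luo_check_incoming_fdt(const void *fdt)
  {
          /* "linux,live-update-v1" is a placeholder compatible string */
          if (fdt_node_check_compatible(fdt, 0, "linux,live-update-v1"))
                  panic("LUO: incompatible or corrupted live update state");
  }
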
>
> [...]
> >> > + }
> >> > +
> >> > + luo_file = kmalloc(sizeof(*luo_file),
> >> > + GFP_KERNEL | __GFP_NOFAIL);
> >> > + luo_file->fs = fs;
> >> > + luo_file->file = NULL;
> >> > + memcpy(&luo_file->private_data, data_ptr, sizeof(u64));
> >>
> >> Why not make sure data_ptr is exactly sizeof(u64) when we parse it, and
> >> then simply do luo_file->private_data = (u64)*data_ptr ?
> >
> > Because FDT alignment is 4 bytes, we can't simply assign it.
>
> Hmm, good catch. Didn't think of that.
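
For reference, a minimal sketch of the issue (variable names are
illustrative, not taken from the series): FDT only guarantees 4-byte
alignment for property data, so a plain u64 load through the pointer
could fault on strict-alignment architectures, while memcpy() (or
get_unaligned()) is safe for any alignment:

  u64 val;

  /* safe regardless of the 4-byte alignment of FDT property data */
  memcpy(&val, data_ptr, sizeof(val));

  /* equivalent alternative using the unaligned-access helpers */
  val = get_unaligned((const u64 *)data_ptr);
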
>
> >
> >> Because if the previous kernel wrote more than a u64 in data, then
> >> something is broken and we should catch that error anyway.
> >>
> >> > + luo_file->reclaimed = false;
> >> > + mutex_init(&luo_file->mutex);
> >> > + luo_file->state = LIVEUPDATE_STATE_UPDATED;
> >> > + ret = xa_err(xa_store(&luo_files_xa_in, token, luo_file,
> >> > + GFP_KERNEL | __GFP_NOFAIL));
> >>
> [...]
> >> > +struct liveupdate_filesystem {
> >> > + int (*prepare)(struct file *file, void *arg, u64 *data);
> >> > + int (*freeze)(struct file *file, void *arg, u64 *data);
> >> > + void (*cancel)(struct file *file, void *arg, u64 data);
> >> > + void (*finish)(struct file *file, void *arg, u64 data, bool reclaimed);
> >> > + int (*retrieve)(void *arg, u64 data, struct file **file);
> >> > + bool (*can_preserve)(struct file *file, void *arg);
> >> > + const char *compatible;
> >> > + void *arg;
> >>
> >> What is the use for this arg? I would expect one file type/system to
> >> register one set of handlers. So they can keep their arg in a global in
> >> their code. I don't see why a per-filesystem arg is needed.
> >
> > I think arg is useful in case a subsystem is registered multiple
> > times with some differences: e.g. based on mount point, or on how
> > particular file types are handled. Let's keep it for now; if it turns
> > out to be unnecessary, we can remove it in a future revision.
> >
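
To illustrate what I mean by multiple registrations (all names below
are placeholders, including the compatible string and the registration
context; this is not code from the series):

  /* hypothetical per-registration context, e.g. the backing mount */
  static struct vfsmount *mount_a, *mount_b;

  static bool memfd_can_preserve(struct file *file, void *arg);

  static struct liveupdate_filesystem memfd_fs_a = {
          .compatible     = "memfd-v1",
          .can_preserve   = memfd_can_preserve,
          /* remaining callbacks elided */
          .arg            = &mount_a,
  };

  static struct liveupdate_filesystem memfd_fs_b = {
          .compatible     = "memfd-v1",
          .can_preserve   = memfd_can_preserve,
          .arg            = &mount_b,
  };
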
> >> What I do think is needed is a per-file arg. Each callback gets 'data',
> >> which is the serialized data, but there is no place to store runtime
> >> state, like some flags or serialization metadata. Sure, you could make
> >> place for it somewhere in the inode, but I think it would be a lot
> >> cleaner to be able to store it in struct luo_file.
> >>
> >> So perhaps rename private_data in struct luo_file to say
> >> serialized_data, and have a field called "private" that filesystems can
> >> use for their runtime state?
> >
> > I am not against this, but let's make this change when it is actually
> > needed by a registered filesystem.
>
> Okay, fair enough.
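
Just to make sure we mean the same thing if/when this becomes needed,
I read the suggestion as roughly the following layout (a sketch only,
not what this series currently does; other fields as in the patch):

  struct luo_file {
          struct liveupdate_filesystem *fs;
          struct file *file;
          u64 serialized_data;    /* was private_data */
          void *private;          /* runtime state for ->fs callbacks */
          bool reclaimed;
          struct mutex mutex;
          int state;              /* LIVEUPDATE_STATE_*, type as in the series */
  };
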
>
> --
> Regards,
> Pratyush Yadav