linux-kernel - Re: [PATCH v2 0/6] Composefs: an opportunistically sharing verified image filesystem

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <46f05b5f986b4377f02907c2660f2d144030c3b8.camel@redhat.com>
Date:   Fri, 20 Jan 2023 10:22:24 +0100
From:   Alexander Larsson <alexl@...hat.com>
To:     Christian Brauner <brauner@...nel.org>,
        Amir Goldstein <amir73il@...il.com>
Cc:     Gao Xiang <hsiangkao@...ux.alibaba.com>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        gscrivan@...hat.com, Miklos Szeredi <miklos@...redi.hu>,
        Yurii Zubrytskyi <zyy@...gle.com>,
        Eugene Zemtsov <ezemtsov@...gle.com>,
        Vivek Goyal <vgoyal@...hat.com>
Subject: Re: [PATCH v2 0/6] Composefs: an opportunistically sharing verified
 image filesystem

On Tue, 2023-01-17 at 11:12 +0100, Christian Brauner wrote:
> On Tue, Jan 17, 2023 at 09:05:53AM +0200, Amir Goldstein wrote:
> > > It seems rather another an incomplete EROFS from several points
> > > of view.  Also see:
> > > https://lore.kernel.org/all/1b192a85-e1da-0925-ef26-178b93d0aa45@plexistor.com/T/#u
> > > 
> > 
> > Ironically, ZUFS is one of two new filesystems that were discussed
> > in LSFMM19,
> > where the community reactions rhyme with the reactions to
> > composefs.
> > The discussion on Incremental FS resembles composefs case even more
> > [1].
> > AFAIK, Android is still maintaining Incremental FS out-of-tree.
> > 
> > Alexander and Giuseppe,
> > 
> > I'd like to join Gao is saying that I think it is in the best
> > interest
> > of everyone,
> > composefs developers and prospect users included,
> > if the composefs requirements would drive improvement to existing
> > kernel subsystems rather than adding a custom filesystem driver
> > that partly duplicates other subsystems.
> > 
> > Especially so, when the modifications to existing components
> > (erofs and overlayfs) appear to be relatively minor and the
> > maintainer
> > of erofs is receptive to new features and happy to collaborate with
> > you.
> > 
> > w.r.t overlayfs, I am not even sure that anything needs to be
> > modified
> > in the driver.
> > overlayfs already supports "metacopy" feature which means that an
> > upper layer
> > could be composed in a way that the file content would be read from
> > an arbitrary
> > path in lower fs, e.g. objects/cc/XXX.
> > 
> > I gave a talk on LPC a few years back about overlayfs and container
> > images [2].
> > The emphasis was that overlayfs driver supports many new features,
> > but userland
> > tools for building advanced overlayfs images based on those new
> > features are
> > nowhere to be found.
> > 
> > I may be wrong, but it looks to me like composefs could potentially
> > fill this void,
> > without having to modify the overlayfs driver at all, or maybe just
> > a
> > little bit.
> > Please start a discussion with overlayfs developers about missing
> > driver
> > features if you have any.
> 
> Surprising that I and others weren't Cced on this given that we had a
> meeting with the main developers and a few others where we had said
> the
> same thing. I hadn't followed this. 

Sorry about that, I'm just not very used to the kernel submission
mechanism. I'll CC you on the next version.

> 
> We have at least 58 filesystems currently in the kernel (and that's a
> conservative count just based on going by obvious directories and
> ignoring most virtual filesystems).
> 
> A non-insignificant portion is probably slowly rotting away with
> little
> fixes coming in, with few users, and not much attention is being paid
> to
> syzkaller reports for them if they show up. I haven't quantified this
> of
> course.
> 
> Taking in a new filesystems into kernel in the worst case means that
> it's being dumped there once and will slowly become unmaintained.
> Then
> we'll have a few users for the next 20 years and we can't reasonably
> deprecate it (Maybe that's another good topic: How should we fade out
> filesystems.).
> 
> Of course, for most fs developers it probably doesn't matter how many
> other filesystems there are in the kernel (aside from maybe competing
> for the same users).
> 
> But for developers who touch the vfs every new filesystems may
> increase
> the cost of maintaining and reworking existing functionality, or
> adding
> new functionality. Making it more likely to accumulate hacks, adding
> workarounds, or flatout being unable to kill off infrastructure that
> should reasonably go away. Maybe this is an unfair complaint but just
> from experience a new filesystem potentially means one or two weeks
> to
> make a larger vfs change.
> 
> I want to stress that I'm not at all saying "no more new fs" but we
> should be hesitant before we merge new filesystems into the kernel.

Well, it sure reads as "no more new fs" to me. But I understand that
there is hesitation towards this. The new version will be even simpler
(based on feedback from dave), weighing in at < 2000 lines. Hopefully
this will make it easier to review and maintain and somewhat countering
the cost of yet another filesystem.

> Especially for filesystems that are tailored to special use-cases.
> Every few years another filesystem tailored to container use-cases
> shows
> up. And frankly, a good portion of the issues that they are trying to
> solve are caused by design choices in userspace.

Well, we have at least two use cases, but sure, it is not a general
purpose filesystem.

> And I have to say I'm especially NAK-friendly about anything that
> comes
> even close to yet another stacking filesystems or anything that
> layers
> on top of a lower filesystem/mount such as ecryptfs, ksmbd, and
> overlayfs. They are hard to get right, with lots of corner cases and
> they cause the most headaches when making vfs changes.

I can't disagree here, because I'm not a vfs maintainer, but I will say
that composefs is fundamentally much simpler that these examples. First
because it is completely read-only, and secondly because it doesn't
rely on the lower filesystem for anything but file content (i.e. lower
fs metadata or directory structure doesn't affect the upper fs).

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
=-=-=
 Alexander Larsson                                            Red Hat,
Inc 
       alexl@...hat.com            alexander.larsson@...il.com 
He's a jaded white trash astronaut haunted by an iconic dead American 
confidante She's a brilliant extravagent femme fatale who can talk to 
animals. They fight crime!