Message-ID: <20230914-munkeln-pelzmantel-3e3a761acb72@brauner>
Date:   Thu, 14 Sep 2023 16:02:25 +0200
From:   Christian Brauner <brauner@...nel.org>
To:     Al Viro <viro@...iv.linux.org.uk>
Cc:     Christoph Hellwig <hch@....de>, Heiko Carstens <hca@...ux.ibm.com>,
        Vasily Gorbik <gor@...ux.ibm.com>,
        Alexander Gordeev <agordeev@...ux.ibm.com>,
        Fenghua Yu <fenghua.yu@...el.com>,
        Reinette Chatre <reinette.chatre@...el.com>,
        Miquel Raynal <miquel.raynal@...tlin.com>,
        Richard Weinberger <richard@....at>,
        Vignesh Raghavendra <vigneshr@...com>,
        Dennis Dalessandro <dennis.dalessandro@...nelisnetworks.com>,
        Tejun Heo <tj@...nel.org>,
        Trond Myklebust <trond.myklebust@...merspace.com>,
        Anna Schumaker <anna@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Damien Le Moal <dlemoal@...nel.org>,
        Naohiro Aota <naohiro.aota@....com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        linux-usb@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-s390@...r.kernel.org, linux-rdma@...r.kernel.org,
        linux-nfs@...r.kernel.org, linux-hardening@...r.kernel.org,
        cgroups@...r.kernel.org, Jan Kara <jack@...e.cz>
Subject: Re: [PATCH 03/19] fs: release anon dev_t in deactivate_locked_super

> Christoph, could you explain what the hell do we need that for?  It does
> create the race in question and AFAICS 2c18a63b760a (and followups trying
> to plug holes in it) had been nothing but headache.
> 
> Old logics: if mount attempt with a different fs type happens, -EBUSY
> is precisely correct - we would've gotten just that if mount() came
> before umount().  If the type matches, we might
> 	1) come before deactivate_locked_super() by umount(2).
> No problem, we succeed.
> 	2) come after the beginning of shutdown, but before the
> removal from the list; fine, we'll wait for the sucker to be
> unlocked (which happens in the end of generic_shutdown_super()),
> notice it's dead and create a new superblock.  Since the only
> part left on the umount side is closing the device, we are
> just fine.
> 	3) come after the removal from the list.  So we won't
> wait for the old superblock to be unlocked, other than that
> it's exactly the same as (2).  It doesn't matter whether we
> open the device before or after close by umount - same owner
> anyway, no -EBUSY.
> 
> Your "owner shall be the superblock" breaks that...
> 
> If you want to mess with _three_-way split of ->kill_sb(),
> please start with writing down the rules re what should
> go into each of those parts; such writeup should go into
> Documentation/filesystems/porting anyway, even if the
> split is a two-way one, BTW.

Hm, I think that characterization of Christoph's changes is a bit harsh.

Yes, you're right that making the superblock and not the filesystem type
the bd_holder changes the logic, and we are of course aware of that. It
also requires changes such as moving the closing of additional block
devices from where some callers currently do it.

But the filesystem type is not a very useful holder by itself and has
other drawbacks. The obvious one is that it requires us to wade through
all superblocks on the system to find the superblock associated with a
given block device, continuously grabbing and dropping sb_lock and
s_umount. None of that is pleasant or elegant, and it is certainly not
easy to understand. (Plus, it's broken for btrfs freezing and syncing
via block level ioctls.)

Using the superblock as holder makes this go away. It is overall a lot
more useful and intuitive, and it can be extended to filesystems with
multiple devices (of which we are apparently bound to get more).
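
To make the lookup difference concrete, here is a minimal userspace
sketch. It is not kernel code: the structs are simplified stand-ins, the
function names sb_from_bdev_scan() and sb_from_bdev_holder() are made up
for illustration, and the locking (sb_lock, s_umount) is only mentioned
in comments rather than modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the kernel's definitions. */
struct fs_type {
	const char *name;
};

struct block_device {
	void *holder;			/* whoever claimed the device */
};

struct super_block {
	struct fs_type *type;
	struct block_device *bdev;
	struct super_block *next;	/* stand-in for the global superblock list */
};

/*
 * Holder is the filesystem type: the device tells us nothing about which
 * superblock uses it, so we have to walk every superblock. In the kernel
 * each step would also involve taking and dropping sb_lock and s_umount.
 * *visited reports how many superblocks were looked at.
 */
struct super_block *sb_from_bdev_scan(struct super_block *all,
				      struct block_device *bdev,
				      size_t *visited)
{
	*visited = 0;
	for (struct super_block *sb = all; sb; sb = sb->next) {
		(*visited)++;
		if (sb->bdev == bdev)
			return sb;
	}
	return NULL;
}

/*
 * Holder is the superblock: the device points straight back at it,
 * no list walk needed.
 */
struct super_block *sb_from_bdev_holder(struct block_device *bdev)
{
	return bdev->holder;
}
```

With three superblocks on the list and the device of interest belonging
to the last one, the scan visits all three entries before it finds the
match, while the holder variant is a single pointer dereference.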

So I think this change is worth the pain.

It's a fair point that these lifetime rules should be documented in
Documentation/filesystems/. The old lifetime documentation is too sparse
to be useful though.
