linux-kernel - Re: [PATCH] fix writing to the filesystem after unmount

Message-ID: <20230908-verflachen-neudefinition-4da649d673a9@brauner>
Date:   Fri, 8 Sep 2023 14:02:38 +0200
From:   Christian Brauner <brauner@...nel.org>
To:     Jan Kara <jack@...e.cz>
Cc:     Zdenek Kabelac <zkabelac@...hat.com>,
        Mikulas Patocka <mpatocka@...hat.com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        dm-devel@...hat.com, Christoph Hellwig <hch@....de>,
        "Darrick J. Wong" <djwong@...nel.org>
Subject: Re: [PATCH] fix writing to the filesystem after unmount

> Well, currently you click some "Eject / safely remove / whatever" button
> and then you get a "wait" dialog until everything is done after which
> you're told the stick is safe to remove. What I imagine is that the "wait"
> dialog needs to be there while there are any (or exclusive at minimum) openers
> of the device. Not until umount(2) syscall has returned. And yes, the

Agreed. umount(2) doesn't give guarantees about a filesystem being
really gone once it has returned. And it really shouldn't. There are too
many cases where that doesn't work and it's not a commitment we should
make.

And there are ways to wait until superblock shutdown that I've mentioned
before in other places where it somehow really matters. inotify's
IN_UNMOUNT will notify about superblock shutdown. IOW, it fires when the
superblock really hits generic_shutdown_super(), which can only be hit
after unfreezing as that requires active references.

So this really can be used to wait for a filesystem to go away across
all namespaces and across filesystem freezing, and it's available to
unprivileged users. Simple example:

# shell 1
sudo mount -t xfs /dev/sda /mnt
sudo mount --bind /mnt /opt
inotifywait -e unmount /mnt

# shell 2
sudo umount /opt # nothing happens in shell 1
sudo umount /mnt # shell 1 gets woken
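
For anyone who wants the same wait from a program rather than from
inotifywait, here is a minimal sketch of the idea in Python via ctypes.
It is an illustration under assumptions, not a reference implementation:
the helper names parse_events() and wait_for_unmount() are made up for
this example, the constants are copied from <sys/inotify.h>, and it
assumes a Linux libc that exports inotify_init() and
inotify_add_watch().

```python
import ctypes
import ctypes.util
import os
import struct

# Values from <sys/inotify.h>.
IN_UNMOUNT = 0x00002000  # filesystem backing the watched object went away
IN_IGNORED = 0x00008000  # watch was removed (follows IN_UNMOUNT)

# struct inotify_event header: int wd; uint32_t mask, cookie, len;
_EVENT_HEADER = struct.Struct("iIII")


def parse_events(buf):
    """Yield (wd, mask, cookie, name) for each inotify_event in buf."""
    offset = 0
    while offset + _EVENT_HEADER.size <= len(buf):
        wd, mask, cookie, length = _EVENT_HEADER.unpack_from(buf, offset)
        offset += _EVENT_HEADER.size
        # The name field is NUL-padded to `length` bytes.
        name = buf[offset:offset + length].split(b"\0", 1)[0]
        offset += length
        yield wd, mask, cookie, name


def wait_for_unmount(path):
    """Block until an IN_UNMOUNT event arrives for `path` (hypothetical helper)."""
    libc = ctypes.CDLL(ctypes.util.find_library("c"), use_errno=True)
    fd = libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    try:
        wd = libc.inotify_add_watch(fd, os.fsencode(path), IN_UNMOUNT)
        if wd < 0:
            raise OSError(ctypes.get_errno(), "inotify_add_watch failed")
        while True:
            # A blocking read returns one or more whole events.
            for _, mask, _, _ in parse_events(os.read(fd, 4096)):
                if mask & IN_UNMOUNT:
                    return
    finally:
        os.close(fd)
```

Calling wait_for_unmount("/mnt") in place of the inotifywait line in
shell 1 should behave the same way: it keeps blocking after
"umount /opt" and returns once the last mount is gone and the superblock
is shut down.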

> corner-cases. So does the current behavior, I agree, but improving
> situation for one usecase while breaking another usecase isn't really a way
> forward...

Agreed.

> Well, the filesystem (struct superblock to be exact) is invisible in
> /proc/mounts (or whatever), that is true. But it is still very much
> associated with that block device and if you do 'mount <device>
> <mntpoint>', you'll get it back. But yes, the filesystem will not go away

And now we at least have an API to detect that case and refuse to reuse
the superblock.

> until all references to it are dropped and you cannot easily find who holds
> those references and how to get rid of them.

Namespaces make this even messier. You have no easy way of knowing
whether the filesystem is pinned somewhere else through an explicit
bind-mount or because it was copied during mount namespace creation.
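
To make the limitation concrete, here is a rough sketch of the most a
process can easily do: group the mounts it can see by device number
using the /proc/<pid>/mountinfo format. The function name
mounts_by_device() is made up for this example. It only covers the
mount namespace whose mountinfo you feed it, so mounts pinned in other
namespaces stay invisible, which is exactly the problem.

```python
from collections import defaultdict


def mounts_by_device(mountinfo_text):
    """Map 'major:minor' to the mount points listed in mountinfo content.

    Each mountinfo line starts with: mount ID, parent ID, major:minor,
    root, mount point. Only those leading fields are used here.
    """
    result = defaultdict(list)
    for line in mountinfo_text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        result[fields[2]].append(fields[4])
    return dict(result)


if __name__ == "__main__":
    # Only shows mounts visible in the caller's own mount namespace.
    with open("/proc/self/mountinfo") as f:
        for dev, points in sorted(mounts_by_device(f.read()).items()):
            if len(points) > 1:
                print(dev, "is mounted at", ", ".join(points))
```

With the bind-mount example from above, the device backing /mnt would
show up with both /mnt and /opt, but a copy held by another mount
namespace would not appear at all.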