Message-ID: <CAH2r5mu1KvG8xeYmrYg9_HYAiZF8Z0URrzEg+0ZKS7hSn7JyJA@mail.gmail.com>
Date: Sun, 28 Jul 2024 23:33:53 -0500
From: Steve French <smfrench@...il.com>
To: Al Viro <viro@...iv.linux.org.uk>
Cc: linux-fsdevel <linux-fsdevel@...r.kernel.org>, LKML <linux-kernel@...r.kernel.org>,
CIFS <linux-cifs@...r.kernel.org>, ronnie sahlberg <ronniesahlberg@...il.com>
Subject: Re: Why do very few filesystems have umount helpers
And here is a recent bugzilla entry related to umount issues (although it
could be distinct from the three other umount issues I mentioned in my
previous mail, quoted below):
https://bugzilla.kernel.org/show_bug.cgi?id=219097
On Sun, Jul 28, 2024 at 11:16 PM Steve French <smfrench@...il.com> wrote:
>
> On Sun, Jul 28, 2024 at 7:01 PM Al Viro <viro@...iv.linux.org.uk> wrote:
> >
> > On Sun, Jul 28, 2024 at 02:09:14PM -0500, Steve French wrote:
> >
> > > Since umount does not notify the filesystem on unmount until
> > > references are closed (unless you do "umount --force") and therefore
> > > the filesystem is only notified at kill_sb time, an easier approach to
> > > fixing some of the problems where resources are kept around too long
> > > (e.g. cached handles or directory entries etc. or references on the
> > > mount are held) may be to add a mount helper which notifies the fs
> > > (e.g. via fs specific ioctl) when umount has begun. That may be an
> > > easier solution than adding a VFS call to notify the fs when umount
> > > begins.
> >
> > "references on the mount being held" is not something any userland
> > helpers have a chance to help with.
>
> I don't know the exact reasons why at least three filesystems have
> umount helpers, but presumably they can issue private ioctls (or
> equivalent) to release resources. I am very curious whether their
> reasons would overlap with any common SMB3.1.1 network mount use cases.
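>
> As a concrete illustration of that idea, here is a minimal sketch of
> what such a umount helper could do before issuing the real umount(2).
> Note that CIFS_IOC_UMOUNT_BEGIN below is a made-up ioctl number, not an
> existing cifs.ko ioctl, and the open fd on the mountpoint is itself a
> (brief) reference on the mount:
>
>     /* umount.cifs helper sketch - the ioctl is hypothetical */
>     #include <fcntl.h>
>     #include <stdio.h>
>     #include <unistd.h>
>     #include <sys/ioctl.h>
>     #include <sys/mount.h>
>     #include <linux/ioctl.h>
>
>     #define CIFS_IOC_UMOUNT_BEGIN _IO(0xCF, 0x42)  /* made-up ioctl */
>
>     int main(int argc, char **argv)
>     {
>             int fd;
>
>             if (argc != 2) {
>                     fprintf(stderr, "usage: %s <mountpoint>\n", argv[0]);
>                     return 1;
>             }
>             /* tell the fs that unmount has begun so it can drop
>                deferred handles, cached dirents, etc. early */
>             fd = open(argv[1], O_RDONLY);
>             if (fd >= 0) {
>                     ioctl(fd, CIFS_IOC_UMOUNT_BEGIN, 0);
>                     close(fd);
>             }
>             /* then do the real unmount */
>             return umount(argv[1]) ? 1 : 0;
>     }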
>
> > What exactly gets leaked in your tests? And what would that userland
> > helper do when umount happens due to the last process in given namespace
> > getting killed, for example? Any unexpected "busy" at umount(2) time
> > would translate into filesystem instances stuck around (already detached
> > from any mount trees) for unspecified time; not a good thing, obviously,
> > and not something a userland helper had a chance to help with...
> >
> > Details, please.
>
> There are three things in particular that got me thinking about how
> other filesystems handle umount (and whether the umount helper
> concept is a good or bad idea for a network fs):
>
> 1) Better resource usage: network filesystems often have information
> cached due to leases (or 'delegations' in NFS terminology) on files or
> directory entries, and waiting until kill_sb (rather than acting when
> umount begins) can waste resources. This cached information is not
> automatically released when the file or directory is closed (note that
> "deferred close" of files can be a huge performance win for network
> filesystems that support safe caching via leases/delegations), but the
> caches consume resources that ideally would be freed when umount
> starts, and instead have to wait until kill_sb is invoked. If
> "umount_begin" were always called, then (assuming there are not
> multiple mounts from the same client to that server share) cifs.ko
> could, as in the sketch after this list:
> a) close all deferred network file handles (freeing up some resources)
> b) stop waiting for any pending network i/o requests
> c) mark the tree connection (the connection to the server share) as
> "EXITING" so we don't race sending new i/o operations on that share
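>
> A rough sketch of the shape of that, for illustration only: the
> ->umount_begin hook in struct super_operations is real, but the helper
> names below are hypothetical placeholders rather than the actual
> cifs.ko functions:
>
>     /* sketch only - the three helpers are made-up names */
>     static void cifs_umount_begin_sketch(struct super_block *sb)
>     {
>             struct cifs_tcon *tcon = get_master_tcon(sb);  /* hypothetical */
>
>             if (!tcon)
>                     return;
>
>             /* a) free deferred (cached) network file handles early */
>             close_all_deferred_handles(tcon);              /* hypothetical */
>
>             /* b) wake up / abort anything waiting on in-flight i/o */
>             abort_pending_requests(tcon);                  /* hypothetical */
>
>             /* c) mark the tree connection as exiting so no new i/o races in */
>             mark_tcon_exiting(tcon);                       /* hypothetical */
>     }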
>
> 2) Fixing races between umount and mount:
> There are common test scenarios where running a series of xfstests
> will eventually fail (e.g. by the time the xfstest run gets to 043 or
> 044, against a Samba server on localhost for example, it sometimes
> hits races which cause this message:
>
> QA output created by 043
> +umount: /mnt-local-xfstest/scratch: target is busy.
> +mount error(16): Device or resource busy
>
> but it works fine if a delay is inserted between the tests). I will
> try some experiments to see whether changing xfstests to do a force
> unmount, which calls "umount_begin" (or adding a umount wrapper to do
> the same), also avoids the problem. It could be that references are
> briefly held by cifs.ko, causing the VFS to think files are still open
> and therefore not to call into cifs.ko for kill_superblock. This needs
> more investigation, but "umount --force" (or equivalent) may help.
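>
> For reference, what such a wrapper boils down to is umount2(2) with
> MNT_FORCE, which is what makes the VFS call the filesystem's
> ->umount_begin before tearing down references ("umount --force" does
> the same from the command line). Minimal sketch:
>
>     /* force-unmount wrapper sketch (umount2() and MNT_FORCE are real) */
>     #include <stdio.h>
>     #include <sys/mount.h>
>
>     int main(int argc, char **argv)
>     {
>             if (argc != 2) {
>                     fprintf(stderr, "usage: %s <mountpoint>\n", argv[0]);
>                     return 1;
>             }
>             if (umount2(argv[1], MNT_FORCE)) {
>                     perror("umount2");
>                     return 1;
>             }
>             return 0;
>     }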
>
> 3) Races in cleaning up directory cache information. A patch was
> introduced to periodically clean up the directory cache (this is only
> an issue against servers like Windows or NetApp that support directory
> leases, so you don't see it against Samba and various other common
> servers that don't enable directory leases) and it can cause crashes
> in unmount (use after free). I want to try to narrow it down soon, but
> it has been a little tricky (and the assumption was that force unmount
> would avoid the problem, i.e. the call to "umount_begin"). It looks
> like this patch causes the intermittent umount crash against Windows
> servers:
>
> commit d14de8067e3f9653cdef5a094176d00f3260ab20
> Author: Ronnie Sahlberg <lsahlber@...hat.com>
> Date: Thu Jul 6 12:32:24 2023 +1000
>
> cifs: Add a laundromat thread for cached directories
>
> and drop cached directories after 30 seconds
>
>
>
> Steve
--
Thanks,
Steve