[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZqhV+PvVKESa3UVw@dread.disaster.area>
Date: Tue, 30 Jul 2024 12:54:48 +1000
From: Dave Chinner <david@...morbit.com>
To: Steve French <smfrench@...il.com>
Cc: Christian Brauner <brauner@...nel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
CIFS <linux-cifs@...r.kernel.org>
Subject: Re: Why do very few filesystems have umount helpers
On Mon, Jul 29, 2024 at 04:50:27PM -0500, Steve French wrote:
> On Mon, Jul 29, 2024 at 4:50 AM Christian Brauner <brauner@...nel.org> wrote:
> On Mon, Jul 29, 2024 at 12:33 PM Steve French <smfrench@...il.com> wrote:
> > > The first step should be to identify what exactly keeps your mount busy
> > > in generic/044 and generic/043.
> >
> > That is a little tricky to debug (AFAIK no easy way to tell exactly which
> > reference is preventing the VFS from proceeding with the umount and
> > calling kill_sb). My best guess is something related to deferred close
> > (cached network file handles) that had a brief refcount on
> > something being checked by umount, but when I experimented with
> > deferred close settings that did not seem to affect the problem so
> > looking for other possible causes.
> >
> > I just did a quick experiment by adding a 1 second wait inside umount
> > and confirmed that that does fix it for those two tests when mounted to Samba,
> > but not clear why the slight delay in umount helps as there is no pending
> > network traffic at that point.
>
> I did some more experimentation and it looks like the umount problem
> with those two xfstests to Samba is related to IOC_SHUTDOWN.
> If I return EOPNOTSUPP on IOC_SHUTDOWN
> then the 1 second delay in umount is not necessary - so something that
> happens after IOC_SHUTDOWN races with umount (thus the 1 second delay
> that I tried as a quick experiment fixes it indirectly) in this
> testcase (although
> apparently this race between IOC_SHUTDOWN and umount is not an issue
> to some other servers but is reproducible to Samba and ksmbd (at least
> in some easy to setup configurations)
So you've likely got a race condition where something takes longer
when the shutdown flag is set then when the filesystem is operating
normally. There's not a lot in the CIFS code that pays attention to
the shutdown flag - almost all of them are aborting front end
(syscall) operations before they are started.
The only back end check appears to be in cifs_issue_write(). Perhaps
that is failing to wake the request queue when it is being failed
with -EIO on a shutdown, and so it takes some time for something
else to wake it up and empty it and complete the pending writes
before the fs can be unmounted...
-Dave.
--
Dave Chinner
david@...morbit.com
Powered by blists - more mailing lists