Message-id: <170224845504.12910.16483736613606611138@noble.neil.brown.name>
Date: Mon, 11 Dec 2023 09:47:35 +1100
From: "NeilBrown" <neilb@...e.de>
To: "Chuck Lever" <chuck.lever@...cle.com>
Cc: "Al Viro" <viro@...iv.linux.org.uk>,
"Christian Brauner" <brauner@...nel.org>,
"Jens Axboe" <axboe@...nel.dk>, "Oleg Nesterov" <oleg@...hat.com>,
"Jeff Layton" <jlayton@...nel.org>, linux-fsdevel@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-nfs@...r.kernel.org
Subject: Re: [PATCH 1/3] nfsd: use __fput_sync() to avoid delayed closing of files.

On Sat, 09 Dec 2023, Chuck Lever wrote:
> On Fri, Dec 08, 2023 at 02:27:26PM +1100, NeilBrown wrote:
> > Calling fput() directly or through filp_close() from a kernel thread like
> > nfsd causes the final __fput() (if necessary) to be called from a
> > workqueue. This means that nfsd is not forced to wait for any work to
> > complete. If the ->release or ->destroy_inode function is slow for any
> > reason, this can result in nfsd closing files more quickly than the
> > workqueue can complete the close and the queue of pending closes can
> > grow without bounds (30 million has been seen at one customer site,
> > though this was in part due to a slowness in xfs which has since been
> > fixed).
> >
> > nfsd does not need this.
>
> That is technically true, but IIUC, there is only one case where a
> synchronous close matters for the backlog problem, and that's when
> nfsd_file_free() is called from nfsd_file_put(). AFAICT all other
> call sites (except rename) are error paths, so there aren't negative
> consequences for the lack of synchronous wait there...

What you say is technically true but it isn't the way I see it.

Firstly I should clarify that __fput_sync() is *not* a flushing close as
you describe it below.
All it does, apart from some trivial book-keeping, is to call ->release
and possibly ->destroy_inode immediately rather than shunting them off
to another thread.
Apparently ->release sometimes does something that can deadlock with
some kernel threads or if some awkward locks are held, so the whole
final __fput is delayed by default. But this does not apply to nfsd.
Standard fput() is really the wrong interface for nfsd to use.
It should use __fput_sync() (which shouldn't have such a scary name).
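
The difference between the two interfaces can be illustrated with a toy userspace model. This is only a sketch, not kernel code: every toy_* name is made up for illustration, and the "deferred list" merely stands in for the delayed-fput machinery. It shows that a closer using the deferred put leaves one pending final put per file behind, while the synchronous variant leaves none.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for struct file (hypothetical, userspace only). */
struct toy_file {
    int refcount;
    int released;            /* set once the final put work has run */
    struct toy_file *next;   /* link in the deferred-put list */
};

static struct toy_file *deferred_head;  /* models the delayed-fput list */
static size_t deferred_len;

/* Models the expensive part of the final put (->release and friends). */
static void toy_final_put(struct toy_file *f)
{
    assert(!f->released);    /* the final put must run exactly once */
    f->released = 1;
}

/* Models fput() from a kernel thread: the final put is only queued. */
static void toy_fput(struct toy_file *f)
{
    if (--f->refcount == 0) {
        f->next = deferred_head;
        deferred_head = f;
        deferred_len++;
    }
}

/* Models __fput_sync(): the final put runs in the caller's context. */
static void toy_fput_sync(struct toy_file *f)
{
    if (--f->refcount == 0)
        toy_final_put(f);
}

/* Models the workqueue eventually draining the deferred list. */
static void toy_run_deferred(void)
{
    while (deferred_head) {
        struct toy_file *f = deferred_head;

        deferred_head = f->next;
        deferred_len--;
        toy_final_put(f);
    }
}

/* Close n files and report how many final puts were left pending. */
static size_t toy_backlog_after(size_t n, int use_sync)
{
    struct toy_file *files = calloc(n, sizeof(*files));
    size_t pending;

    if (!files)
        abort();
    for (size_t i = 0; i < n; i++) {
        files[i].refcount = 1;
        if (use_sync)
            toy_fput_sync(&files[i]);
        else
            toy_fput(&files[i]);
    }
    pending = deferred_len;  /* backlog the closer left behind */
    toy_run_deferred();      /* drain before freeing the array */
    free(files);
    return pending;
}
```

In this model toy_backlog_after(n, 0) reports n pending final puts and toy_backlog_after(n, 1) reports none, which is the backlog behaviour described above in miniature.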

The comment above flush_delayed_fput() seems to suggest that unmounting
is a core issue. Maybe the fact that __fput() can call
dissolve_on_fput() is a reason why it is sometimes safer to leave the
work to later. But I don't see that applying to nfsd.

Of course a ->release function *could* do synchronous writes just like
the XFS ->destroy_inode function used to do synchronous reads.
I don't think we should ever try to hide that by putting it in
a workqueue. It's probably a bug and it is best if bugs are visible.

Note that the XFS ->release function does call filemap_flush() in some
cases, but that is an async flush, so __fput_sync doesn't wait for the
flush to complete.

The way I see this patch is that fput() is the wrong interface for nfsd
to use, __fput_sync is the right interface. So we should change. 1
patch.

The details about exhausting memory explain a particular symptom that
motivated the examination which revealed that nfsd was using the wrong
interface.

If we have nfsd sometimes using fput() and sometimes __fput_sync, then
we need to have clear rules for when to use which. It is much easier to
have a simple rule: always use __fput_sync().

I'm certainly happy to revise function documentation and provide
wrapper functions if needed.

It might be good to have

    void filp_close_sync(struct file *f)
    {
        get_file(f);
        filp_close(f);
        __fput_sync(f);
    }

but as that would only be called once, it was hard to motivate.
Having it in linux/fs.h would be nice.
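
The reason for the extra get_file() in such a wrapper can be sketched with a toy userspace model. This is an illustration only: the toy_* names are hypothetical, and the "flushed" flag merely stands in for the per-close work that filp_close() does before dropping its reference. Taking an extra reference first means the put inside the close is never the final one, so the final put is the synchronous one.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct file (hypothetical, userspace only). */
struct toy_file {
    int refcount;
    bool flushed;           /* per-close work has been done */
    bool released;          /* final put ran in the caller's context */
    bool release_deferred;  /* final put was handed off for later */
};

/* Models get_file(): take an extra reference. */
static void toy_get_file(struct toy_file *f)
{
    f->refcount++;
}

/* Models fput(): if this is the last reference, defer the final put. */
static void toy_fput(struct toy_file *f)
{
    if (--f->refcount == 0)
        f->release_deferred = true;
}

/* Models __fput_sync(): if this is the last reference, release now. */
static void toy_fput_sync(struct toy_file *f)
{
    if (--f->refcount == 0)
        f->released = true;
}

/* Models filp_close(): per-close work, then an ordinary put. */
static void toy_filp_close(struct toy_file *f)
{
    f->flushed = true;
    toy_fput(f);
}

/* Models the proposed wrapper: close, but do the final put synchronously. */
static void toy_filp_close_sync(struct toy_file *f)
{
    toy_get_file(f);    /* keep the close's put from being the last one */
    toy_filp_close(f);
    toy_fput_sync(f);   /* drops the extra reference, releasing in place */
}
```

With a single reference held, toy_filp_close() alone ends with the release deferred, while toy_filp_close_sync() ends with the release already done and nothing deferred.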

Similarly we could wrap __fput_sync() in a more friendly name, but
that would be better if we actually renamed the function.

    void fput_now(struct file *f)
    {
        __fput_sync(f);
    }

??

Thanks,
NeilBrown