linux-kernel - Re: [PATCH 1/3] nfsd: use __fput_sync() to avoid delayed closing of files.

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20231211231330.GE1674809@ZenIV>
Date:   Mon, 11 Dec 2023 23:13:30 +0000
From:   Al Viro <viro@...iv.linux.org.uk>
To:     NeilBrown <neilb@...e.de>
Cc:     Chuck Lever <chuck.lever@...cle.com>,
        Christian Brauner <brauner@...nel.org>,
        Jens Axboe <axboe@...nel.dk>, Oleg Nesterov <oleg@...hat.com>,
        Jeff Layton <jlayton@...nel.org>,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-nfs@...r.kernel.org
Subject: Re: [PATCH 1/3] nfsd: use __fput_sync() to avoid delayed closing of
 files.

On Tue, Dec 12, 2023 at 09:23:51AM +1100, NeilBrown wrote:

> Previously you've suggested problems with ->release blocking.
> Now you refer to lazy-umount, which is what the comment above
> __fput_sync() mentions.

Yes?  What I'm saying is that the set of locks involved is
too large for any sane analysis.  And lest you discard ->release(),
that brings ->i_rwsem, and thus anything that might be grabbed
under that.  Someone's ->mmap_lock, for example.

> "pretty much an locks" seems like hyperbole.  I don't see it taking
> nfsd_mutex or nlmsvc_mutex.

I don't know - and I can't tell without serious search.  What I can
tell is that before making fput() delayed we used to find deadlocks
on regular basis; that was a massive source of headache.

> Maybe you mean any filesystem lock?

Don't forget VM.  And drivers.  And there was quite a bit of fun
happening in net/unix, etc.  Sure, in case of nfsd the last two
_probably_ won't occur - not directly, anyway.

But making it a general nuisan^Wfacility is asking for trouble.

> My understanding is that the advent of vmalloc allocated stacks means
> that kernel stack space is not an important consideration.
> 
> It would really help if we could have clear documented explanation of
> what problems can occur.  Maybe an example of contexts where it isn't
> safe to call __fput_sync().
> 
> I can easily see that lazy-unmount is an interesting case which could
> easily catch people unawares.  Punting the tail end of mntput_no_expire
> (i.e.  if count reaches zero) to a workqueue/task_work makes sense and
> would be much less impact than punting every __fput to a workqueue.
> 
> Would that make an fput_now() call safe to use in most contexts, or is
> there something about ->release or dentry_kill() that can still cause
> problems?

dentry_kill() means ->d_release(), ->d_iput() and anything final iput()
could do.  Including e.g. anything that might be done by afs_silly_iput(),
with its "send REMOVE to server, wait for completion".  No, that's not
a deadlock per se, but it can stall you a bit more than you would
probably consider tolerable...  Sure, you could argue that AFS ought to
make that thing asynchronous, but...

Anyway, it won't be "safe to use in most contexts".  ->mmap_lock alone
is enough for that, and that's just the one I remember to have given
us a lot of headache.  And that's without bringing the "nfsd won't
touch those files" cases - make it generally accessible and you get
to audit all locks that might be taken when we close a socket, etc.