linux-kernel - Re: [PATCH RFC 0/2] nfsd: issue POSIX_FADV

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-id: <175161448993.565058.9950023351466537449@noble.neil.brown.name>
Date: Fri, 04 Jul 2025 17:34:49 +1000
From: "NeilBrown" <neil@...wn.name>
To: "Chuck Lever" <chuck.lever@...cle.com>
Cc: "Jeff Layton" <jlayton@...nel.org>, "Trond Myklebust" <trondmy@...nel.org>,
 "Anna Schumaker" <anna@...nel.org>, "Olga Kornievskaia" <okorniev@...hat.com>,
 "Dai Ngo" <Dai.Ngo@...cle.com>, "Tom Talpey" <tom@...pey.com>,
 "Mike Snitzer" <snitzer@...nel.org>, linux-nfs@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject:
 Re: [PATCH RFC 0/2] nfsd: issue POSIX_FADV_DONTNEED after READ/WRITE/COMMIT

On Fri, 04 Jul 2025, Chuck Lever wrote:
> On 7/3/25 7:16 PM, NeilBrown wrote:
> > On Fri, 04 Jul 2025, Jeff Layton wrote:
> >> Chuck and I were discussing RWF_DONTCACHE and he suggested that this
> >> might be an alternate approach. My main gripe with DONTCACHE was that it
> >> kicks off writeback after every WRITE operation. With NFS, we generally
> >> get a COMMIT operation at some point. Allowing us to batch up writes
> >> until that point has traditionally been considered better for
> >> performance.
> > 
> > I wonder if that traditional consideration is justified, give your
> > subsequent results.  The addition of COMMIT in v3 allowed us to both:
> >  - delay kicking off writes
> >  - not wait for writes to complete
> > 
> > I think the second was always primary.  Maybe we didn't consider the
> > value of the first enough.
> > Obviously the client caches writes and delays the start of writeback.
> > Adding another delay on the serve side does not seem to have a clear
> > justification.  Maybe we *should* kick-off writeback immediately.  There
> > would still be opportunity for subsequent WRITE requests to be merged
> > into the writeback queue.
> 
> Dave Chinner had the same thought a while back. So I've experimented
> with starting writes as part of nfsd_write(). Kicking off writes,
> even without waiting, is actually pretty costly, and it resulted in
> worse performance.

Was this with filemap_fdatawrite_range_kick() or something else?

> 
> Now that .pc_release is called /after/ the WRITE response has been sent,
> though, that might be a place where kicking off writeback could be done
> without charging that latency to the client.

Certainly after is better.  I was imagining kicking the writeback
threads, but they seems to only do whole filesystems.  Maybe that is OK?

I doubt we want more than one thread kicking off write for any given
inode at a time.  Maybe it would be best to add a "kicking_write" flag
in the filecache and if it is already set, then just add the range to
some range of pending-writes.  Maybe.
I guess I should explore how to test :-)

NeilBrown