linux-kernel - Re: [PATCH] mm/page-writeback: Consolidate wb_thresh bumping logic into __wb_calc

Message-ID: <CAJdd5GaQ1LdS=n52AWQwZ=Q9woSjFYiVD9E_1SkEeDPoT=bmjw@mail.gmail.com>
Date: Wed, 8 Oct 2025 17:14:31 -0600
From: Joshua Watt <jpewhacker@...il.com>
To: Jan Kara <jack@...e.cz>
Cc: jimzhao.ai@...il.com, akpm@...ux-foundation.org, 
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org, 
	linux-mm@...ck.org, willy@...radead.org, linux-nfs@...r.kernel.org
Subject: Re: [PATCH] mm/page-writeback: Consolidate wb_thresh bumping logic
 into __wb_calc_thresh

On Wed, Oct 8, 2025 at 8:49 AM Joshua Watt <jpewhacker@...il.com> wrote:
>
> On Wed, Oct 8, 2025 at 5:14 AM Jan Kara <jack@...e.cz> wrote:
> >
> > Hello!
> >
> > On Tue 07-10-25 10:17:11, Joshua Watt wrote:
> > > From: Joshua Watt <jpewhacker@...il.com>
> > >
> > > This patch strangely breaks NFS 4 clients for me. The behavior is that a
> > > client will start getting an I/O error which in turn is caused by the client
> > > > getting an NFS4ERR_BADSESSION when attempting to write data to the server. I
> > > bisected the kernel from the latest master
> > > (9029dc666353504ea7c1ebfdf09bc1aab40f6147) to this commit (log below). Also,
> > > when I revert this commit on master the bug disappears.
> > >
> > > The server is running kernel 5.4.161, and the client that exhibits the
> > > behavior is running in qemux86, and has mounted the server with the options
> > > rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,port=52049,timeo=600,retrans=2,sec=null,clientaddr=172.16.6.90,local_lock=none,addr=172.16.6.0
> > >
> > > The program that I wrote to reproduce this is pretty simple; it does a file
> > > lock over NFS, then writes data to the file once per second. After about 32
> > > seconds, it receives the I/O error, and this reproduced every time. I can
> > > provide the sample program if necessary.
> >
> > This is indeed rather curious.
> >
> > > I also captured the NFS traffic both in the passing case and the failure case,
> > > and can provide them if useful.
> > >
> > > I did look at the two dumps and I'm not exactly sure what the difference is,
> > > other than with this patch the client tries to write every 30 seconds (and
> > > > fails), whereas without it the client attempts to write back every 5 seconds. I have no
> > > idea why this patch would cause this problem.
> >
> > So the change in writeback behavior is not surprising. The commit does
> > modify the logic computing dirty limits in some corner cases and your
> > description matches the fact that previously the computed limits were lower
> > so we've started writeback after 5s (dirty_writeback_interval) while with
> > the patch we didn't cross the threshold and thus started writeback only
> > once the dirty data was old enough, which is 30s (dirty_expire_interval).
> >
> > But that's all, you should be able to observe exactly the same writeback
> > behavior if you write less even without this patch. So I suspect that the
> > different writeback behavior is just triggering some bug in the NFS (either
> > > on the client or the server side). The NFS4ERR_BADSESSION error you're
> > getting back sounds like something times out somewhere, falls out of cache
> > and reports this error (which doesn't happen if we writeback after 5s
> > > instead of 30s). The NFS folks may have a better idea of what's going on here.
> >
> > > You could possibly work around this problem (and verify my theory) by tuning
> > /proc/sys/vm/dirty_expire_centisecs to a lower value (say 500). This will
> > make inode writeback start earlier and thus should effectively mask the
> > problem again.
>
> > Changing /proc/sys/vm/dirty_expire_centisecs did indeed prevent the
> > issue from occurring. As an experiment, I tried to find the highest
> > value I could use that still worked, and it was also 500. Even setting it to
> > 600 would cause it to error out eventually. This would indicate to me
> a server problem (which is unfortunate because that's much harder for
> me to debug), but perhaps the NFS folks could weigh in.
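
For anyone repeating the experiment above, the two writeback timers can be read back in seconds before and after tuning. This is only a minimal sketch: the helper names (centisecs_to_seconds, read_writeback_timers) are made up for illustration, and it assumes the standard Linux sysctl files dirty_writeback_centisecs and dirty_expire_centisecs under /proc/sys/vm, which hold plain integers in hundredths of a second.

```python
from pathlib import Path

# Assumed location of the VM sysctls on a Linux system.
VM_DIR = Path("/proc/sys/vm")


def centisecs_to_seconds(raw):
    """Convert the text of a *_centisecs sysctl file to seconds."""
    return int(raw.strip()) / 100.0


def read_writeback_timers(vm_dir=VM_DIR):
    """Return the periodic-writeback and dirty-expiry timers in seconds.

    vm_dir is a parameter so the function can be pointed at a copy of
    the files; the default reads the live kernel values.
    """
    names = ("dirty_writeback_centisecs", "dirty_expire_centisecs")
    return {
        name: centisecs_to_seconds((Path(vm_dir) / name).read_text())
        for name in names
    }
```

With the values discussed in this thread (500 and 3000), the result would be 5.0 and 30.0 seconds respectively. Writing a new value, e.g. 500 into dirty_expire_centisecs, needs root and is left out of the sketch.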

I figured out the problem. There was a bug in the NFS client where it
would not send state renewals within the first 5 minutes after
booting; prior to this change, that was masked in my test case because
the 5 second dirty writeback interval would keep the connection alive
without needing the state renewals (and my test always did a reboot).
I've submitted a patch to fix the NFS client to the mailing list [1].
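
The reproducer mentioned earlier in the thread was not posted, so the following is only a guess at its shape, based on the description (take a file lock over NFS, then write to the file once per second). The language, the function name write_under_lock, the line format, and the default of 60 iterations are all assumptions. It deliberately does not call fsync, so when the data reaches the server is left to the kernel's writeback timers.

```python
import fcntl
import time


def write_under_lock(path, iterations=60, interval=1.0):
    """Lock `path`, then append one line every `interval` seconds.

    Returns the number of writes that succeeded. An I/O error reported
    by the NFS client surfaces as OSError from write(), flush(), or the
    implicit close() at the end of the `with` block.
    """
    written = 0
    with open(path, "a") as f:
        # POSIX byte-range lock over the whole file.
        fcntl.lockf(f, fcntl.LOCK_EX)
        try:
            for i in range(iterations):
                f.write(f"tick {i}\n")
                # Push the data to the page cache only (no fsync).
                f.flush()
                written += 1
                time.sleep(interval)
        finally:
            fcntl.lockf(f, fcntl.LOCK_UN)
    return written
```

Run against a file on the affected NFS mount shortly after the client boots, something like write_under_lock("/mnt/nfs/testfile") would be expected to hit the error once writeback finally happens; the mount path here is a placeholder.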

Sorry for the noise, and thanks for your help.

[1]: https://lore.kernel.org/linux-nfs/20251008230935.738405-1-JPEWhacker@gmail.com/T/#u
>
> >
> >                                                                 Honza
> > --
> > Jan Kara <jack@...e.com>
> > SUSE Labs, CR