Message-ID: <Zgg9OzeFZUTc4hck@tissot.1015granger.net>
Date: Sat, 30 Mar 2024 12:26:35 -0400
From: Chuck Lever <chuck.lever@...cle.com>
To: Jan Schunk <scpcom@....de>
Cc: Benjamin Coddington <bcodding@...hat.com>,
Jeff Layton <jlayton@...nel.org>, Neil Brown <neilb@...e.de>,
Olga Kornievskaia <kolga@...app.com>, Dai Ngo <dai.ngo@...cle.com>,
Tom Talpey <tom@...pey.com>,
Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
David Howells <dhowells@...hat.com>
Subject: Re: Re: [External] : nfsd: memory leak when client does many file
operations
On Sat, Mar 30, 2024 at 04:26:09PM +0100, Jan Schunk wrote:
> Full test result:
>
> $ git bisect start v6.6 v6.5
> Bisecting: 7882 revisions left to test after this (roughly 13 steps)
> [a1c19328a160c80251868dbd80066dce23d07995] Merge tag 'soc-arm-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
> --
> $ git bisect good
> Bisecting: 3935 revisions left to test after this (roughly 12 steps)
> [e4f1b8202fb59c56a3de7642d50326923670513f] Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
> --
> $ git bisect bad
> Bisecting: 2014 revisions left to test after this (roughly 11 steps)
> [e0152e7481c6c63764d6ea8ee41af5cf9dfac5e9] Merge tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
> --
> $ git bisect bad
> Bisecting: 975 revisions left to test after this (roughly 10 steps)
> [4a3b1007eeb26b2bb7ae4d734cc8577463325165] Merge tag 'pinctrl-v6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
> --
> $ git bisect good
> Bisecting: 476 revisions left to test after this (roughly 9 steps)
> [4debf77169ee459c46ec70e13dc503bc25efd7d2] Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
> --
> $ git bisect good
> Bisecting: 237 revisions left to test after this (roughly 8 steps)
> [e7e9423db459423d3dcb367217553ad9ededadc9] Merge tag 'v6.6-vfs.super.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
> --
> $ git bisect good
> Bisecting: 141 revisions left to test after this (roughly 7 steps)
> [8ae5d298ef2005da5454fc1680f983e85d3e1622] Merge tag '6.6-rc-ksmbd-fixes-part1' of git://git.samba.org/ksmbd
> --
> $ git bisect good
> Bisecting: 61 revisions left to test after this (roughly 6 steps)
> [99d99825fc075fd24b60cc9cf0fb1e20b9c16b0f] Merge tag 'nfs-for-6.6-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
> --
> $ git bisect bad
> Bisecting: 39 revisions left to test after this (roughly 5 steps)
> [7b719e2bf342a59e88b2b6215b98ca4cf824bc58] SUNRPC: change svc_recv() to return void.
> --
> $ git bisect bad
> Bisecting: 19 revisions left to test after this (roughly 4 steps)
> [e7421ce71437ec8e4d69cc6bdf35b6853adc5050] NFSD: Rename struct svc_cacherep
> --
> $ git bisect good
> Bisecting: 9 revisions left to test after this (roughly 3 steps)
> [baabf59c24145612e4a975f459a5024389f13f5d] SUNRPC: Convert svc_udp_sendto() to use the per-socket bio_vec array
> --
> $ git bisect bad
> Bisecting: 4 revisions left to test after this (roughly 2 steps)
> [be2be5f7f4436442d8f6bffbb97a6f438df2896b] lockd: nlm_blocked list race fixes
> --
> $ git bisect good
> Bisecting: 2 revisions left to test after this (roughly 1 step)
> [d424797032c6e24b44037e6c7a2d32fd958300f0] nfsd: inherit required unset default acls from effective set
> --
> $ git bisect good
> Bisecting: 0 revisions left to test after this (roughly 1 step)
> [e18e157bb5c8c1cd8a9ba25acfdcf4f3035836f4] SUNRPC: Send RPC message on TCP with a single sock_sendmsg() call
> --
> $ git bisect bad
> Bisecting: 0 revisions left to test after this (roughly 0 steps)
> [2eb2b93581813b74c7174961126f6ec38eadb5a7] SUNRPC: Convert svc_tcp_sendmsg to use bio_vecs directly
> --
> $ git bisect good
> e18e157bb5c8c1cd8a9ba25acfdcf4f3035836f4 is the first bad commit
> commit e18e157bb5c8c1cd8a9ba25acfdcf4f3035836f4

This is a plausible bisect result for this behavior, so nice work.

David (cc'd), can you have a brief look at this? What did we miss?

I'm guessing it's a page reference count issue that might occur
only when the XDR head and tail buffers are in the same page. Or
it might occur if two entries in the XDR page array point to the
same page...?

/me stabs in the darkness

> I found the memory loss inside /proc/meminfo only on MemAvailable
> MemTotal: 346948 kB
> On a bad test run it looks like this:
> -MemAvailable: 210820 kB
> +MemAvailable: 26608 kB
> On a good test run it looks like this:
> -MemAvailable: 215872 kB
> +MemAvailable: 221128 kB
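
For anyone repeating the measurement, here is a minimal sketch of the
before/after comparison on MemAvailable. The helper names
(mem_available_kb, mem_available_delta_kb) are made up for illustration
and are not part of any tool used in this thread; the sample values are
the ones quoted above. On a live system, the two text snapshots would
presumably come from reading /proc/meminfo before and after the test run.

```python
import re


def mem_available_kb(meminfo_text):
    """Return the MemAvailable value (in kB) from /proc/meminfo-style text."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemAvailable not found in input")
    return int(match.group(1))


def mem_available_delta_kb(before_text, after_text):
    """Return after minus before; a large negative value suggests lost memory."""
    return mem_available_kb(after_text) - mem_available_kb(before_text)


if __name__ == "__main__":
    # Snapshots taken from the numbers reported in this thread.
    bad = mem_available_delta_kb("MemAvailable: 210820 kB\n",
                                 "MemAvailable: 26608 kB\n")
    good = mem_available_delta_kb("MemAvailable: 215872 kB\n",
                                  "MemAvailable: 221128 kB\n")
    print("bad run delta: ", bad, "kB")   # -184212 kB
    print("good run delta:", good, "kB")  # 5256 kB
```

With the reported numbers, the bad run loses about 180 MB of MemAvailable
out of a MemTotal of 346948 kB, while the good run ends slightly higher
than it started.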