Message-ID: <trinity-00b2c0aa-3284-4d74-8184-71b5374bd8d7-1711992900521@msvc-mesg-gmx021>
Date: Mon, 1 Apr 2024 19:35:00 +0200
From: Jan Schunk <scpcom@....de>
To: Chuck Lever III <chuck.lever@...cle.com>
Cc: Benjamin Coddington <bcodding@...hat.com>, Jeff Layton
 <jlayton@...nel.org>, Neil Brown <neilb@...e.de>, Olga Kornievskaia
 <kolga@...app.com>, Dai Ngo <dai.ngo@...cle.com>, Tom Talpey
 <tom@...pey.com>, Linux NFS Mailing List <linux-nfs@...r.kernel.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, David
 Howells <dhowells@...hat.com>, Linux regressions mailing list
 <regressions@...ts.linux.dev>
Subject: Re: Re: [External] : nfsd: memory leak when client does many file
 operations

Hi,
the bug report is now here:
https://bugzilla.kernel.org/show_bug.cgi?id=218671

PS: I can also confirm that with the latest v6.6.22 and only e18e157bb5c8 reverted, nfsd works without any issue.

> Sent: Monday, 01.04.2024 at 16:08
> From: "Chuck Lever III" <chuck.lever@...cle.com>
> To: "Jan Schunk" <scpcom@....de>
> Cc: "Benjamin Coddington" <bcodding@...hat.com>, "Jeff Layton" <jlayton@...nel.org>, "Neil Brown" <neilb@...e.de>, "Olga Kornievskaia" <kolga@...app.com>, "Dai Ngo" <dai.ngo@...cle.com>, "Tom Talpey" <tom@...pey.com>, "Linux NFS Mailing List" <linux-nfs@...r.kernel.org>, "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, "David Howells" <dhowells@...hat.com>, "Linux regressions mailing list" <regressions@...ts.linux.dev>
> Subject: Re: [External] : nfsd: memory leak when client does many file operations
> 
> 
> 
> > On Mar 30, 2024, at 12:26 PM, Chuck Lever <chuck.lever@...cle.com> wrote:
> > 
> > On Sat, Mar 30, 2024 at 04:26:09PM +0100, Jan Schunk wrote:
> >> Full test result:
> >> 
> >> $ git bisect start v6.6 v6.5
> >> Bisecting: 7882 revisions left to test after this (roughly 13 steps)
> >> [a1c19328a160c80251868dbd80066dce23d07995] Merge tag 'soc-arm-6.6' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
> >> --
> >> $ git bisect good
> >> Bisecting: 3935 revisions left to test after this (roughly 12 steps)
> >> [e4f1b8202fb59c56a3de7642d50326923670513f] Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
> >> --
> >> $ git bisect bad
> >> Bisecting: 2014 revisions left to test after this (roughly 11 steps)
> >> [e0152e7481c6c63764d6ea8ee41af5cf9dfac5e9] Merge tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
> >> --
> >> $ git bisect bad
> >> Bisecting: 975 revisions left to test after this (roughly 10 steps)
> >> [4a3b1007eeb26b2bb7ae4d734cc8577463325165] Merge tag 'pinctrl-v6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
> >> --
> >> $ git bisect good
> >> Bisecting: 476 revisions left to test after this (roughly 9 steps)
> >> [4debf77169ee459c46ec70e13dc503bc25efd7d2] Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd
> >> --
> >> $ git bisect good
> >> Bisecting: 237 revisions left to test after this (roughly 8 steps)
> >> [e7e9423db459423d3dcb367217553ad9ededadc9] Merge tag 'v6.6-vfs.super.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
> >> --
> >> $ git bisect good
> >> Bisecting: 141 revisions left to test after this (roughly 7 steps)
> >> [8ae5d298ef2005da5454fc1680f983e85d3e1622] Merge tag '6.6-rc-ksmbd-fixes-part1' of git://git.samba.org/ksmbd
> >> --
> >> $ git bisect good
> >> Bisecting: 61 revisions left to test after this (roughly 6 steps)
> >> [99d99825fc075fd24b60cc9cf0fb1e20b9c16b0f] Merge tag 'nfs-for-6.6-1' of git://git.linux-nfs.org/projects/anna/linux-nfs
> >> --
> >> $ git bisect bad
> >> Bisecting: 39 revisions left to test after this (roughly 5 steps)
> >> [7b719e2bf342a59e88b2b6215b98ca4cf824bc58] SUNRPC: change svc_recv() to return void.
> >> --
> >> $ git bisect bad
> >> Bisecting: 19 revisions left to test after this (roughly 4 steps)
> >> [e7421ce71437ec8e4d69cc6bdf35b6853adc5050] NFSD: Rename struct svc_cacherep
> >> --
> >> $ git bisect good
> >> Bisecting: 9 revisions left to test after this (roughly 3 steps)
> >> [baabf59c24145612e4a975f459a5024389f13f5d] SUNRPC: Convert svc_udp_sendto() to use the per-socket bio_vec array
> >> --
> >> $ git bisect bad
> >> Bisecting: 4 revisions left to test after this (roughly 2 steps)
> >> [be2be5f7f4436442d8f6bffbb97a6f438df2896b] lockd: nlm_blocked list race fixes
> >> --
> >> $ git bisect good
> >> Bisecting: 2 revisions left to test after this (roughly 1 step)
> >> [d424797032c6e24b44037e6c7a2d32fd958300f0] nfsd: inherit required unset default acls from effective set
> >> --
> >> $ git bisect good
> >> Bisecting: 0 revisions left to test after this (roughly 1 step)
> >> [e18e157bb5c8c1cd8a9ba25acfdcf4f3035836f4] SUNRPC: Send RPC message on TCP with a single sock_sendmsg() call
> >> --
> >> $ git bisect bad
> >> Bisecting: 0 revisions left to test after this (roughly 0 steps)
> >> [2eb2b93581813b74c7174961126f6ec38eadb5a7] SUNRPC: Convert svc_tcp_sendmsg to use bio_vecs directly
> >> --
> >> $ git bisect good
> >> e18e157bb5c8c1cd8a9ba25acfdcf4f3035836f4 is the first bad commit
> >> commit e18e157bb5c8c1cd8a9ba25acfdcf4f3035836f4
> > 
> > This is a plausible bisect result for this behavior, so nice work.
> > 
> > David (cc'd), can you have a brief look at this? What did we miss?
> > I'm guessing it's a page reference count issue that might occur
> > only when the XDR head and tail buffers are in the same page. Or
> > it might occur if two entries in the XDR page array point to the
> > same page...?
> > 
> > /me stabs in the darkness
> > 
> > 
> >> The memory loss shows up in /proc/meminfo only in MemAvailable:
> >> MemTotal:         346948 kB
> >> On a bad test run it looks like this:
> >> -MemAvailable:     210820 kB
> >> +MemAvailable:      26608 kB
> >> On a good test run it looks like this:
> >> -MemAvailable:     215872 kB
> >> +MemAvailable:     221128 kB
> 
> Jan, may I ask one more favor? Since this might take a little
> time to run down, can you open a bug report on
> bugzilla.kernel.org, and copy in the symptomology and the
> bisect results? It will get assigned to Trond, and he can
> pass it to me.
> 
> The problem looks like how we're using a page_frag_cache to
> handle the record marker buffers, but I'm not sure what the
> proper solution is yet.
> 
> #regzbot ^introduced: e18e157bb5c8
> 
> --
> Chuck Lever
> 
>
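
For anyone who wants to repeat the good/bad check quoted above, here is a minimal sketch of the MemAvailable comparison. It assumes you capture the contents of /proc/meminfo before and after a test run. The function names are made up for illustration and are not part of any tool mentioned in this thread.

```python
def mem_available_kb(meminfo_text: str) -> int:
    """Return the MemAvailable value (in kB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            # Line format: "MemAvailable:     210820 kB"
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


def mem_available_delta_kb(before: str, after: str) -> int:
    """Negative result: less memory available after the run than before."""
    return mem_available_kb(after) - mem_available_kb(before)


# Values taken from the test runs quoted in this thread.
bad_delta = mem_available_delta_kb(
    "MemAvailable:     210820 kB", "MemAvailable:      26608 kB"
)
good_delta = mem_available_delta_kb(
    "MemAvailable:     215872 kB", "MemAvailable:     221128 kB"
)
print(bad_delta, good_delta)
```

On a live system the two snapshots could come from reading /proc/meminfo before and after the client workload. With the quoted numbers, the bad run loses about 180 MB of MemAvailable out of roughly 339 MB MemTotal, while the good run stays about level.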
