Message-ID: <0b5fdd56-d570-c787-cd56-7e6d0ba65225@molgen.mpg.de>
Date:   Tue, 2 Jul 2019 23:59:48 +0200
From:   Paul Menzel <pmenzel@...gen.mpg.de>
To:     "J. Bruce Fields" <bfields@...hat.com>,
        Jeff Layton <jlayton@...nel.org>
Cc:     Chris Tracy <ctracy@...r.scu.edu>, linux-nfs@...r.kernel.org,
        LKML <linux-kernel@...r.kernel.org>, it+linux-nfs@...gen.mpg.de
Subject: Regression caused by commit c54f24e3 (nfsd: fix performance-limiting
 session calculation)

Dear Bruce,


Could it be that commit c54f24e3 (nfsd: fix performance-limiting session 
calculation) causes a regression on big memory machines (1 TB)?

> From c54f24e338ed2a35218f117a4a1afb5f9e2b4e64 Mon Sep 17 00:00:00 2001
> From: "J. Bruce Fields" <bfields@...hat.com>
> Date: Thu, 21 Feb 2019 10:47:00 -0500
> Subject: [PATCH] nfsd: fix performance-limiting session calculation
> 
> We're unintentionally limiting the number of slots per nfsv4.1 session
> to 10.  Often more than 10 simultaneous RPCs are needed for the best
> performance.
> 
> This calculation was meant to prevent any one client from using up more
> than a third of the limit we set for total memory use across all clients
> and sessions.  Instead, it's limiting the client to a third of the
> maximum for a single session.
> 
> Fix this.
> 
> Reported-by: Chris Tracy <ctracy@...r.scu.edu>
> Cc: stable@...r.kernel.org
> Fixes: de766e570413 "nfsd: give out fewer session slots as limit approaches"
> Signed-off-by: J. Bruce Fields <bfields@...hat.com>
> ---
>  fs/nfsd/nfs4state.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index fb3c9844c82a..6a45fb00c5fc 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1544,16 +1544,16 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
>  {
>  	u32 slotsize = slot_bytes(ca);
>  	u32 num = ca->maxreqs;
> -	int avail;
> +	unsigned long avail, total_avail;
>  
>  	spin_lock(&nfsd_drc_lock);
> -	avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION,
> -		    nfsd_drc_max_mem - nfsd_drc_mem_used);
> +	total_avail = nfsd_drc_max_mem - nfsd_drc_mem_used;
> +	avail = min((unsigned long)NFSD_MAX_MEM_PER_SESSION, total_avail);
>  	/*
>  	 * Never use more than a third of the remaining memory,
>  	 * unless it's the only way to give this client a slot:
>  	 */
> -	avail = clamp_t(int, avail, slotsize, avail/3);
> +	avail = clamp_t(int, avail, slotsize, total_avail/3);
>  	num = min_t(int, num, avail / slotsize);
>  	nfsd_drc_mem_used += num * slotsize;
>  	spin_unlock(&nfsd_drc_lock);

Booting an 80-thread, 1 TB server with either Linux 4.19.56 or Linux
5.2-rc7 causes connection problems for the clients. The problems do not
occur on servers with only 96 GB of memory, for example. Bisecting
points to the two commits below (I can only continue tomorrow).

c54f24e338ed2a35218f117a4a1afb5f9e2b4e64 (nfsd: fix performance-limiting 
session calculation)
8127d82705998568b52ac724e28e00941538083d (NFS: Don't recoalesce on error 
in nfs_pageio_complete_mirror())

If there is anything I could do to verify this besides reverting the
commit tomorrow, please tell me. It’d be great if it could be fixed
before Linux 5.2 is released.


Kind regards,

Paul
