lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <38f1974c-f487-49b0-9447-74ed2db6ca7e@oracle.com>
Date: Thu, 12 Jun 2025 12:42:32 -0400
From: Chuck Lever <chuck.lever@...cle.com>
To: Jeff Layton <jlayton@...nel.org>, Neil Brown <neilb@...e.de>,
        Olga Kornievskaia <okorniev@...hat.com>, Dai Ngo <Dai.Ngo@...cle.com>,
        Tom Talpey <tom@...pey.com>, Steven Rostedt <rostedt@...dmis.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Trond Myklebust <trondmy@...nel.org>, Anna Schumaker <anna@...nel.org>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>
Cc: Mike Snitzer <snitzer@...nel.org>, linux-nfs@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
        netdev@...r.kernel.org
Subject: Re: [PATCH 1/2] nfsd: use threads array as-is in netlink interface

On 6/12/25 12:15 PM, Jeff Layton wrote:
> On Thu, 2025-06-12 at 12:05 -0400, Chuck Lever wrote:
>> On 6/12/25 11:57 AM, Jeff Layton wrote:
>>> On Tue, 2025-05-27 at 20:12 -0400, Jeff Layton wrote:
>>>> The old nfsdfs interface for starting a server with multiple pools
>>>> handles the special case of a single entry array passed down from
>>>> userland by distributing the threads over every NUMA node.
>>>>
>>>> The netlink control interface however constructs an array of length
>>>> nfsd_nrpools() and fills any unprovided slots with 0's. This behavior
>>>> defeats the special casing that the old interface relies on.
>>>>
>>>> Change nfsd_nl_threads_set_doit() to pass down the array from userland
>>>> as-is.
>>>>
>>>> Fixes: 7f5c330b2620 ("nfsd: allow passing in array of thread counts via netlink")
>>>> Reported-by: Mike Snitzer <snitzer@...nel.org>
>>>> Closes: https://lore.kernel.org/linux-nfs/aDC-ftnzhJAlwqwh@kernel.org/
>>>> Signed-off-by: Jeff Layton <jlayton@...nel.org>
>>>> ---
>>>>  fs/nfsd/nfsctl.c | 5 ++---
>>>>  1 file changed, 2 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
>>>> index ac265d6fde35df4e02b955050f5b0ef22e6e519c..22101e08c3e80350668e94c395058bc228b08e64 100644
>>>> --- a/fs/nfsd/nfsctl.c
>>>> +++ b/fs/nfsd/nfsctl.c
>>>> @@ -1611,7 +1611,7 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
>>>>   */
>>>>  int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
>>>>  {
>>>> -	int *nthreads, count = 0, nrpools, i, ret = -EOPNOTSUPP, rem;
>>>> +	int *nthreads, nrpools = 0, i, ret = -EOPNOTSUPP, rem;
>>>>  	struct net *net = genl_info_net(info);
>>>>  	struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>>>>  	const struct nlattr *attr;
>>>> @@ -1623,12 +1623,11 @@ int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
>>>>  	/* count number of SERVER_THREADS values */
>>>>  	nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
>>>>  		if (nla_type(attr) == NFSD_A_SERVER_THREADS)
>>>> -			count++;
>>>> +			nrpools++;
>>>>  	}
>>>>  
>>>>  	mutex_lock(&nfsd_mutex);
>>>>  
>>>> -	nrpools = max(count, nfsd_nrpools(net));
>>>>  	nthreads = kcalloc(nrpools, sizeof(int), GFP_KERNEL);
>>>>  	if (!nthreads) {
>>>>  		ret = -ENOMEM;
>>>
>>> I noticed that this didn't go in to the recent merge window.
>>>
>>> This patch fixes a rather nasty regression when you try to start the
>>> server on a NUMA-capable box.
>>
>> The NFSD netlink interface is not broadly used yet, is it?
>>
> 
> It is. RHEL10 shipped with it, for instance and it's been in Fedora for
> a while.

RHEL 10 is shiny and new, and Fedora is bleeding edge. It's not likely
either of these are deployed in production environments yet. Just sayin
that in this case, the Bayesian filter leans towards waiting a full dev
cycle.


>> Since this one came in late during the 6.16 dev cycle and the Fixes: tag
>> references a commit that is already in released kernels, I put in the
>> "next merge window" pile. On it's own it doesn't look urgent to me.
>>
> 
> I'd really like to see this go in soon and to stable. If you want me to
> respin the changelog, I can. It's not a crash, but it manifests as lost
> RPCs that just hang. It took me quite a while to figure out what was
> going on, and I'd prefer that we not put users through that.

If someone can confirm that it is effective, I'll add it to nfsd-fixes.


-- 
Chuck Lever

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ