[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <38f1974c-f487-49b0-9447-74ed2db6ca7e@oracle.com>
Date: Thu, 12 Jun 2025 12:42:32 -0400
From: Chuck Lever <chuck.lever@...cle.com>
To: Jeff Layton <jlayton@...nel.org>, Neil Brown <neilb@...e.de>,
Olga Kornievskaia <okorniev@...hat.com>, Dai Ngo <Dai.Ngo@...cle.com>,
Tom Talpey <tom@...pey.com>, Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Trond Myklebust <trondmy@...nel.org>, Anna Schumaker <anna@...nel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>
Cc: Mike Snitzer <snitzer@...nel.org>, linux-nfs@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH 1/2] nfsd: use threads array as-is in netlink interface
On 6/12/25 12:15 PM, Jeff Layton wrote:
> On Thu, 2025-06-12 at 12:05 -0400, Chuck Lever wrote:
>> On 6/12/25 11:57 AM, Jeff Layton wrote:
>>> On Tue, 2025-05-27 at 20:12 -0400, Jeff Layton wrote:
>>>> The old nfsdfs interface for starting a server with multiple pools
>>>> handles the special case of a single entry array passed down from
>>>> userland by distributing the threads over every NUMA node.
>>>>
>>>> The netlink control interface however constructs an array of length
>>>> nfsd_nrpools() and fills any unprovided slots with 0's. This behavior
>>>> defeats the special casing that the old interface relies on.
>>>>
>>>> Change nfsd_nl_threads_set_doit() to pass down the array from userland
>>>> as-is.
>>>>
>>>> Fixes: 7f5c330b2620 ("nfsd: allow passing in array of thread counts via netlink")
>>>> Reported-by: Mike Snitzer <snitzer@...nel.org>
>>>> Closes: https://lore.kernel.org/linux-nfs/aDC-ftnzhJAlwqwh@kernel.org/
>>>> Signed-off-by: Jeff Layton <jlayton@...nel.org>
>>>> ---
>>>> fs/nfsd/nfsctl.c | 5 ++---
>>>> 1 file changed, 2 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
>>>> index ac265d6fde35df4e02b955050f5b0ef22e6e519c..22101e08c3e80350668e94c395058bc228b08e64 100644
>>>> --- a/fs/nfsd/nfsctl.c
>>>> +++ b/fs/nfsd/nfsctl.c
>>>> @@ -1611,7 +1611,7 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
>>>> */
>>>> int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
>>>> {
>>>> - int *nthreads, count = 0, nrpools, i, ret = -EOPNOTSUPP, rem;
>>>> + int *nthreads, nrpools = 0, i, ret = -EOPNOTSUPP, rem;
>>>> struct net *net = genl_info_net(info);
>>>> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>>>> const struct nlattr *attr;
>>>> @@ -1623,12 +1623,11 @@ int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
>>>> /* count number of SERVER_THREADS values */
>>>> nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
>>>> if (nla_type(attr) == NFSD_A_SERVER_THREADS)
>>>> - count++;
>>>> + nrpools++;
>>>> }
>>>>
>>>> mutex_lock(&nfsd_mutex);
>>>>
>>>> - nrpools = max(count, nfsd_nrpools(net));
>>>> nthreads = kcalloc(nrpools, sizeof(int), GFP_KERNEL);
>>>> if (!nthreads) {
>>>> ret = -ENOMEM;
>>>
>>> I noticed that this didn't go in to the recent merge window.
>>>
>>> This patch fixes a rather nasty regression when you try to start the
>>> server on a NUMA-capable box.
>>
>> The NFSD netlink interface is not broadly used yet, is it?
>>
>
> It is. RHEL10 shipped with it, for instance and it's been in Fedora for
> a while.
RHEL 10 is shiny and new, and Fedora is bleeding edge. It's not likely
either of these are deployed in production environments yet. Just sayin
that in this case, the Bayesian filter leans towards waiting a full dev
cycle.
>> Since this one came in late during the 6.16 dev cycle and the Fixes: tag
>> references a commit that is already in released kernels, I put in the
>> "next merge window" pile. On it's own it doesn't look urgent to me.
>>
>
> I'd really like to see this go in soon and to stable. If you want me to
> respin the changelog, I can. It's not a crash, but it manifests as lost
> RPCs that just hang. It took me quite a while to figure out what was
> going on, and I'd prefer that we not put users through that.
If someone can confirm that it is effective, I'll add it to nfsd-fixes.
--
Chuck Lever
Powered by blists - more mailing lists