[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1cd4d07f7afbd7322a1330a49a2cc24e8ff801cd.camel@kernel.org>
Date: Thu, 12 Jun 2025 12:15:05 -0400
From: Jeff Layton <jlayton@...nel.org>
To: Chuck Lever <chuck.lever@...cle.com>, Neil Brown <neilb@...e.de>, Olga
Kornievskaia <okorniev@...hat.com>, Dai Ngo <Dai.Ngo@...cle.com>, Tom
Talpey <tom@...pey.com>, Steven Rostedt <rostedt@...dmis.org>, Masami
Hiramatsu <mhiramat@...nel.org>, Mathieu Desnoyers
<mathieu.desnoyers@...icios.com>, Trond Myklebust <trondmy@...nel.org>,
Anna Schumaker <anna@...nel.org>, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>
Cc: Mike Snitzer <snitzer@...nel.org>, linux-nfs@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-trace-kernel@...r.kernel.org,
netdev@...r.kernel.org
Subject: Re: [PATCH 1/2] nfsd: use threads array as-is in netlink interface
On Thu, 2025-06-12 at 12:05 -0400, Chuck Lever wrote:
> On 6/12/25 11:57 AM, Jeff Layton wrote:
> > On Tue, 2025-05-27 at 20:12 -0400, Jeff Layton wrote:
> > > The old nfsdfs interface for starting a server with multiple pools
> > > handles the special case of a single entry array passed down from
> > > userland by distributing the threads over every NUMA node.
> > >
> > > The netlink control interface however constructs an array of length
> > > nfsd_nrpools() and fills any unprovided slots with 0's. This behavior
> > > defeats the special casing that the old interface relies on.
> > >
> > > Change nfsd_nl_threads_set_doit() to pass down the array from userland
> > > as-is.
> > >
> > > Fixes: 7f5c330b2620 ("nfsd: allow passing in array of thread counts via netlink")
> > > Reported-by: Mike Snitzer <snitzer@...nel.org>
> > > Closes: https://lore.kernel.org/linux-nfs/aDC-ftnzhJAlwqwh@kernel.org/
> > > Signed-off-by: Jeff Layton <jlayton@...nel.org>
> > > ---
> > > fs/nfsd/nfsctl.c | 5 ++---
> > > 1 file changed, 2 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > > index ac265d6fde35df4e02b955050f5b0ef22e6e519c..22101e08c3e80350668e94c395058bc228b08e64 100644
> > > --- a/fs/nfsd/nfsctl.c
> > > +++ b/fs/nfsd/nfsctl.c
> > > @@ -1611,7 +1611,7 @@ int nfsd_nl_rpc_status_get_dumpit(struct sk_buff *skb,
> > > */
> > > int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
> > > {
> > > - int *nthreads, count = 0, nrpools, i, ret = -EOPNOTSUPP, rem;
> > > + int *nthreads, nrpools = 0, i, ret = -EOPNOTSUPP, rem;
> > > struct net *net = genl_info_net(info);
> > > struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> > > const struct nlattr *attr;
> > > @@ -1623,12 +1623,11 @@ int nfsd_nl_threads_set_doit(struct sk_buff *skb, struct genl_info *info)
> > > /* count number of SERVER_THREADS values */
> > > nlmsg_for_each_attr(attr, info->nlhdr, GENL_HDRLEN, rem) {
> > > if (nla_type(attr) == NFSD_A_SERVER_THREADS)
> > > - count++;
> > > + nrpools++;
> > > }
> > >
> > > mutex_lock(&nfsd_mutex);
> > >
> > > - nrpools = max(count, nfsd_nrpools(net));
> > > nthreads = kcalloc(nrpools, sizeof(int), GFP_KERNEL);
> > > if (!nthreads) {
> > > ret = -ENOMEM;
> >
> > I noticed that this didn't go in to the recent merge window.
> >
> > This patch fixes a rather nasty regression when you try to start the
> > server on a NUMA-capable box.
>
> The NFSD netlink interface is not broadly used yet, is it?
>
It is. RHEL10 shipped with it, for instance and it's been in Fedora for
a while.
> Since this one came in late during the 6.16 dev cycle and the Fixes: tag
> references a commit that is already in released kernels, I put in the
> "next merge window" pile. On it's own it doesn't look urgent to me.
>
I'd really like to see this go in soon and to stable. If you want me to
respin the changelog, I can. It's not a crash, but it manifests as lost
RPCs that just hang. It took me quite a while to figure out what was
going on, and I'd prefer that we not put users through that.
>
> > It all looks like it works, but some RPCs
> > get silently dropped on the floor (if they happen to be received into a
> > node with no threads). It took me a while to track down the problem
> > after Mike reported it.
> >
> > Can we go ahead and pull this in and send it to stable?
> >
> > Also, did this patch fix the problem for you, Mike?
>
> I'll wait for confirmation.
--
Jeff Layton <jlayton@...nel.org>
Powered by blists - more mailing lists