[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-id: <173801616176.22054.7568435634151185350@noble.neil.brown.name>
Date: Tue, 28 Jan 2025 09:16:01 +1100
From: "NeilBrown" <neilb@...e.de>
To: "Chuck Lever" <chuck.lever@...cle.com>
Cc: "Jeff Layton" <jlayton@...nel.org>,
"Olga Kornievskaia" <okorniev@...hat.com>, "Dai Ngo" <Dai.Ngo@...cle.com>,
"Tom Talpey" <tom@...pey.com>, "Salvatore Bonaccorso" <carnil@...ian.org>,
linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject:
Re: [PATCH] nfsd: validate the nfsd_serv pointer before calling svc_wake_up
On Tue, 28 Jan 2025, Chuck Lever wrote:
> On 1/27/25 8:39 AM, Chuck Lever wrote:
> > On 1/27/25 8:32 AM, Jeff Layton wrote:
> >> On Mon, 2025-01-27 at 08:22 -0500, Chuck Lever wrote:
> >>> On 1/27/25 8:07 AM, Jeff Layton wrote:
> >>>> On Mon, 2025-01-27 at 11:15 +1100, NeilBrown wrote:
> >>>>> On Mon, 27 Jan 2025, Jeff Layton wrote:
> >>>>>> On Mon, 2025-01-27 at 08:53 +1100, NeilBrown wrote:
> >>>>>>> On Sun, 26 Jan 2025, Jeff Layton wrote:
> >>>>>>>> On Sun, 2025-01-26 at 13:39 +1100, NeilBrown wrote:
> >>>>>>>>> On Sun, 26 Jan 2025, Jeff Layton wrote:
> >>>>>>>>>> nfsd_file_dispose_list_delayed can be called from the filecache
> >>>>>>>>>> laundrette, which is shut down after the nfsd threads are shut
> >>>>>>>>>> down and
> >>>>>>>>>> the nfsd_serv pointer is cleared. If nn->nfsd_serv is NULL
> >>>>>>>>>> then there
> >>>>>>>>>> are no threads to wake.
> >>>>>>>>>>
> >>>>>>>>>> Ensure that the nn->nfsd_serv pointer is non-NULL before calling
> >>>>>>>>>> svc_wake_up in nfsd_file_dispose_list_delayed. This is safe
> >>>>>>>>>> since the
> >>>>>>>>>> svc_serv is not freed until after the filecache laundrette is
> >>>>>>>>>> cancelled.
> >>>>>>>>>>
> >>>>>>>>>> Fixes: ffb402596147 ("nfsd: Don't leave work of closing files
> >>>>>>>>>> to a work queue")
> >>>>>>>>>> Reported-by: Salvatore Bonaccorso <carnil@...ian.org>
> >>>>>>>>>> Closes: https://lore.kernel.org/linux-
> >>>>>>>>>> nfs/7d9f2a8aede4f7ca9935a47e1d405643220d7946.camel@...nel.org/
> >>>>>>>>>> Signed-off-by: Jeff Layton <jlayton@...nel.org>
> >>>>>>>>>> ---
> >>>>>>>>>> This is only lightly tested, but I think it will fix the bug that
> >>>>>>>>>> Salvatore reported.
> >>>>>>>>>> ---
> >>>>>>>>>> fs/nfsd/filecache.c | 11 ++++++++++-
> >>>>>>>>>> 1 file changed, 10 insertions(+), 1 deletion(-)
> >>>>>>>>>>
> >>>>>>>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
> >>>>>>>>>> index
> >>>>>>>>>> e91c164b5ea21507659904690533a19ca43b1b64..fb2a4469b7a3c077de2dd750f43239b4af6d37b0 100644
> >>>>>>>>>> --- a/fs/nfsd/filecache.c
> >>>>>>>>>> +++ b/fs/nfsd/filecache.c
> >>>>>>>>>> @@ -445,11 +445,20 @@ nfsd_file_dispose_list_delayed(struct
> >>>>>>>>>> list_head *dispose)
> >>>>>>>>>> struct nfsd_file, nf_gc);
> >>>>>>>>>> struct nfsd_net *nn = net_generic(nf->nf_net,
> >>>>>>>>>> nfsd_net_id);
> >>>>>>>>>> struct nfsd_fcache_disposal *l = nn->fcache_disposal;
> >>>>>>>>>> + struct svc_serv *serv;
> >>>>>>>>>> spin_lock(&l->lock);
> >>>>>>>>>> list_move_tail(&nf->nf_gc, &l->freeme);
> >>>>>>>>>> spin_unlock(&l->lock);
> >>>>>>>>>> - svc_wake_up(nn->nfsd_serv);
> >>>>>>>>>> +
> >>>>>>>>>> + /*
> >>>>>>>>>> + * The filecache laundrette is shut down after the
> >>>>>>>>>> + * nn->nfsd_serv pointer is cleared, but before the
> >>>>>>>>>> + * svc_serv is freed.
> >>>>>>>>>> + */
> >>>>>>>>>> + serv = nn->nfsd_serv;
> >>>>>>>>>
> >>>>>>>>> I wonder if this should be READ_ONCE() to tell the compiler
> >>>>>>>>> that we
> >>>>>>>>> could race with clearing nn->nfsd_serv. Would the comment
> >>>>>>>>> still be
> >>>>>>>>> needed?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> I think we need a comment at least. The linkage between the
> >>>>>>>> laundrette
> >>>>>>>> and the nfsd_serv being set to NULL is very subtle. A READ_ONCE()
> >>>>>>>> doesn't convey that well, and is unnecessary here.
> >>>>>>>
> >>>>>>> Why do you say "is unnecessary here" ?
> >>>>>>> If the code were
> >>>>>>> if (nn->nfsd_serv)
> >>>>>>> svc_wake_up(nn->nfsd_serv);
> >>>>>>> that would be wrong as nn->nfds_serv could be set to NULL between
> >>>>>>> the
> >>>>>>> two.
> >>>>>>> And the C compile is allowed to load the value twice because the
> >>>>>>> C memory
> >>>>>>> model declares that would have the same effect.
> >>>>>>> While I doubt it would actually change how the code is compiled,
> >>>>>>> I think
> >>>>>>> we should have READ_ONCE() here (and I've been wrong before about
> >>>>>>> what
> >>>>>>> the compiler will actually do).
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> It's unnecessary because the outcome of either case is acceptable.
> >>>>>>
> >>>>>> When racing with shutdown, either it's NULL and the laundrette won't
> >>>>>> call svc_wake_up(), or it's non-NULL and it will. In the non-NULL
> >>>>>> case,
> >>>>>> the call to svc_wake_up() will be a no-op because the threads are
> >>>>>> shut
> >>>>>> down.
> >>>>>>
> >>>>>> The vastly common case in this code is that this pointer will be non-
> >>>>>> NULL, because the server is running (i.e. not racing with
> >>>>>> shutdown). I
> >>>>>> don't see the need in making all of those accesses volatile.
> >>>>>
> >>>>> One of us is confused. I hope it isn't me.
> >>>>>
> >>>>
> >>>> It's probably me. I think you have a much better understanding of
> >>>> compiler design than I do. Still...
> >>>>
> >>>>> The hypothetical problem I see is that the C compiler could generate
> >>>>> code to load the value "nn->nfsd_serv" twice. The first time it is
> >>>>> not
> >>>>> NULL, the second time it is NULL.
> >>>>> The first is used for the test, the second is passed to svc_wake_up().
> >>>>>
> >>>>> Unlikely though this is, it is possible and READ_ONCE() is designed
> >>>>> precisely to prevent this.
> >>>>> To quote from include/asm-generic/rwonce.h it will
> >>>>> "Prevent the compiler from merging or refetching reads"
> >>>>>
> >>>>> A "volatile" access does not add any cost (in this case). What it
> >>>>> does
> >>>>> is break any aliasing that the compile might have deduced.
> >>>>> Even if the compiler thinks it has "nn->nfsd_serv" in a register, it
> >>>>> won't think it has the result of READ_ONCE(nn->nfsd_serv) in that
> >>>>> register.
> >>>>> And if it needs the result of a previous READ_ONCE(nn->nfsd_serv) it
> >>>>> won't decide that it can just read nn->nfsd_serv again. It MUST keep
> >>>>> the result of READ_ONCE(nn->nfsd_serv) somewhere until it is not
> >>>>> needed
> >>>>> any more.
> >>>>
> >>>> I'm mainly just considering the resulting pointer. There are two
> >>>> possible outcomes to the fetch of nn->nfsd_serv. Either it's a valid
> >>>> pointer that points to the svc_serv, or it's NULL. The resulting code
> >>>> can handle either case, so it doesn't seem like adding READ_ONCE() will
> >>>> create any material difference here.
> >>>>
> >>>> Maybe I should ask it this way: What bad outcome could result if we
> >>>> don't add READ_ONCE() here?
> >>>
> >>> Neil just described it. The compiler would generate two load operations,
> >>> one for the test and one for the function call argument. The first load
> >>> can retrieve a non-NULL address, and the second a NULL address.
> >>>
> >>> I agree a READ_ONCE() is necessary.
> >>>
> >>>
> >>
> >> Now I'm confused:
> >>
> >> struct svc_serv *serv;
> >>
> >> [...]
> >>
> >> /*
> >> * The filecache laundrette is shut down after the
> >> * nn->nfsd_serv pointer is cleared, but before the
> >> * svc_serv is freed.
> >> */
> >> serv = nn->nfsd_serv;
> >> if (serv)
> >> svc_wake_up(serv);
> >>
> >> This code is explicitly asking to fetch nn->nfsd_serv into the serv
> >> variable, and then is testing that copy of the pointer and passing it
> >> into svc_wake_up().
> >>
> >> How is the compiler allowed to suddenly refetch a NULL pointer into
> >> serv after testing that serv is non-NULL?
> >
> > There's nothing here that tells the compiler it is not allowed to
> > optimize this into two separate fetches if it feels that is better
> > code. READ_ONCE is what tells the compiler we do not want that re-
> > organization /ever/.
> >
> >
>
> Well, I think you can argue that even if the compiler does split this
> code into two reads of nn->nfsd_serv, it is guaranteed that the read
> value is always the same both times -- I guess that's that the comment
> is arguing, yes?
I don't think that argument can stand. This code is run from the
filecache laundrette which can still be active when nn->nfsd_serv is set
to NULL. That is what the comment says: it is only shutdown *after*
->nfsd_serv is cleared.
NeilBrown
Powered by blists - more mailing lists