[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <37fb186b-ffb6-44dc-a097-ec669079c801@oracle.com>
Date: Mon, 27 Jan 2025 09:03:38 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: Jeff Layton <jlayton@...nel.org>, NeilBrown <neilb@...e.de>
Cc: Olga Kornievskaia <okorniev@...hat.com>, Dai Ngo <Dai.Ngo@...cle.com>,
Tom Talpey <tom@...pey.com>, Salvatore Bonaccorso <carnil@...ian.org>,
linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nfsd: validate the nfsd_serv pointer before calling
svc_wake_up
On 1/27/25 8:39 AM, Chuck Lever wrote:
> On 1/27/25 8:32 AM, Jeff Layton wrote:
>> On Mon, 2025-01-27 at 08:22 -0500, Chuck Lever wrote:
>>> On 1/27/25 8:07 AM, Jeff Layton wrote:
>>>> On Mon, 2025-01-27 at 11:15 +1100, NeilBrown wrote:
>>>>> On Mon, 27 Jan 2025, Jeff Layton wrote:
>>>>>> On Mon, 2025-01-27 at 08:53 +1100, NeilBrown wrote:
>>>>>>> On Sun, 26 Jan 2025, Jeff Layton wrote:
>>>>>>>> On Sun, 2025-01-26 at 13:39 +1100, NeilBrown wrote:
>>>>>>>>> On Sun, 26 Jan 2025, Jeff Layton wrote:
>>>>>>>>>> nfsd_file_dispose_list_delayed can be called from the filecache
>>>>>>>>>> laundrette, which is shut down after the nfsd threads are shut
>>>>>>>>>> down and
>>>>>>>>>> the nfsd_serv pointer is cleared. If nn->nfsd_serv is NULL
>>>>>>>>>> then there
>>>>>>>>>> are no threads to wake.
>>>>>>>>>>
>>>>>>>>>> Ensure that the nn->nfsd_serv pointer is non-NULL before calling
>>>>>>>>>> svc_wake_up in nfsd_file_dispose_list_delayed. This is safe
>>>>>>>>>> since the
>>>>>>>>>> svc_serv is not freed until after the filecache laundrette is
>>>>>>>>>> cancelled.
>>>>>>>>>>
>>>>>>>>>> Fixes: ffb402596147 ("nfsd: Don't leave work of closing files
>>>>>>>>>> to a work queue")
>>>>>>>>>> Reported-by: Salvatore Bonaccorso <carnil@...ian.org>
>>>>>>>>>> Closes: https://lore.kernel.org/linux-
>>>>>>>>>> nfs/7d9f2a8aede4f7ca9935a47e1d405643220d7946.camel@...nel.org/
>>>>>>>>>> Signed-off-by: Jeff Layton <jlayton@...nel.org>
>>>>>>>>>> ---
>>>>>>>>>> This is only lightly tested, but I think it will fix the bug that
>>>>>>>>>> Salvatore reported.
>>>>>>>>>> ---
>>>>>>>>>> fs/nfsd/filecache.c | 11 ++++++++++-
>>>>>>>>>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>>>>>>>>> index
>>>>>>>>>> e91c164b5ea21507659904690533a19ca43b1b64..fb2a4469b7a3c077de2dd750f43239b4af6d37b0 100644
>>>>>>>>>> --- a/fs/nfsd/filecache.c
>>>>>>>>>> +++ b/fs/nfsd/filecache.c
>>>>>>>>>> @@ -445,11 +445,20 @@ nfsd_file_dispose_list_delayed(struct
>>>>>>>>>> list_head *dispose)
>>>>>>>>>> struct nfsd_file, nf_gc);
>>>>>>>>>> struct nfsd_net *nn = net_generic(nf->nf_net,
>>>>>>>>>> nfsd_net_id);
>>>>>>>>>> struct nfsd_fcache_disposal *l = nn->fcache_disposal;
>>>>>>>>>> + struct svc_serv *serv;
>>>>>>>>>> spin_lock(&l->lock);
>>>>>>>>>> list_move_tail(&nf->nf_gc, &l->freeme);
>>>>>>>>>> spin_unlock(&l->lock);
>>>>>>>>>> - svc_wake_up(nn->nfsd_serv);
>>>>>>>>>> +
>>>>>>>>>> + /*
>>>>>>>>>> + * The filecache laundrette is shut down after the
>>>>>>>>>> + * nn->nfsd_serv pointer is cleared, but before the
>>>>>>>>>> + * svc_serv is freed.
>>>>>>>>>> + */
>>>>>>>>>> + serv = nn->nfsd_serv;
>>>>>>>>>
>>>>>>>>> I wonder if this should be READ_ONCE() to tell the compiler
>>>>>>>>> that we
>>>>>>>>> could race with clearing nn->nfsd_serv. Would the comment
>>>>>>>>> still be
>>>>>>>>> needed?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I think we need a comment at least. The linkage between the
>>>>>>>> laundrette
>>>>>>>> and the nfsd_serv being set to NULL is very subtle. A READ_ONCE()
>>>>>>>> doesn't convey that well, and is unnecessary here.
>>>>>>>
>>>>>>> Why do you say "is unnecessary here" ?
>>>>>>> If the code were
>>>>>>> if (nn->nfsd_serv)
>>>>>>> svc_wake_up(nn->nfsd_serv);
>>>>>>> that would be wrong as nn->nfds_serv could be set to NULL between
>>>>>>> the
>>>>>>> two.
>>>>>>> And the C compile is allowed to load the value twice because the
>>>>>>> C memory
>>>>>>> model declares that would have the same effect.
>>>>>>> While I doubt it would actually change how the code is compiled,
>>>>>>> I think
>>>>>>> we should have READ_ONCE() here (and I've been wrong before about
>>>>>>> what
>>>>>>> the compiler will actually do).
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> It's unnecessary because the outcome of either case is acceptable.
>>>>>>
>>>>>> When racing with shutdown, either it's NULL and the laundrette won't
>>>>>> call svc_wake_up(), or it's non-NULL and it will. In the non-NULL
>>>>>> case,
>>>>>> the call to svc_wake_up() will be a no-op because the threads are
>>>>>> shut
>>>>>> down.
>>>>>>
>>>>>> The vastly common case in this code is that this pointer will be non-
>>>>>> NULL, because the server is running (i.e. not racing with
>>>>>> shutdown). I
>>>>>> don't see the need in making all of those accesses volatile.
>>>>>
>>>>> One of us is confused. I hope it isn't me.
>>>>>
>>>>
>>>> It's probably me. I think you have a much better understanding of
>>>> compiler design than I do. Still...
>>>>
>>>>> The hypothetical problem I see is that the C compiler could generate
>>>>> code to load the value "nn->nfsd_serv" twice. The first time it is
>>>>> not
>>>>> NULL, the second time it is NULL.
>>>>> The first is used for the test, the second is passed to svc_wake_up().
>>>>>
>>>>> Unlikely though this is, it is possible and READ_ONCE() is designed
>>>>> precisely to prevent this.
>>>>> To quote from include/asm-generic/rwonce.h it will
>>>>> "Prevent the compiler from merging or refetching reads"
>>>>>
>>>>> A "volatile" access does not add any cost (in this case). What it
>>>>> does
>>>>> is break any aliasing that the compile might have deduced.
>>>>> Even if the compiler thinks it has "nn->nfsd_serv" in a register, it
>>>>> won't think it has the result of READ_ONCE(nn->nfsd_serv) in that
>>>>> register.
>>>>> And if it needs the result of a previous READ_ONCE(nn->nfsd_serv) it
>>>>> won't decide that it can just read nn->nfsd_serv again. It MUST keep
>>>>> the result of READ_ONCE(nn->nfsd_serv) somewhere until it is not
>>>>> needed
>>>>> any more.
>>>>
>>>> I'm mainly just considering the resulting pointer. There are two
>>>> possible outcomes to the fetch of nn->nfsd_serv. Either it's a valid
>>>> pointer that points to the svc_serv, or it's NULL. The resulting code
>>>> can handle either case, so it doesn't seem like adding READ_ONCE() will
>>>> create any material difference here.
>>>>
>>>> Maybe I should ask it this way: What bad outcome could result if we
>>>> don't add READ_ONCE() here?
>>>
>>> Neil just described it. The compiler would generate two load operations,
>>> one for the test and one for the function call argument. The first load
>>> can retrieve a non-NULL address, and the second a NULL address.
>>>
>>> I agree a READ_ONCE() is necessary.
>>>
>>>
>>
>> Now I'm confused:
>>
>> struct svc_serv *serv;
>>
>> [...]
>>
>> /*
>> * The filecache laundrette is shut down after the
>> * nn->nfsd_serv pointer is cleared, but before the
>> * svc_serv is freed.
>> */
>> serv = nn->nfsd_serv;
>> if (serv)
>> svc_wake_up(serv);
>>
>> This code is explicitly asking to fetch nn->nfsd_serv into the serv
>> variable, and then is testing that copy of the pointer and passing it
>> into svc_wake_up().
>>
>> How is the compiler allowed to suddenly refetch a NULL pointer into
>> serv after testing that serv is non-NULL?
>
> There's nothing here that tells the compiler it is not allowed to
> optimize this into two separate fetches if it feels that is better
> code. READ_ONCE is what tells the compiler we do not want that re-
> organization /ever/.
>
>
Well, I think you can argue that even if the compiler does split this
code into two reads of nn->nfsd_serv, it is guaranteed that the read
value is always the same both times -- I guess that's that the comment
is arguing, yes?
I just wonder what will happen if that guarantee should happen to change
in the future.
--
Chuck Lever
Powered by blists - more mailing lists