lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c70d155c-22e0-43f7-af23-241088317d94@oracle.com>
Date: Mon, 27 Jan 2025 08:22:58 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: Jeff Layton <jlayton@...nel.org>, NeilBrown <neilb@...e.de>
Cc: Olga Kornievskaia <okorniev@...hat.com>, Dai Ngo <Dai.Ngo@...cle.com>,
        Tom Talpey <tom@...pey.com>, Salvatore Bonaccorso <carnil@...ian.org>,
        linux-nfs@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nfsd: validate the nfsd_serv pointer before calling
 svc_wake_up

On 1/27/25 8:07 AM, Jeff Layton wrote:
> On Mon, 2025-01-27 at 11:15 +1100, NeilBrown wrote:
>> On Mon, 27 Jan 2025, Jeff Layton wrote:
>>> On Mon, 2025-01-27 at 08:53 +1100, NeilBrown wrote:
>>>> On Sun, 26 Jan 2025, Jeff Layton wrote:
>>>>> On Sun, 2025-01-26 at 13:39 +1100, NeilBrown wrote:
>>>>>> On Sun, 26 Jan 2025, Jeff Layton wrote:
>>>>>>> nfsd_file_dispose_list_delayed can be called from the filecache
>>>>>>> laundrette, which is shut down after the nfsd threads are shut down and
>>>>>>> the nfsd_serv pointer is cleared. If nn->nfsd_serv is NULL then there
>>>>>>> are no threads to wake.
>>>>>>>
>>>>>>> Ensure that the nn->nfsd_serv pointer is non-NULL before calling
>>>>>>> svc_wake_up in nfsd_file_dispose_list_delayed. This is safe since the
>>>>>>> svc_serv is not freed until after the filecache laundrette is cancelled.
>>>>>>>
>>>>>>> Fixes: ffb402596147 ("nfsd: Don't leave work of closing files to a work queue")
>>>>>>> Reported-by: Salvatore Bonaccorso <carnil@...ian.org>
>>>>>>> Closes: https://lore.kernel.org/linux-nfs/7d9f2a8aede4f7ca9935a47e1d405643220d7946.camel@kernel.org/
>>>>>>> Signed-off-by: Jeff Layton <jlayton@...nel.org>
>>>>>>> ---
>>>>>>> This is only lightly tested, but I think it will fix the bug that
>>>>>>> Salvatore reported.
>>>>>>> ---
>>>>>>>   fs/nfsd/filecache.c | 11 ++++++++++-
>>>>>>>   1 file changed, 10 insertions(+), 1 deletion(-)
>>>>>>>
>>>>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>>>>>> index e91c164b5ea21507659904690533a19ca43b1b64..fb2a4469b7a3c077de2dd750f43239b4af6d37b0 100644
>>>>>>> --- a/fs/nfsd/filecache.c
>>>>>>> +++ b/fs/nfsd/filecache.c
>>>>>>> @@ -445,11 +445,20 @@ nfsd_file_dispose_list_delayed(struct list_head *dispose)
>>>>>>>   						struct nfsd_file, nf_gc);
>>>>>>>   		struct nfsd_net *nn = net_generic(nf->nf_net, nfsd_net_id);
>>>>>>>   		struct nfsd_fcache_disposal *l = nn->fcache_disposal;
>>>>>>> +		struct svc_serv *serv;
>>>>>>>   
>>>>>>>   		spin_lock(&l->lock);
>>>>>>>   		list_move_tail(&nf->nf_gc, &l->freeme);
>>>>>>>   		spin_unlock(&l->lock);
>>>>>>> -		svc_wake_up(nn->nfsd_serv);
>>>>>>> +
>>>>>>> +		/*
>>>>>>> +		 * The filecache laundrette is shut down after the
>>>>>>> +		 * nn->nfsd_serv pointer is cleared, but before the
>>>>>>> +		 * svc_serv is freed.
>>>>>>> +		 */
>>>>>>> +		serv = nn->nfsd_serv;
>>>>>>
>>>>>> I wonder if this should be READ_ONCE() to tell the compiler that we
>>>>>> could race with clearing nn->nfsd_serv.  Would the comment still be
>>>>>> needed?
>>>>>>
>>>>>
>>>>> I think we need a comment at least. The linkage between the laundrette
>>>>> and the nfsd_serv being set to NULL is very subtle. A READ_ONCE()
>>>>> doesn't convey that well, and is unnecessary here.
>>>>
>>>> Why do you say "is unnecessary here" ?
>>>> If the code were
>>>>     if (nn->nfsd_serv)
>>>>              svc_wake_up(nn->nfsd_serv);
>>>> that would be wrong as nn->nfds_serv could be set to NULL between the
>>>> two.
>>>> And the C compile is allowed to load the value twice because the C memory
>>>> model declares that would have the same effect.
>>>> While I doubt it would actually change how the code is compiled, I think
>>>> we should have READ_ONCE() here (and I've been wrong before about what
>>>> the compiler will actually do).
>>>>
>>>>
>>>
>>> It's unnecessary because the outcome of either case is acceptable.
>>>
>>> When racing with shutdown, either it's NULL and the laundrette won't
>>> call svc_wake_up(), or it's non-NULL and it will. In the non-NULL case,
>>> the call to svc_wake_up() will be a no-op because the threads are shut
>>> down.
>>>
>>> The vastly common case in this code is that this pointer will be non-
>>> NULL, because the server is running (i.e. not racing with shutdown). I
>>> don't see the need in making all of those accesses volatile.
>>
>> One of us is confused.  I hope it isn't me.
>>
> 
> It's probably me. I think you have a much better understanding of
> compiler design than I do. Still...
> 
>> The hypothetical problem I see is that the C compiler could generate
>> code to load the value "nn->nfsd_serv" twice.  The first time it is not
>> NULL, the second time it is NULL.
>> The first is used for the test, the second is passed to svc_wake_up().
>>
>> Unlikely though this is, it is possible and READ_ONCE() is designed
>> precisely to prevent this.
>> To quote from include/asm-generic/rwonce.h it will
>>   "Prevent the compiler from merging or refetching reads"
>>
>> A "volatile" access does not add any cost (in this case).  What it does
>> is break any aliasing that the compile might have deduced.
>> Even if the compiler thinks it has "nn->nfsd_serv" in a register, it
>> won't think it has the result of READ_ONCE(nn->nfsd_serv) in that register.
>> And if it needs the result of a previous READ_ONCE(nn->nfsd_serv) it
>> won't decide that it can just read nn->nfsd_serv again.  It MUST keep
>> the result of READ_ONCE(nn->nfsd_serv) somewhere until it is not needed
>> any more.
> 
> I'm mainly just considering the resulting pointer. There are two
> possible outcomes to the fetch of nn->nfsd_serv. Either it's a valid
> pointer that points to the svc_serv, or it's NULL. The resulting code
> can handle either case, so it doesn't seem like adding READ_ONCE() will
> create any material difference here.
> 
> Maybe I should ask it this way: What bad outcome could result if we
> don't add READ_ONCE() here?

Neil just described it. The compiler would generate two load operations,
one for the test and one for the function call argument. The first load
can retrieve a non-NULL address, and the second a NULL address.

I agree a READ_ONCE() is necessary.


-- 
Chuck Lever

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ