linux-kernel - Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <e07cd398-4388-470a-b367-cecddaaf4b39@huaweicloud.com>
Date: Wed, 22 Jan 2025 10:21:25 +0800
From: Li Lingfeng <lilingfeng@...weicloud.com>
To: Li Lingfeng <lilingfeng3@...wei.com>, NeilBrown <neilb@...e.de>,
 Jeff Layton <jlayton@...nel.org>
Cc: Chuck Lever <chuck.lever@...cle.com>, okorniev@...hat.com,
 Dai.Ngo@...cle.com, tom@...pey.com, linux-nfs@...r.kernel.org,
 linux-kernel@...r.kernel.org, yukuai1@...weicloud.com, houtao1@...wei.com,
 yi.zhang@...wei.com, yangerkun@...wei.com
Subject: Re: [PATCH] nfsd: free nfsd_file by gc after adding it to lru list


在 2025/1/22 9:43, Li Lingfeng 写道:
>
> 在 2025/1/22 9:15, NeilBrown 写道:
>> On Wed, 22 Jan 2025, Jeff Layton wrote:
>>> On Wed, 2025-01-15 at 10:03 -0500, Chuck Lever wrote:
>>>> On 1/14/25 2:39 PM, Jeff Layton wrote:
>>>>> On Tue, 2025-01-14 at 14:27 -0500, Jeff Layton wrote:
>>>>>> On Mon, 2025-01-13 at 10:59 +0800, Li Lingfeng wrote:
>>>>>>> In nfsd_file_put, after inserting the nfsd_file into the 
>>>>>>> nfsd_file_lru
>>>>>>> list, gc may be triggered in another thread and immediately 
>>>>>>> release this
>>>>>>> nfsd_file, which will lead to a UAF when accessing this 
>>>>>>> nfsd_file again.
>>>>>>>
>>>>>>> All the places where unhash is done will also perform 
>>>>>>> lru_remove, so there
>>>>>>> is no need to do lru_remove separately here. After inserting the 
>>>>>>> nfsd_file
>>>>>>> into the nfsd_file_lru list, it can be released by relying on gc.
>>>>>>>
>>>>>>> Fixes: 4a0e73e635e3 ("NFSD: Leave open files out of the 
>>>>>>> filecache LRU")
>>>>>>> Signed-off-by: Li Lingfeng <lilingfeng3@...wei.com>
>>>>>>> ---
>>>>>>>    fs/nfsd/filecache.c | 12 ++----------
>>>>>>>    1 file changed, 2 insertions(+), 10 deletions(-)
>>>>>>>
>>>>>>> diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
>>>>>>> index a1cdba42c4fa..37b65cb1579a 100644
>>>>>>> --- a/fs/nfsd/filecache.c
>>>>>>> +++ b/fs/nfsd/filecache.c
>>>>>>> @@ -372,18 +372,10 @@ nfsd_file_put(struct nfsd_file *nf)
>>>>>>>            /* Try to add it to the LRU.  If that fails, 
>>>>>>> decrement. */
>>>>>>>            if (nfsd_file_lru_add(nf)) {
>>>>>>>                /* If it's still hashed, we're done */
>>>>>>> -            if (test_bit(NFSD_FILE_HASHED, &nf->nf_flags)) {
>>>>>>> +            if (list_lru_count(&nfsd_file_lru))
>>>>>>>                    nfsd_file_schedule_laundrette();
>>>>>>> -                return;
>>>>>>> -            }
>>>>>>>    -            /*
>>>>>>> -             * We're racing with unhashing, so try to remove it 
>>>>>>> from
>>>>>>> -             * the LRU. If removal fails, then someone else 
>>>>>>> already
>>>>>>> -             * has our reference.
>>>>>>> -             */
>>>>>>> -            if (!nfsd_file_lru_remove(nf))
>>>>>>> -                return;
>>>>>>> +            return;
>>>>>>>            }
>>>>>>>        }
>>>>>>>        if (refcount_dec_and_test(&nf->nf_ref))
>>>>>> I think this looks OK. Filecache bugs are particularly nasty 
>>>>>> though, so
>>>>>> let's run this through a nice long testing cycle.
>>>>>>
>>>>>> Reviewed-by: Jeff Layton <jlayton@...nel.org>
>>>>> Actually, I take it back. This is problematic in another way.
>>>>>
>>>>> In this case, we're racing with another task that is unhashing the
>>>>> object, but we've put it on the LRU ourselves. What guarantee do we
>>>>> have that the unhashing and removal from the LRU didn't occur before
>>>>> this task called nfsd_file_lru_add()? That's why we attempt to remove
>>>>> it here -- we can't rely on the task that unhashed it to do so at 
>>>>> that
>>>>> point.
>>>>>
>>>>> What might be best is to take and hold the rcu_read_lock() before 
>>>>> doing
>>>>> the nfsd_file_lru_add, and just release it after we do these racy
>>>>> checks. That should make it safe to access the object.
>>>>>
>>>>> Thoughts?
>>>> Holding the RCU read lock will keep the dereferences safe since
>>>> nfsd_file objects are freed only after an RCU grace period. But 
>>>> will the
>>>> logic in nfsd_file_put() work properly on totally dead nfsd_file
>>>> objects? I don't see a specific failure mode there, but I'm not very
>>>> imaginative.
>>>>
>>>> Overall, I think RCU would help.
>>>>
>>>>
>>> To be clear, I think we need to drop e57420be100ab from your nfsd-
>>> testing branch. The race I identified above is quite likely to occur
>>> and could lead to leaks.
>>>
>>> If Li Lingfeng doesn't propose a patch, I'll spin one up tomorrow. I
>>> think the RCU approach is safe.
>> I'm not convinced this is the right approach.
>> I cannot see how nfsd_file_put() can race with unhashing.  If it cannot
>> then we can simply unconditionally call nfsd_file_schedule_laundrette().
>>
>> Can describe how the race can happen - if indeed it can.
> At the beginning, our testing team discovered this issue on 5.10 using
> their own test cases (I'm not clear on how their test cases were
> constructed).
> Then, I found that there might be a problem here, so I reproduced the
> phenomenon in the above way.
> I will ask if their test cases can be shared externally or if they can
> describe what their test cases did.
They used three machines for the test: one server and two clients.
ClientA continuously creates and deletes a file, while ClientB uses fio to
perform random writes on this file.

I found the original stack trace (since I only have a screenshot, I can
only extract some important information from it):
list_del corruption. prev-next should be ffff000052be6828, but was 
ffff000052221e48
WARNING:CPU: 1 PID: 12224at lib/list_debug.c:95 
list_del_entry_valid+Oxe0/0x10
...
Call trace:
  list_del_entry_valid+Oxe0/0x10c
  list_lru_del+0xb4/0x19c
  nfsd_file_put+0x134/0xla0 [nfsd]
  nfsd_read+0xc0/0xldo [nfsd
  nfsd3_proc_read+0xl44/0xla0 [nfsd]
  nfsd_dispatch+0x190/0x254 [nfsd]
  svc_process_common+Ox52c/0x780 [sunrpc]
  svc_process+Ox88/0xe4 [sunrpc]
  nfsd+0xb4/0x160 [nfsd]
  kthread+0x108/0x134
  ret_from_fork+ох10/0xl8
---[ end trace f87713f92e49e854 ]---

refcount_t: underflow;use-after-free.
WARNING:CPU: 1 PID: 12220 at lib/refcount.c:28 
refcount_warn_saturate+0xf4/0xl44
...
Call trace:
  refcount_warn_saturate+0xf4/0x144
  nfsd_file_put+0x160/0xla0 [nfsd]
  nfsd_read+oxco/0xld0 [nfsd]
  nfsd3_proc_read+0xl44/0xla0 [nfsd]
  nfsd_dispatch+0x190/0x254 [nfsd]
  svc_process_common+Ox52c/0x780[sunrpc]
  svc_process+0x88/0xe4_ [sunrpc]
  nfsd+0xb4/0x160 [nfsd]
  kthread+0x108/0x134
  ret_from_fork+0x10/0x18
  ---[ end tracef87713f92e49e853 ]---

After applying this modification, this test case no longer encountered any
issues.
>> Note that we also need rcu protection in nfsd_file_lru_add() so that the
>> nf doesn't get freed after it is added the the lru and before the trace
>> point.  If we don't end up needing rcu round the call, we will need it
>> in the call.
>>
>> Thanks,
>> NeilBrown