Message-ID: <7b0ec3c4-77a1-49cf-aadf-7d393c750f8e@oracle.com>
Date: Tue, 17 Dec 2024 13:25:43 -0500
From: Chuck Lever <chuck.lever@...cle.com>
To: Li Lingfeng <lilingfeng3@...wei.com>, cve@...nel.org,
linux-kernel@...r.kernel.org, linux-cve-announce@...r.kernel.org,
"linux-nfs@...r.kernel.org" <linux-nfs@...r.kernel.org>
Cc: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Olga Kornievskaia <okorniev@...hat.com>,
Jeff Layton <jlayton@...nel.org>, NeilBrown <neilb@...e.de>,
yangerkun <yangerkun@...wei.com>, "zhangyi (F)" <yi.zhang@...wei.com>,
Hou Tao <houtao1@...wei.com>, "yukuai (C)" <yukuai3@...wei.com>,
"chengzhihao1@...wei.com" <chengzhihao1@...wei.com>,
ZhangXiaoxu <zhangxiaoxu5@...wei.com>
Subject: Re: CVE-2024-50106: nfsd: fix race between laundromat and free_stateid
On 12/17/24 10:30 AM, Li Lingfeng wrote:
> Hi,
> after analysis, we think that this issue is not introduced by commit
> 2d4a532d385f ("nfsd: ensure that clp->cl_revoked list is protected by
> clp->cl_lock") but by commit 83e733161fde ("nfsd: avoid race after
> unhash_delegation_locked()").
> Therefore, kernel versions earlier than 6.9 do not involve this issue.
A more practical question is: has anyone reproduced the reported crash
on a pre-v6.9 kernel?
I recall (dimly) that we knew that 8dd91e8d31fe ("nfsd: fix race between
laundromat and free_stateid") could not be cleanly applied before v6.9.
It was less clear at the time whether a more extensive LTS backport
would be required.
> // normal case 1 -- free deleg by delegreturn
> 1) OP_DELEGRETURN
> nfsd4_delegreturn
>   nfsd4_lookup_stateid
>   destroy_delegation
>     destroy_unhashed_deleg
>       nfs4_unlock_deleg_lease
>         vfs_setlease // unlock
>       nfs4_put_stid // put last refcount
>         idr_remove // remove from cl_stateids
>         s->sc_free // free deleg
>
> 2) OP_FREE_STATEID
> nfsd4_free_stateid
>   find_stateid_locked // can not find the deleg in cl_stateids
>
>
> // normal case 2 -- free deleg by laundromat
> nfs4_laundromat
>   state_expired
>   unhash_delegation_locked // set NFS4_REVOKED_DELEG_STID
>   list_add // add the deleg to reaplist
>   list_first_entry // get the deleg from reaplist
>   revoke_delegation
>     destroy_unhashed_deleg
>       nfs4_unlock_deleg_lease
>       nfs4_put_stid
>
>
> // abnormal case
> nfs4_laundromat
>   state_expired
>   unhash_delegation_locked
>     // set NFS4_REVOKED_DELEG_STID
>   list_add
>     // add the deleg to reaplist
>                               1) OP_DELEGRETURN
>                               nfsd4_delegreturn
>                               nfsd4_lookup_stateid
>                               nfsd4_stid_check_stateid_generation
>                               nfsd4_verify_open_stid
>                                 // check NFS4_REVOKED_DELEG_STID
>                                 // and return nfserr_deleg_revoked
>                               // skip destroy_delegation
>
>                               2) OP_FREE_STATEID
>                               nfsd4_free_stateid
>                                 // check NFS4_REVOKED_DELEG_STID
>                               list_del_init
>                                 // remove deleg from reaplist
>                               nfs4_put_stid
>                                 // free deleg
>   list_first_entry
>     // cannot get the deleg from reaplist
>
>
> Before commit 83e733161fde ("nfsd: avoid race after
> unhash_delegation_locked()"), nfs4_laundromat --> unhash_delegation_locked
> would not set NFS4_REVOKED_DELEG_STID for the deleg.
> So the description "it marks the delegation stid revoked" in the CVE fix
> patch does not hold true. And the OP_FREE_STATEID operation will not
> release the deleg.
>
> Thanks.
>
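The interleaving in the "abnormal case" above can be replayed with a small single-threaded model. This is only a sketch of the reasoning in the analysis, not kernel code: the `Deleg` class, its fields, and the `marks_revoked_on_unhash` switch are hypothetical names, and the switch stands in for whether unhash_delegation_locked sets NFS4_REVOKED_DELEG_STID, which is the behavior the analysis attributes to 83e733161fde.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Deleg:
    """Toy delegation; field names are illustrative, not kernel fields."""
    revoked: bool = False     # stands in for NFS4_REVOKED_DELEG_STID
    freed: bool = False       # set once the last reference is put
    lease_held: bool = True   # the file lease backing the delegation


def replay(marks_revoked_on_unhash: bool) -> Deleg:
    """Replay the 'abnormal case' ordering, one step at a time."""
    deleg = Deleg()
    reaplist = []

    # Laundromat, state lock held: unhash and queue for reaping.
    if marks_revoked_on_unhash:
        deleg.revoked = True
    reaplist.append(deleg)

    # State lock dropped. DELEGRETURN is rejected, then FREE_STATEID
    # arrives and acts only on a delegation that is marked revoked.
    if deleg.revoked:
        reaplist.remove(deleg)   # list_del_init
        deleg.freed = True       # nfs4_put_stid drops the last reference

    # Laundromat resumes and revokes whatever is still on reaplist.
    for d in reaplist:
        d.lease_held = False     # revoke_delegation releases the lease
    return deleg


after = replay(marks_revoked_on_unhash=True)
print(after.freed, after.lease_held)    # freed while the lease is still held

before = replay(marks_revoked_on_unhash=False)
print(before.freed, before.lease_held)  # not freed, lease released
```

In this model, a freed delegation with its lease still in place appears only when the flag is set before the state lock is dropped. That matches the claim that the race needs the revoked marking in unhash_delegation_locked. The model ignores locking, reference counts, and the cl_revoked list.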
> On 2024/11/6 1:10, Greg Kroah-Hartman wrote:
>> Description
>> ===========
>>
>> In the Linux kernel, the following vulnerability has been resolved:
>>
>> nfsd: fix race between laundromat and free_stateid
>>
>> There is a race between laundromat handling of revoked delegations
>> and a client sending free_stateid operation. Laundromat thread
>> finds that delegation has expired and needs to be revoked so it
>> marks the delegation stid revoked and it puts it on a reaper list
>> but then it unlocks the state lock and the actual delegation revocation
>> happens without the lock. Once the stid is marked revoked a racing
>> free_stateid processing thread does the following (1) it calls
>> list_del_init() which removes it from the reaper list and (2) frees
>> the delegation stid structure. The laundromat thread ends up not
>> calling the revoke_delegation() function for this particular delegation
>> but that means it will not release the lock lease that exists on
>> the file.
>>
>> Now, a new open for this file comes in and ends up finding that
>> lease list isn't empty and calls nfsd_breaker_owns_lease() which ends
>> up trying to dereference a freed delegation stateid, leading to the
>> following use-after-free KASAN warning:
>>
>> kernel: ==================================================================
>> kernel: BUG: KASAN: slab-use-after-free in nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
>> kernel: Read of size 8 at addr ffff0000e73cd0c8 by task nfsd/6205
>> kernel:
>> kernel: CPU: 2 UID: 0 PID: 6205 Comm: nfsd Kdump: loaded Not tainted 6.11.0-rc7+ #9
>> kernel: Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 2069.0.0.0.0 08/03/2024
>> kernel: Call trace:
>> kernel: dump_backtrace+0x98/0x120
>> kernel: show_stack+0x1c/0x30
>> kernel: dump_stack_lvl+0x80/0xe8
>> kernel: print_address_description.constprop.0+0x84/0x390
>> kernel: print_report+0xa4/0x268
>> kernel: kasan_report+0xb4/0xf8
>> kernel: __asan_report_load8_noabort+0x1c/0x28
>> kernel: nfsd_breaker_owns_lease+0x140/0x160 [nfsd]
>> kernel: nfsd_file_do_acquire+0xb3c/0x11d0 [nfsd]
>> kernel: nfsd_file_acquire_opened+0x84/0x110 [nfsd]
>> kernel: nfs4_get_vfs_file+0x634/0x958 [nfsd]
>> kernel: nfsd4_process_open2+0xa40/0x1a40 [nfsd]
>> kernel: nfsd4_open+0xa08/0xe80 [nfsd]
>> kernel: nfsd4_proc_compound+0xb8c/0x2130 [nfsd]
>> kernel: nfsd_dispatch+0x22c/0x718 [nfsd]
>> kernel: svc_process_common+0x8e8/0x1960 [sunrpc]
>> kernel: svc_process+0x3d4/0x7e0 [sunrpc]
>> kernel: svc_handle_xprt+0x828/0xe10 [sunrpc]
>> kernel: svc_recv+0x2cc/0x6a8 [sunrpc]
>> kernel: nfsd+0x270/0x400 [nfsd]
>> kernel: kthread+0x288/0x310
>> kernel: ret_from_fork+0x10/0x20
>>
>> This patch proposes a fix that's based on adding 2 new
>> stid sc_status values that help coordinate between the laundromat
>> and other operations (nfsd4_free_stateid() and nfsd4_delegreturn()).
>>
>> First, to make sure that once the stid is marked revoked it is not
>> removed by nfsd4_free_stateid(), the laundromat takes a reference
>> on the stateid. Then, to coordinate whether the stid has been put
>> on the cl_revoked list or we are processing FREE_STATEID and need to
>> make sure to remove it from the list, each side checks that state and
>> acts accordingly. If the laundromat has added it to the cl_revoked list
>> before the arrival of FREE_STATEID, then nfsd4_free_stateid() knows to
>> remove it from the list. If nfsd4_free_stateid() finds that the operation
>> arrived before the laundromat has placed it on the cl_revoked list, it
>> marks the state freed and then the laundromat will no longer add it to
>> the list.
>>
>> Also, when nfsd4_delegreturn() looks up the specified stid, it
>> needs to access stids that are marked removed or freeable. That means
>> the laundromat has started processing the stid but hasn't finished, and
>> this delegreturn needs to return nfserr_deleg_revoked and not
>> nfserr_bad_stateid. The latter will not trigger a FREE_STATEID, and the
>> lack of it will leave this stid on the cl_revoked list indefinitely.
>>
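The coordination described in the patch text above can be sketched as a toy state machine. This is a minimal model built only from that prose, not the actual patch: the `Stid` class, the flag names `on_revoked_list` and `freed_by_client`, and the simplified reference counting are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(eq=False)
class Stid:
    """Toy stateid; the two coordination flags are hypothetical names."""
    refcount: int = 1
    revoked: bool = False
    on_revoked_list: bool = False   # laundromat has queued it on cl_revoked
    freed_by_client: bool = False   # FREE_STATEID has already handled it
    lease_held: bool = True


def laundromat_mark(stid):
    # Done under the state lock: mark revoked and pin the stid.
    stid.revoked = True
    stid.refcount += 1              # extra reference for the unlocked window


def laundromat_revoke(cl_revoked, stid):
    assert stid.refcount > 0        # still alive thanks to the extra reference
    if not stid.freed_by_client:
        cl_revoked.append(stid)
        stid.on_revoked_list = True
    stid.lease_held = False         # the lease is released on every path
    stid.refcount -= 1


def free_stateid(cl_revoked, stid):
    if not stid.revoked:
        return
    if stid.on_revoked_list:
        cl_revoked.remove(stid)
        stid.on_revoked_list = False
    stid.freed_by_client = True
    stid.refcount -= 1


def run(free_first: bool):
    """Run both racing steps in the chosen order and return the end state."""
    stid, cl_revoked = Stid(), []
    laundromat_mark(stid)
    steps = [free_stateid, laundromat_revoke]
    if not free_first:
        steps.reverse()
    for step in steps:
        step(cl_revoked, stid)
    return stid, cl_revoked
```

Calling `run(True)` and `run(False)` covers the two orderings the description distinguishes. In this model both end with the lease released, the cl_revoked list empty, and the reference count at zero only after the laundromat has finished with the stid.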
>> The Linux kernel CVE team has assigned CVE-2024-50106 to this issue.
>>
>>
>> Affected and fixed versions
>> ===========================
>>
>> Issue introduced in 3.17 with commit 2d4a532d385f and fixed in
>> 6.11.6 with commit 967faa26f313
>> Issue introduced in 3.17 with commit 2d4a532d385f and fixed in
>> 6.12-rc5 with commit 8dd91e8d31fe
>>
>> Please see https://www.kernel.org for a full list of currently supported
>> kernel versions by the kernel community.
>>
>> Unaffected versions might change over time as fixes are backported to
>> older supported kernel versions. The official CVE entry at
>> https://cve.org/CVERecord/?id=CVE-2024-50106
>> will be updated if fixes are backported, please check that for the most
>> up to date information about this issue.
>>
>>
>> Affected files
>> ==============
>>
>> The file(s) affected by this issue are:
>> fs/nfsd/nfs4state.c
>> fs/nfsd/state.h
>>
>>
>> Mitigation
>> ==========
>>
>> The Linux kernel CVE team recommends that you update to the latest
>> stable kernel version for this, and many other bugfixes. Individual
>> changes are never tested alone, but rather are part of a larger kernel
>> release. Cherry-picking individual commits is not recommended or
>> supported by the Linux kernel community at all. If however, updating to
>> the latest release is impossible, the individual changes to resolve this
>> issue can be found at these commits:
>> https://git.kernel.org/stable/c/967faa26f313a62e7bebc55d5b8122eaee43b929
>> https://git.kernel.org/stable/c/8dd91e8d31febf4d9cca3ae1bb4771d33ae7ee5a
--
Chuck Lever