[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58a42dfb-38f8-4935-ac4d-4ec34c3c9504@bytedance.com>
Date: Fri, 20 Oct 2023 23:19:20 +0800
From: Qi Zheng <zhengqi.arch@...edance.com>
To: Matthew Wilcox <willy@...radead.org>
Cc: Petr Mladek <pmladek@...e.com>,
Zhang Zhiyu <zhiyuzhang999@...il.com>,
linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>, secalert@...hat.com
Subject: Re: KASAN: slab-use-after-free Read in radix_tree_lookup in&after
Linux Kernel 6.4-rc6
Hi Matthew,
On 2023/10/20 22:58, Matthew Wilcox wrote:
> On Fri, Oct 20, 2023 at 09:51:18PM +0800, Qi Zheng wrote:
>> Hi all,
>>
>> On 2023/10/20 20:34, Matthew Wilcox wrote:
>>> On Fri, Oct 20, 2023 at 10:26:31AM +0200, Petr Mladek wrote:
>>>> Adding Matthew into Cc in the hope that he is still familiar with the
>>>> code. Also adding Andrew who accepts patches.
>>>
>>> oh joy. i love dealing with cves.
>>>
>>>>>> I agree, this issue looks to be in kernel-core radix tree code in ./lib/radix-tree.c in two of any places.
>>>
>>> the radix tree code is the victim here. maybe also the perpetrator, but
>>> it's rather hard to say.
>>>
>>> shrink_slab_memcg()
>>> down_read_trylock(&shrinker_rwsem)
>>> shrinker = idr_find(&shrinker_idr, i);
>>>
>>> i assume is the path to this bug. the reporter didn't run the
>>> stacktrace through scripts/decode_stacktrace.sh so it's less useful than
>>> we might want.
>>>
>>> prealloc_memcg_shrinker() calls idr_alloc and idr_remove under
>>> shrinker_rwsem in write mode, so that should be fine.
>>>
>>> unregister_memcg_shrinker() calls idr_remove after asserting &shrinker_rwsem
>>> is held (although not asserting that it's held for write ... hmm ... but
>>> both callers appear to hold it for write anyway)
>>>
>>> so i don't see why we'd get a UAF here.
>>>
>>> anyway, adding Qi Zheng to the cc since they're responsible for the
>>> shrinker code.
>>
>> Thanks for CC'ing me, I'd be happy to troubleshoot any issues that may
>> be shrinker related.
>>
>> Between v6.4-rc1 and v6.4 versions, we briefly implemented lockless slab
>> shrink using the SRCU method. In these versions, we call idr_alloc and
>> idr_remove under shrinker_mutex, and idr_find under srcu_read_lock.
>>
>> These are all legitimate uses of the IDR APIs and the shrinker_idr
>> will never be destroyed, so at a quick glance I didn't see why it would
>> cause UAF here.
>
> I'm not an expert on how all the RCU flavours interact, but I don't
> think that's safe. The IDR (radix tree) will RCU-free nodes, but I
> don't think holding the srcu_read_lock is enough to prevent the nodes
> being freed.
Oh, Indeed, I just saw the Documeentation/RCU/checklist.rst:
```
If the updater uses call_rcu() or synchronize_rcu(), then
the corresponding readers may use: (1) rcu_read_lock() and
rcu_read_unlock(), (2) any pair of primitives that disables
and re-enables softirq, for example, rcu_read_lock_bh() and
rcu_read_unlock_bh(), or (3) any pair of primitives that disables
and re-enables preemption, for example, rcu_read_lock_sched() and
rcu_read_unlock_sched(). If the updater uses synchronize_srcu()
or call_srcu(), then the corresponding readers must use
srcu_read_lock() and srcu_read_unlock(), and with the same
srcu_struct.
```
> I think you'd need to take the rcu_read_lock() around
> the call to idr_find().
For latest RCU+refcount method, we call idr_find() under
rcu_read_lock(), so it's safe.
Thanks,
Qi
>
>> Anyway I will keep working on this issue, and it would be nice if
>> there was a way to reproduce it.
>
> So I think the CVE is inappropriately issued. The SRCU code was added in
> v6.4-rc1 and removed before v6.4. I don't think CVEs are appropriate for
> bugs which only existed in development kernels. How do we revoke CVEs?
Powered by blists - more mailing lists