lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <58a42dfb-38f8-4935-ac4d-4ec34c3c9504@bytedance.com>
Date:   Fri, 20 Oct 2023 23:19:20 +0800
From:   Qi Zheng <zhengqi.arch@...edance.com>
To:     Matthew Wilcox <willy@...radead.org>
Cc:     Petr Mladek <pmladek@...e.com>,
        Zhang Zhiyu <zhiyuzhang999@...il.com>,
        linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>, secalert@...hat.com
Subject: Re: KASAN: slab-use-after-free Read in radix_tree_lookup in&after
 Linux Kernel 6.4-rc6

Hi Matthew,

On 2023/10/20 22:58, Matthew Wilcox wrote:
> On Fri, Oct 20, 2023 at 09:51:18PM +0800, Qi Zheng wrote:
>> Hi all,
>>
>> On 2023/10/20 20:34, Matthew Wilcox wrote:
>>> On Fri, Oct 20, 2023 at 10:26:31AM +0200, Petr Mladek wrote:
>>>> Adding Matthew into Cc in the hope that he is still familiar with the
>>>> code. Also adding Andrew who accepts patches.
>>>
>>> oh joy.  i love dealing with cves.
>>>
>>>>>> I agree, this issue looks to be in kernel-core radix tree code in ./lib/radix-tree.c in two of any places.
>>>
>>> the radix tree code is the victim here.  maybe also the perpetrator, but
>>> it's rather hard to say.
>>>
>>> shrink_slab_memcg()
>>> 	down_read_trylock(&shrinker_rwsem)
>>> 	shrinker = idr_find(&shrinker_idr, i);
>>>
>>> i assume is the path to this bug.  the reporter didn't run the
>>> stacktrace through scripts/decode_stacktrace.sh so it's less useful than
>>> we might want.
>>>
>>> prealloc_memcg_shrinker() calls idr_alloc and idr_remove under
>>> shrinker_rwsem in write mode, so that should be fine.
>>>
>>> unregister_memcg_shrinker() calls idr_remove after asserting &shrinker_rwsem
>>> is held (although not asserting that it's held for write ... hmm ... but
>>> both callers appear to hold it for write anyway)
>>>
>>> so i don't see why we'd get a UAF here.
>>>
>>> anyway, adding Qi Zheng to the cc since they're responsible for the
>>> shrinker code.
>>
>> Thanks for CC'ing me, I'd be happy to troubleshoot any issues that may
>> be shrinker related.
>>
>> Between v6.4-rc1 and v6.4 versions, we briefly implemented lockless slab
>> shrink using the SRCU method. In these versions, we call idr_alloc and
>> idr_remove under shrinker_mutex, and idr_find under srcu_read_lock.
>>
>> These are all legitimate uses of the IDR APIs and the shrinker_idr
>> will never be destroyed, so at a quick glance I didn't see why it would
>> cause UAF here.
> 
> I'm not an expert on how all the RCU flavours interact, but I don't
> think that's safe.  The IDR (radix tree) will RCU-free nodes, but I
> don't think holding the srcu_read_lock is enough to prevent the nodes
> being freed.  

Oh, Indeed, I just saw the Documeentation/RCU/checklist.rst:

```
	If the updater uses call_rcu() or synchronize_rcu(), then
	the corresponding readers may use:  (1) rcu_read_lock() and
	rcu_read_unlock(), (2) any pair of primitives that disables
	and re-enables softirq, for example, rcu_read_lock_bh() and
	rcu_read_unlock_bh(), or (3) any pair of primitives that disables
	and re-enables preemption, for example, rcu_read_lock_sched() and
	rcu_read_unlock_sched().  If the updater uses synchronize_srcu()
	or call_srcu(), then the corresponding readers must use
	srcu_read_lock() and srcu_read_unlock(), and with the same
	srcu_struct.
```

> I think you'd need to take the rcu_read_lock() around
> the call to idr_find().
For latest RCU+refcount method, we call idr_find() under
rcu_read_lock(), so it's safe.

Thanks,
Qi

> 
>> Anyway I will keep working on this issue, and it would be nice if
>> there was a way to reproduce it.
> 
> So I think the CVE is inappropriately issued.  The SRCU code was added in
> v6.4-rc1 and removed before v6.4.  I don't think CVEs are appropriate for
> bugs which only existed in development kernels.  How do we revoke CVEs?

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ