Message-ID: <5305c0d1-c7eb-4c79-96ae-67375f6248f1@huawei.com>
Date: Mon, 26 May 2025 22:47:36 +0800
From: "dongchenchen (A)" <dongchenchen2@...wei.com>
To: Mina Almasry <almasrymina@...gle.com>, Yunsheng Lin
	<linyunsheng@...wei.com>
CC: <hawk@...nel.org>, <ilias.apalodimas@...aro.org>, <davem@...emloft.net>,
	<edumazet@...gle.com>, <kuba@...nel.org>, <pabeni@...hat.com>,
	<horms@...nel.org>, <netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
	<zhangchangzhong@...wei.com>,
	<syzbot+204a4382fcb3311f3858@...kaller.appspotmail.com>
Subject: Re: [PATCH net] page_pool: Fix use-after-free in
 page_pool_recycle_in_ring


> On Fri, May 23, 2025 at 1:31 AM Yunsheng Lin <linyunsheng@...wei.com> wrote:
>> On 2025/5/23 14:45, Dong Chenchen wrote:
>>
>>>   static bool page_pool_recycle_in_ring(struct page_pool *pool, netmem_ref netmem)
>>>   {
>>> +     bool in_softirq;
>>>        int ret;
>> int -> bool?
>>
>>>        /* BH protection not needed if current is softirq */
>>> -     if (in_softirq())
>>> -             ret = ptr_ring_produce(&pool->ring, (__force void *)netmem);
>>> -     else
>>> -             ret = ptr_ring_produce_bh(&pool->ring, (__force void *)netmem);
>>> -
>>> -     if (!ret) {
>>> +     in_softirq = page_pool_producer_lock(pool);
>>> +     ret = !__ptr_ring_produce(&pool->ring, (__force void *)netmem);
>>> +     if (ret)
>>>                recycle_stat_inc(pool, ring);
>>> -             return true;
>>> -     }
>>> +     page_pool_producer_unlock(pool, in_softirq);
>>>
>>> -     return false;
>>> +     return ret;
>>>   }
>>>
>>>   /* Only allow direct recycling in special circumstances, into the
>>> @@ -1091,10 +1088,14 @@ static void page_pool_scrub(struct page_pool *pool)
>>>
>>>   static int page_pool_release(struct page_pool *pool)
>>>   {
>>> +     bool in_softirq;
>>>        int inflight;
>>>
>>>        page_pool_scrub(pool);
>>>        inflight = page_pool_inflight(pool, true);
>>> +     /* Acquire producer lock to make sure producers have exited. */
>>> +     in_softirq = page_pool_producer_lock(pool);
>>> +     page_pool_producer_unlock(pool, in_softirq);
>> Is a compiler barrier needed to ensure compiler doesn't optimize away
>> the above code?
>>
> I don't want to derail this conversation too much, and I suggested a
> similar fix to this initially, but now I'm not sure I understand why
> it works.
>
> Why is the existing barrier not working and acquiring/releasing the
> producer lock fixes this issue instead? The existing barrier is the
> producer thread incrementing pool->pages_state_release_cnt, and
> page_pool_release() is supposed to block the freeing of the page_pool
> until it sees the
> `atomic_inc_return_relaxed(&pool->pages_state_release_cnt);` from the
> producer thread. Any idea why this barrier is not working? AFAIU it
> should do the exact same thing as acquiring/dropping the producer
> lock.

Hi Mina,

As previously mentioned, the race looks like this (recycling thread on
the left, page_pool_release() on the right):
page_pool_recycle_in_ring
   ptr_ring_produce
     spin_lock(&r->producer_lock);
     WRITE_ONCE(r->queue[r->producer++], ptr)
       // recycle last page to pool, producer + release_cnt = hold_cnt
                                page_pool_release
                                  page_pool_scrub
                                    page_pool_empty_ring
                                      ptr_ring_consume
                                        page_pool_return_page
                                       // release_cnt = hold_cnt
                                  __page_pool_destroy // inflight = 0
                                     free_percpu(pool->recycle_stats);
                                     free(pool) // free
     spin_unlock(&r->producer_lock); // pool->ring UAF read
   recycle_stat_inc(pool, ring);

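release_cnt does block the freeing of the page_pool until
page_pool_release() sees release_cnt == hold_cnt.
However, page_pool_release() can run concurrently with a page being
recycled (e.g. via kfree_skb). As soon as the producer has written the
page into the ring, page_pool_scrub() can consume it and increase
release_cnt, so inflight drops to 0 and the pool is freed while the
producer is still inside ptr_ring_produce(). The producer's remaining
accesses to the pool (spin_unlock() on the ring's producer_lock, then
recycle_stat_inc()) trigger the UAF.
So acquiring and releasing the producer lock in page_pool_release()
acts as a barrier that waits for any in-progress recycle to complete,
which fixes it.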
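
To illustrate the lock/unlock-as-barrier idea outside the kernel, here
is a minimal userspace sketch using pthreads. It is not the page_pool
code: struct toy_pool, toy_recycle() and toy_release() are hypothetical
names made up for this example, a single atomic slot stands in for the
ptr_ring, and a mutex stands in for the ring's producer_lock.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page_pool; not the kernel layout. */
struct toy_pool {
	pthread_mutex_t producer_lock; /* plays the role of ring.producer_lock */
	atomic_int ring_slot;          /* 0 = empty, 1 = page published */
	atomic_bool producer_done;     /* set at the end of the critical section */
};

/* Recycling side: publish a page while holding the producer lock. */
static void *toy_recycle(void *arg)
{
	struct toy_pool *pool = arg;

	pthread_mutex_lock(&pool->producer_lock);
	atomic_store(&pool->ring_slot, 1); /* page is now visible to the consumer */
	/* Window: the releaser can drain the slot and see "inflight == 0" here. */
	atomic_store(&pool->producer_done, true);
	pthread_mutex_unlock(&pool->producer_lock); /* still dereferences pool */
	return NULL;
}

/*
 * Release side. Returns true if the producer had left its critical
 * section by the time the releaser got past the lock/unlock barrier.
 */
static bool toy_release(void)
{
	struct toy_pool *pool = calloc(1, sizeof(*pool));
	pthread_t producer;
	bool quiesced;

	assert(pool != NULL);
	pthread_mutex_init(&pool->producer_lock, NULL);
	atomic_init(&pool->ring_slot, 0);
	atomic_init(&pool->producer_done, false);

	if (pthread_create(&producer, NULL, toy_recycle, pool) != 0) {
		pthread_mutex_destroy(&pool->producer_lock);
		free(pool);
		return false;
	}

	/* "Scrub": wait for the page, then drain it without the producer lock. */
	while (atomic_load(&pool->ring_slot) == 0)
		sched_yield();
	atomic_store(&pool->ring_slot, 0); /* inflight would now read 0 */

	/* Barrier: cannot be acquired until the producer has unlocked. */
	pthread_mutex_lock(&pool->producer_lock);
	pthread_mutex_unlock(&pool->producer_lock);
	quiesced = atomic_load(&pool->producer_done);

	/* The join only keeps the example tidy; the kernel has no such join. */
	pthread_join(producer, NULL);
	pthread_mutex_destroy(&pool->producer_lock);
	free(pool);
	return quiesced;
}
```

Because the slot is only published while the lock is held, the releaser
can acquire the lock only after the producer has unlocked it, so
toy_release() should always return true. Without the lock/unlock pair,
the releaser could reach free() while toy_recycle() is still between
the publish and the unlock.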

Best Regards,
Dong Chenchen
