Message-ID: <CAHS8izPTDmBKkwdhE3niaKgh_qd9y-Nd2JcjG4-P59erKTCTLQ@mail.gmail.com>
Date: Mon, 26 May 2025 19:13:44 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: "dongchenchen (A)" <dongchenchen2@...wei.com>
Cc: Yunsheng Lin <linyunsheng@...wei.com>, hawk@...nel.org, ilias.apalodimas@...aro.org,
davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com,
horms@...nel.org, netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
zhangchangzhong@...wei.com,
syzbot+204a4382fcb3311f3858@...kaller.appspotmail.com
Subject: Re: [PATCH net] page_pool: Fix use-after-free in page_pool_recycle_in_ring
On Mon, May 26, 2025 at 6:53 PM dongchenchen (A)
<dongchenchen2@...wei.com> wrote:
>
>
> >
> > On Mon, May 26, 2025 at 7:47 AM dongchenchen (A)
> > <dongchenchen2@...wei.com> wrote:
> >>
> >>> On Fri, May 23, 2025 at 1:31 AM Yunsheng Lin <linyunsheng@...wei.com> wrote:
> >>>> On 2025/5/23 14:45, Dong Chenchen wrote:
> >>>>
> >>>>> static bool page_pool_recycle_in_ring(struct page_pool *pool, netmem_ref netmem)
> >>>>> {
> >>>>> + bool in_softirq;
> >>>>> int ret;
> >>>> int -> bool?
> >>>>
> >>>>> /* BH protection not needed if current is softirq */
> >>>>> - if (in_softirq())
> >>>>> - ret = ptr_ring_produce(&pool->ring, (__force void *)netmem);
> >>>>> - else
> >>>>> - ret = ptr_ring_produce_bh(&pool->ring, (__force void *)netmem);
> >>>>> -
> >>>>> - if (!ret) {
> >>>>> + in_softirq = page_pool_producer_lock(pool);
> >>>>> + ret = !__ptr_ring_produce(&pool->ring, (__force void *)netmem);
> >>>>> + if (ret)
> >>>>> recycle_stat_inc(pool, ring);
> >>>>> - return true;
> >>>>> - }
> >>>>> + page_pool_producer_unlock(pool, in_softirq);
> >>>>>
> >>>>> - return false;
> >>>>> + return ret;
> >>>>> }
> >>>>>
> >>>>> /* Only allow direct recycling in special circumstances, into the
> >>>>> @@ -1091,10 +1088,14 @@ static void page_pool_scrub(struct page_pool *pool)
> >>>>>
> >>>>> static int page_pool_release(struct page_pool *pool)
> >>>>> {
> >>>>> + bool in_softirq;
> >>>>> int inflight;
> >>>>>
> >>>>> page_pool_scrub(pool);
> >>>>> inflight = page_pool_inflight(pool, true);
> >>>>> + /* Acquire producer lock to make sure producers have exited. */
> >>>>> + in_softirq = page_pool_producer_lock(pool);
> >>>>> + page_pool_producer_unlock(pool, in_softirq);
> >>>> Is a compiler barrier needed to ensure compiler doesn't optimize away
> >>>> the above code?
> >>>>
> >>> I don't want to derail this conversation too much, and I suggested a
> >>> similar fix to this initially, but now I'm not sure I understand why
> >>> it works.
> >>>
> >>> Why is the existing barrier not working and acquiring/releasing the
> >>> producer lock fixes this issue instead? The existing barrier is the
> >>> producer thread incrementing pool->pages_state_release_cnt, and
> >>> page_pool_release() is supposed to block the freeing of the page_pool
> >>> until it sees the
> >>> `atomic_inc_return_relaxed(&pool->pages_state_release_cnt);` from the
> >>> producer thread. Any idea why this barrier is not working? AFAIU it
> >>> should do the exact same thing as acquiring/dropping the producer
> >>> lock.
> >> Hi, Mina
> >> As previously mentioned:
> >> page_pool_recycle_in_ring
> >> ptr_ring_produce
> >> spin_lock(&r->producer_lock);
> >> WRITE_ONCE(r->queue[r->producer++], ptr)
> >> //recycle last page to pool, producer + release_cnt = hold_cnt
> > This is not right. release_cnt != hold_cnt at this point.
>
> Hi,Mina!
> Thanks for your review!
> release_cnt != hold_cnt at this point. producer inc r->producer
> and release_cnt will be incremented by page_pool_empty_ring() in
> page_pool_release().
>
> > Release_cnt is only incremented by the producer _after_ the
> > spin_unlock and the recycle_stat_inc have been done. The full call
> > stack on the producer thread:
> >
> > page_pool_put_unrefed_netmem
> > page_pool_recycle_in_ring
> > ptr_ring_produce(&pool->ring, (__force void *)netmem);
> > spin_lock(&r->producer_lock);
> > __ptr_ring_produce(r, ptr);
> > spin_unlock(&r->producer_lock);
> > recycle_stat_inc(pool, ring);
>
> If page_ring is not full, page_pool_recycle_in_ring will return true.
> The release cnt will be incremented by page_pool_empty_ring() in
> page_pool_release(), and the code as below will not be executed.
>
> page_pool_put_unrefed_netmem
> if (!page_pool_recycle_in_ring(pool, netmem)) //return true
> page_pool_return_page(pool, netmem);
>
Oh! Thanks! I see the race now.
page_pool_recycle_in_ring() in the producer can return the page to the
ring. The consumer can then see the netmem in the ring, free it,
increment release_cnt, and free the page_pool. When the producer
continues executing, it touches the already-freed pool and hits a
use-after-free. Very subtle race indeed. Thanks for the patient
explanation.
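
To make the interleaving concrete, here is a minimal userspace sketch of
the race as I understand it. It is a single-threaded toy model, not
kernel code: struct toy_pool, toy_release() and toy_race() are
hypothetical names, the producer lock is reduced to a flag, and a
consumer that would block on the lock is modelled as returning early and
retrying later. The producer's late access stands in for the
recycle_stat_inc() call from the call stack quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a page pool. All names are illustrative, not kernel API. */
struct toy_pool {
	int hold_cnt;         /* pages handed out by the pool */
	int release_cnt;      /* pages returned and released */
	int ring_len;         /* pages sitting in the recycle ring */
	int ring_stat;        /* stand-in for recycle_stat_inc(pool, ring) */
	bool producer_locked; /* stand-in for the ring's producer lock */
	bool freed;           /* set where the real pool would be freed */
};

/*
 * Consumer side, loosely modelled on page_pool_release(): drain the ring,
 * then free the pool if nothing is in flight. With wait_for_producer set
 * (the patched behaviour) the free is refused while the producer still
 * holds the lock; the real code would block there instead.
 * Returns true if the pool was freed.
 */
static bool toy_release(struct toy_pool *p, bool wait_for_producer)
{
	assert(!p->freed);
	p->release_cnt += p->ring_len; /* like page_pool_empty_ring() */
	p->ring_len = 0;

	if (p->hold_cnt != p->release_cnt)
		return false;          /* pages still in flight */
	if (wait_for_producer && p->producer_locked)
		return false;          /* would block on the producer lock */

	p->freed = true;
	return true;
}

/*
 * Replay the interleaving from this thread with one in-flight page.
 * Returns true if the producer touches the pool after it was freed.
 */
static bool toy_race(bool patched)
{
	struct toy_pool p = { .hold_cnt = 1 };
	bool uaf;

	/* Producer, step 1: put the last in-flight page into the ring. */
	if (patched)
		p.producer_locked = true; /* lock held across both steps */
	p.ring_len++;

	/* Consumer runs in the window between the two producer steps. */
	toy_release(&p, patched);

	/* Producer, step 2: update the stat, which touches the pool. */
	uaf = p.freed;
	if (!uaf)
		p.ring_stat++;

	if (patched) {
		p.producer_locked = false;
		toy_release(&p, patched); /* consumer retry now succeeds */
	}
	return uaf;
}
```

In this model toy_race(false) reports the late access on a freed pool,
while toy_race(true) does not, because the pool can only be freed once
the producer has dropped the lock. That matches my reading of why taking
and releasing the producer lock in page_pool_release() closes the
window.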
Reviewed-by: Mina Almasry <almasrymina@...gle.com>
--
Thanks,
Mina