[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250520110625.60455f42@kernel.org>
Date: Tue, 20 May 2025 11:06:25 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Mina Almasry <almasrymina@...gle.com>
Cc: "dongchenchen (A)" <dongchenchen2@...wei.com>, hawk@...nel.org,
ilias.apalodimas@...aro.org, davem@...emloft.net, edumazet@...gle.com,
pabeni@...hat.com, horms@...nel.org, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, zhangchangzhong@...wei.com
Subject: Re: [BUG Report] KASAN: slab-use-after-free in
page_pool_recycle_in_ring
On Mon, 19 May 2025 17:53:08 -0700 Mina Almasry wrote:
> On Mon, May 19, 2025 at 3:47 PM Jakub Kicinski <kuba@...nel.org> wrote:
> >
> > On Mon, 19 May 2025 12:20:59 -0700 Mina Almasry wrote:
> > > Clearly this is not working, but I can't tell why.
> >
> > I think your fix works but for the one line that collects recycling
> > stats. If we put recycling stats under the producer lock we should
> > be safe.
>
> What are you referring to as recycle stats? Because I don't think
> pool->recycle_stats have anything to do with freeing the page_pool.
>
> Or do you mean that we should put all the call sites that increment
> and decrement pool->pages_state_release_cnt and
> pool->pages_state_hold_cnt under the producer lock?
No, just the "informational" recycling stats. Looking at what Dong
Chenchen has shared:
page_pool_recycle_in_ring
ptr_ring_produce
spin_lock(&r->producer_lock);
WRITE_ONCE(r->queue[r->producer++], ptr)
spin_unlock(&r->producer_lock);
//recycle last page to pool
page_pool_release
page_pool_scrub
page_pool_empty_ring
ptr_ring_consume
page_pool_return_page //release
all page
__page_pool_destroy
free_percpu(pool->recycle_stats);
kfree(pool) //free
recycle_stat_inc(pool, ring); //uaf read
The thread which put the last page into the ring has released the
producer lock and now it's trying to increment the recycling stats.
Which is invalid, the page it put into the right was de facto the
reference it held. So once it put that page in the ring it should
no longer touch the page pool.
It's not really related to the refcounting itself, the recycling
stats don't control the lifetime of the object. With your patch
to turn the producer lock into a proper barrier, the remaining
issue feels to me like a basic UAF?
Powered by blists - more mailing lists