Message-ID: <CAHS8izPh5Z-CAJpQzDjhLVN5ye=5i1zaDqb2xQOU3QP08f+Y0Q@mail.gmail.com>
Date: Mon, 19 May 2025 12:20:59 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: "dongchenchen (A)" <dongchenchen2@...wei.com>
Cc: hawk@...nel.org, ilias.apalodimas@...aro.org, davem@...emloft.net,
edumazet@...gle.com, kuba@...nel.org, pabeni@...hat.com, horms@...nel.org,
netdev@...r.kernel.org, linux-kernel@...r.kernel.org,
zhangchangzhong@...wei.com
Subject: Re: [BUG Report] KASAN: slab-use-after-free in page_pool_recycle_in_ring
On Tue, May 13, 2025 at 8:11 PM dongchenchen (A)
<dongchenchen2@...wei.com> wrote:
>
>
> > On Tue, May 13, 2025 at 1:28 AM Dong Chenchen <dongchenchen2@...wei.com> wrote:
> >> Hello,
> >>
> >> syzkaller found the UAF issue in page_pool_recycle_in_ring[1], which is
> >> similar to syzbot+204a4382fcb3311f3858@...kaller.appspotmail.com.
> >>
> >> The root cause is as follows:
> >>
> >> page_pool_recycle_in_ring
> >>   ptr_ring_produce
> >>     spin_lock(&r->producer_lock);
> >>     WRITE_ONCE(r->queue[r->producer++], ptr)
> >>     //recycle last page to pool
> >>                                  page_pool_release
> >>                                    page_pool_scrub
> >>                                      page_pool_empty_ring
> >>                                        ptr_ring_consume
> >>                                        page_pool_return_page //release all pages
> >>                                    __page_pool_destroy
> >>                                      free_percpu(pool->recycle_stats);
> >>                                      kfree(pool) //free
> >>
> >>     spin_unlock(&r->producer_lock); //pool->ring uaf read
> >>   recycle_stat_inc(pool, ring);
> >>
> >> The page_pool can be freed while it is still recycling the last page into
> >> the ring. After adding a delay to page_pool_recycle_in_ring(), syzlog[2]
> >> can reproduce this issue with high probability. Maybe we can fix it by
> >> holding the user_cnt of the page pool during the page recycle process.
> >>
> >> Does anyone have a good idea to solve this problem?
> >>
> > Ugh. page_pool_release is not supposed to free the page_pool until all
> > inflight pages have been returned. It detects that there are pending
> > inflight pages by checking the atomic stats, but in this case it looks
> > like we've raced checking the atomic stats with another cpu returning
> > a netmem to the ptr ring (and it updates the stats _after_ it already
> > returned to the ptr_ring).
> >
> > My guess here is that page_pool_scrub needs to acquire the
> > r->producer_lock to make sure there are no other producers returning
> > netmems to the ptr_ring while it's scrubbing them (and checking after
> > to make sure there are no inflight netmems).
> >
> > Can you test this fix? It may need some massaging. I've only checked
> > that it builds. I haven't thought through all the possible races yet:
> >
> > ```
> > diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> > index 2b76848659418..8654608734773 100644
> > --- a/net/core/page_pool.c
> > +++ b/net/core/page_pool.c
> > @@ -1146,10 +1146,17 @@ static void page_pool_scrub(struct page_pool *pool)
> >
> > static int page_pool_release(struct page_pool *pool)
> > {
> > + bool in_softirq;
> > int inflight;
> >
> > +
> > + /* Acquire producer lock to make sure we don't race with another thread
> > + * returning a netmem to the ptr_ring.
> > + */
> > + in_softirq = page_pool_producer_lock(pool);
> > page_pool_scrub(pool);
> > inflight = page_pool_inflight(pool, true);
> > + page_pool_producer_unlock(pool, in_softirq);
> > if (!inflight)
> > __page_pool_destroy(pool);
> > ```
> Hi, Mina!
>
> I tested this patch and the problem still exists.
> Although this patch ensures that the lock access is safe, the page recycle
> process can still access the page pool after the unlock.
>
Sorry for the very late reply; I got a bit busy with work.
My initial analysis was wrong, as your test of the candidate fix
shows. I took another look, and here is what I can tell so far. The
full syzbot report is here for reference:
https://syzkaller.appspot.com/bug?extid=204a4382fcb3311f3858
page_pool_release_retry is supposed to block freeing the page_pool
until all netmems have been freed via page_pool_put_unrefed_netmem,
using the inflight logic. What is clear from the syzbot report is that
this inflight logic didn't work properly, because the
page_pool_put_unrefed_netmem call happened after
page_pool_release_retry had already allowed the page_pool to be freed
(__page_pool_destroy had already been called).
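(For context, the retry path is roughly the following, simplified from
memory; page_pool_release() is the function my earlier candidate fix
touched, and it scrubs the rings and only calls __page_pool_destroy()
once it sees zero inflight pages.)
```
/* Rough sketch, simplified: the delayed work keeps calling
 * page_pool_release() until it reports zero inflight netmems; only
 * then is the page_pool actually freed.
 */
static void page_pool_release_retry(struct work_struct *wq)
{
	struct delayed_work *dwq = to_delayed_work(wq);
	struct page_pool *pool = container_of(dwq, typeof(*pool), release_dw);
	int inflight;

	inflight = page_pool_release(pool);
	if (!inflight)
		return;		/* the pool has been freed at this point */

	/* Netmems are still inflight: check again later. */
	schedule_delayed_work(&pool->release_dw, DEFER_TIME);
}
```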
The inflight logic works by taking the difference between
`pool->pages_state_hold_cnt` and `pool->pages_state_release_cnt`.
pages_state_hold_cnt is incremented when the page_pool allocates a new
page from the buddy allocator. pages_state_release_cnt is incremented
at the end of page_pool_put_unrefed_netmem.
We don't expect new pages to be allocated by the page_pool owner after
page_pool_destroy has been called, so pages_state_hold_cnt should not
move after page_pool_destroy is called, I think.
pages_state_release_cnt should be <= pages_state_hold_cnt at the time
page_pool_destroy is called. Then, once all the inflight netmems have
been freed via page_pool_put_unrefed_netmem,
pool->pages_state_release_cnt should be equal to
pool->pages_state_hold_cnt, and the page_pool should be allowed to be
freed.
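In other words, the inflight check reduces to a signed distance between
two wrapping u32 counters. A tiny standalone illustration of just that
arithmetic (not the kernel code, only to show how the check is meant to
behave):
```
/* Userspace illustration of the inflight accounting: hold_cnt counts
 * pages handed out by the pool, release_cnt counts pages released back.
 * Both are u32 and may wrap, so the distance is taken as a signed s32.
 */
#include <stdint.h>
#include <stdio.h>

static int32_t inflight(uint32_t hold_cnt, uint32_t release_cnt)
{
	/* The signed cast handles u32 wraparound. */
	return (int32_t)(hold_cnt - release_cnt);
}

int main(void)
{
	printf("%d\n", inflight(3, 1));            /* 2 pages still inflight */
	printf("%d\n", inflight(1, UINT32_MAX));   /* wraparound: still 2    */
	printf("%d\n", inflight(5, 5));            /* 0 -> safe to free pool */
	return 0;
}
```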
Clearly this is not working, but I can't tell why. I also notice the
syzbot report is from bpf/test_run.c, but I don't think we have
reports from prod, so it may be a test issue. Some possibilities:
1. Maybe the test is calling a page_pool allocation like
page_pool_dev_alloc_pages in parallel with page_pool_destroy. That may
increment pages_state_hold_cnt unexpectedly?
2. Maybe one of the pages_state_*_cnt overflowed or something?
3. Memory corruption?
I'm afraid I'm not sure. Possibly littering the code with warnings for
unexpected cases would give some insight. For example, I think this
would catch case #1:
```
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index 4011eb305cee..9fa70c60f9b5 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -536,6 +536,9 @@ static struct page *__page_pool_alloc_page_order(struct page_pool *pool,
alloc_stat_inc(pool, slow_high_order);
page_pool_set_pp_info(pool, page_to_netmem(page));
+ /* Warn if we're allocating a page on a destroyed page_pool */
+ DEBUG_NET_WARN_ON(pool->destroy_cnt);
+
/* Track how many pages are held 'in-flight' */
pool->pages_state_hold_cnt++;
trace_page_pool_state_hold(pool, page_to_netmem(page),
@@ -582,6 +585,8 @@ static noinline netmem_ref __page_pool_alloc_pages_slow(struct page_pool *pool,
page_pool_set_pp_info(pool, netmem);
pool->alloc.cache[pool->alloc.count++] = netmem;
/* Track how many pages are held 'in-flight' */
+ /* Warn if we're allocating a page on a destroyed page_pool */
+ DEBUG_NET_WARN_ON(pool->destroy_cnt);
pool->pages_state_hold_cnt++;
trace_page_pool_state_hold(pool, netmem,
pool->pages_state_hold_cnt);
@@ -1271,6 +1276,8 @@ void net_mp_niov_set_page_pool(struct page_pool *pool, struct net_iov *niov)
page_pool_set_pp_info(pool, netmem);
+ /* Warn if we're allocating a page on a destroyed page_pool */
+ DEBUG_NET_WARN_ON(pool->destroy_cnt);
pool->pages_state_hold_cnt++;
trace_page_pool_state_hold(pool, netmem, pool->pages_state_hold_cnt);
```
If you have steps to reproduce this, maybe post them and, if you can't
root-cause this further, I'll try to take a look when I get a chance.
--
Thanks,
Mina