[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAHS8izNyJWhcRro8OFPTqsh9J4LEbm7Le6-CiW_oxi2NopAqeQ@mail.gmail.com>
Date: Thu, 25 Apr 2024 13:42:24 -0700
From: Mina Almasry <almasrymina@...gle.com>
To: Dragos Tatulea <dtatulea@...dia.com>
Cc: "davem@...emloft.net" <davem@...emloft.net>, "kuba@...nel.org" <kuba@...nel.org>,
"ilias.apalodimas@...aro.org" <ilias.apalodimas@...aro.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"jacob.e.keller@...el.com" <jacob.e.keller@...el.com>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>, "edumazet@...gle.com" <edumazet@...gle.com>,
Jianbo Liu <jianbol@...dia.com>, "pabeni@...hat.com" <pabeni@...hat.com>
Subject: Re: [RFC PATCH] net: Fix one page_pool page leak from skb_frag_unref
On Thu, Apr 25, 2024 at 12:48 PM Dragos Tatulea <dtatulea@...dia.com> wrote:
>
> On Thu, 2024-04-25 at 12:20 -0700, Mina Almasry wrote:
> > On Thu, Apr 25, 2024 at 1:17 AM Dragos Tatulea <dtatulea@...dia.com> wrote:
> > >
> > > On Wed, 2024-04-24 at 15:08 -0700, Mina Almasry wrote:
> > > > If that doesn't work, I think I prefer
> > > > reverting a580ea994fd3 ("net: mirror skb frag ref/unref helpers")
> > > > rather than merging this fix to make sure we removed the underlying
> > > > cause of the issue.
> > > This is the safest bet.
> > >
> > > So, to recap, I see 2 possibilities:
> > >
> > > 1) Revert a580ea994fd3 ("net: mirror skb frag ref/unref helpers"): safe, but it
> > > will probably have to come back in one way or another.
> > > 2) Drop the recycle checks from skb_frag_ref/unref: this enforces the rule of
> > > always referencing/dereferencing pages based on their type (page_pool or
> > > normal).
> > >
> >
> > If this works, I would be very happy. I personally think ref/unref
> > should be done based on the page type. For me the net stack using the
> > regular {get|put}_page on a pp page isn't great. It requires special
> > handling to make sure the ref + unref are in sync. Also if the last pp
> > ref is dropped while there are pending regular refs,
> > __page_pool_page_can_be_recycled() check will fail and the page will
> > not be recycled.
> >
> > On the other hand, since 0a149ab78ee2 ("page_pool: transition to
> > reference count management after page draining") I'm not sure there is
> > any reason to continue to use get/put_page on pp-pages, we can use the
> > new pp-ref instead.
> >
> > I don't see any regressions with this diff (needs cleanup), but your
> > test setup seems much much better than mine (I think this is the
> > second reffing issue you manage to repro):
> >
> > diff --git a/include/linux/skbuff_ref.h b/include/linux/skbuff_ref.h
> > index 4dcdbe9fbc5f..4c72227dce1b 100644
> > --- a/include/linux/skbuff_ref.h
> > +++ b/include/linux/skbuff_ref.h
> > @@ -31,7 +31,7 @@ static inline bool napi_pp_get_page(struct page *page)
> > static inline void skb_page_ref(struct page *page, bool recycle)
> > {
> > #ifdef CONFIG_PAGE_POOL
> > - if (recycle && napi_pp_get_page(page))
> > + if (napi_pp_get_page(page))
> > return;
> > #endif
> > get_page(page);
> > @@ -69,7 +69,7 @@ static inline void
> > skb_page_unref(struct page *page, bool recycle)
> > {
> > #ifdef CONFIG_PAGE_POOL
> > - if (recycle && napi_pp_put_page(page))
> > + if (napi_pp_put_page(page))
> > return;
> > #endif
> > put_page(page);
> >
> >
> This is option 2. I thought this would fix everything. But I just tested and
> it's not the case: we are now reaching a negative pp_ref_count. So probably
> somewhere a regular page reference is still being taken...
>
I would guess the most likely root cause of this would be a call site
that does get_page() instead of skb_frag_ref(), right?
The other possibility would be if something like:
- page is not pp_page
- skb_page_ref(page) // obtains a regular reference.
- page is converted to pp_page
- skb_page_unref(page) // drops a pp reference.
But I'm not aware of non-pp pages ever being converted to pp pages.
You probably figured this out already, but if you would like to dig
further instead of reverting the offending patch, this diff would
probably catch the get_page() callsite, no? (on my test setup this
debug code doesn't trigger).
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7b0ee64225de..a22a676f4b6b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1473,8 +1473,14 @@ static inline void folio_get(struct folio *folio)
folio_ref_inc(folio);
}
+static inline bool debug_is_pp_page(struct page *page)
+{
+ return (page->pp_magic & ~0x3UL) == PP_SIGNATURE;
+}
+
static inline void get_page(struct page *page)
{
+ WARN_ON_ONCE(debug_is_pp_page(page));
folio_get(page_folio(page));
}
@@ -1569,6 +1575,8 @@ static inline void put_page(struct page *page)
{
struct folio *folio = page_folio(page);
+ WARN_ON_ONCE(debug_is_pp_page(page));
+
/*
* For some devmap managed pages we need to catch refcount transition
* from 2 to 1:
--
Thanks,
Mina
Powered by blists - more mailing lists