Message-ID: <ncerbfkwxgdwvu57kmbdvtndc6ruxhwlbsugxzx7xnyjg5f6rv@x2rqjadywnuk>
Date: Tue, 23 Sep 2025 15:23:02 +0000
From: Dragos Tatulea <dtatulea@...dia.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: Jesper Dangaard Brouer <hawk@...nel.org>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>, Steven Rostedt <rostedt@...dmis.org>, netdev@...r.kernel.org,
Tariq Toukan <tariqt@...dia.com>, linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev
Subject: Re: [PATCH net-next] page_pool: add debug for release to cache from wrong CPU

On Mon, Sep 22, 2025 at 04:18:27PM -0700, Jakub Kicinski wrote:
> On Sat, 20 Sep 2025 09:25:31 +0000 Dragos Tatulea wrote:
> > > The patch seems half-baked. If the NAPI local recycling is incorrect
> > > the pp will leak a reference and live forever. Which hopefully people
> > > would notice. Are you adding this check just to double confirm that
> > > any leaks you're chasing are in the driver, and not in the core?
> >
> > The point is not to chase leaks but races from doing a recycle to cache
> > from the wrong CPU. This is how the XDP issue was caught, where
> > xdp_set_return_frame_no_direct() was not set appropriately for cpumap [1].
> >
> > My first approach was to __page_pool_put_page() but then I figured that
> > the warning should live closer to where the actual assignment happens.
> >
> > [1] https://lore.kernel.org/all/e60404e2-4782-409f-8596-ae21ce7272c4@kernel.org/
>
> Ah, that thing. I wonder whether the complexity in the driver-facing
> xdp_return API is really worth the gain here. IIUC we want to extract
> the cases where we're doing local recycling and let those cases use
> the lockless cache. But all those cases should be caught by automatic
> local recycling detection, so caller can just pass false..
>
This patch simply adds debugging code to catch potential misuse by any
caller.

I was planning to send another patch for the xdp_return() API part
once/if this one got accepted. If it makes more sense, I can bundle them
together in an RFC (as the merge window is coming).

Thanks,
Dragos