[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <182ceffb-b038-4c4f-9c3b-383351a043d5@kernel.org>
Date: Sat, 20 Sep 2025 19:36:49 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Dragos Tatulea <dtatulea@...dia.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>,
Ilias Apalodimas <ilias.apalodimas@...aro.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Clark Williams <clrkwllms@...nel.org>, Steven Rostedt <rostedt@...dmis.org>,
Mina Almasry <almasrymina@...gle.com>
Cc: netdev@...r.kernel.org, Tariq Toukan <tariqt@...dia.com>,
linux-kernel@...r.kernel.org, linux-rt-devel@...ts.linux.dev
Subject: Re: [PATCH net-next] page_pool: add debug for release to cache from
wrong CPU
On 18/09/2025 10.48, Dragos Tatulea wrote:
> Direct page releases to cache must be done on the same CPU as where NAPI
> is running. Not doing so results in races that are quite difficult to
> debug.
>
> This change adds a debug configuration which issues a warning when
> such buggy behaviour is encountered.
>
> Signed-off-by: Dragos Tatulea<dtatulea@...dia.com>
> Reviewed-by: Tariq Toukan<tariqt@...dia.com>
> ---
> net/Kconfig.debug | 10 +++++++
> net/core/page_pool.c | 66 ++++++++++++++++++++++++++------------------
> 2 files changed, 49 insertions(+), 27 deletions(-)
>
[...]
> @@ -768,6 +795,18 @@ static bool page_pool_recycle_in_cache(netmem_ref netmem,
> return false;
> }
>
> +#ifdef CONFIG_DEBUG_PAGE_POOL_CACHE_RELEASE
> + if (unlikely(!page_pool_napi_local(pool))) {
> + u32 pp_cpuid = READ_ONCE(pool->cpuid);
> + u32 cpuid = smp_processor_id();
> +
> + WARN_RATELIMIT(1, "page_pool %d: direct page release from wrong CPU %d, expected CPU %d",
> + pool->user.id, cpuid, pp_cpuid);
> +
> + return false;
> + }
> +#endif
The page_pool_recycle_in_cache() is an extreme fast-path for page_pool.
I know this is a debugging patch, but I would like to know the overhead
this adds (when enabled, compared to not enabled).
We (Mina) recently added a benchmark module for page_pool
under tools/testing/selftests/net/bench/page_pool/ that you can use.
Adding a WARN in fast-path code need extra careful review (maybe is it
okay here), this is because it adds an asm instruction (on Intel CPUs
ud2) what influence instruction cache prefetching. Looks like this only
gets inlined two places (page_pool_put_unrefed_netmem and
page_pool_put_page_bulk), and it might be okay... I think it is.
See how I worked around this in commit 34cc0b338a61 ("xdp: Xdp_frame add
member frame_sz and handle in convert_to_xdp_frame").
--Jesper
Powered by blists - more mailing lists