Message-ID: <76d3fe0c-b031-4c8f-91c0-386b169384fb@kernel.org>
Date: Tue, 5 Aug 2025 10:03:02 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>, davem@...emloft.net
Cc: netdev@...r.kernel.org, edumazet@...gle.com, pabeni@...hat.com,
andrew+netdev@...n.ch, horms@...nel.org, David Wei <dw@...idwei.uk>,
michael.chan@...adcom.com, pavan.chebbi@...adcom.com,
ilias.apalodimas@...aro.org, almasrymina@...gle.com, sdf@...ichev.me
Subject: Re: [PATCH net v2] net: page_pool: allow enabling recycling late, fix
false positive warning
On 05/08/2025 02.36, Jakub Kicinski wrote:
> Page pool can have pages "directly" (locklessly) recycled to it,
> if the NAPI that owns the page pool is scheduled to run on the same CPU.
> To make this safe we check that the NAPI is disabled while we destroy
> the page pool. In most cases NAPI and page pool lifetimes are tied
> together so this happens naturally.
>
> The queue API expects the following order of calls:
> -> mem_alloc
> alloc new pp
> -> stop
> napi_disable
> -> start
> napi_enable
> -> mem_free
> free old pp
>
> Here we allocate the page pool in ->mem_alloc and free in ->mem_free.
> But the NAPIs are only stopped between ->stop and ->start. We created
> page_pool_disable_direct_recycling() to safely shut down the recycling
> in ->stop. This way the page_pool_destroy() call in ->mem_free doesn't
> have to worry about recycling any more.
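
For driver authors following along, the ->stop side ends up looking
roughly like this (a sketch with made-up names, not the actual bnxt
code):

static int my_queue_stop(struct net_device *dev, void *qmem, int idx)
{
	/* my_rx_ring / my_get_rx_ring are hypothetical driver helpers */
	struct my_rx_ring *rxr = my_get_rx_ring(dev, idx);

	napi_disable(&rxr->napi);
	/* After this the page_pool_destroy() in ->mem_free cannot race
	 * with lockless (direct) recycling, even though ->start
	 * re-enables the NAPI before ->mem_free runs.
	 */
	page_pool_disable_direct_recycling(rxr->page_pool);

	return 0;
}
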
>
> Unfortunately, the page_pool_disable_direct_recycling() is not enough
> to deal with failures which necessitate freeing the _new_ page pool.
> If we hit a failure in ->mem_alloc or ->stop the new page pool has
> to be freed while the NAPI is active (assuming driver attaches the
> page pool to an existing NAPI instance and doesn't reallocate NAPIs).
>
> Freeing the new page pool is technically safe because it hasn't been
> used for any packets, yet, so there can be no recycling. But the check
> in napi_assert_will_not_race() has no way of knowing that. We could
> check if page pool is empty but that'd make the check much less likely
> to trigger during development.
>
> Add page_pool_enable_direct_recycling(), pairing with
> page_pool_disable_direct_recycling(). It will allow us to create the new
> page pools in "disabled" state and only enable recycling when we know
> the reconfig operation will not fail.
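
If I read the patch right, usage boils down to creating the pool
without a NAPI attached and handing it the NAPI only once the reconfig
is past the point of failure, roughly (made-up names again; pdev,
ring_size and rxr stand in for driver-local state):

	/* in ->mem_alloc: the new pool starts with recycling "disabled" */
	struct page_pool_params pp_params = {
		.flags		= PP_FLAG_DMA_MAP,
		.order		= 0,
		.pool_size	= ring_size,
		.nid		= dev_to_node(&pdev->dev),
		.dev		= &pdev->dev,
		.dma_dir	= DMA_FROM_DEVICE,
		/* .napi deliberately not set */
	};
	struct page_pool *new_pool;

	new_pool = page_pool_create(&pp_params);
	if (IS_ERR(new_pool))
		return PTR_ERR(new_pool);

	/* in ->start, once nothing can fail any more */
	page_pool_enable_direct_recycling(new_pool, &rxr->napi);
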
>
> Coincidentally it will also let us re-enable the recycling for the old
> pool, if the reconfig failed:
>
> -> mem_alloc (new)
> -> stop (old)
> # disables direct recycling for old
> -> start (new)
> # fail!!
> -> start (old)
> # go back to old pp but direct recycling is lost 🙁
> -> mem_free (new)
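
With the new helper the error path can now also give the old pool its
fast path back, along these lines (again just a sketch, not the actual
driver code):

	/* reconfig failed after the old queue was stopped - bring the
	 * old ring back and re-enable its direct recycling
	 */
	my_queue_start(dev, old_qmem, idx);
	page_pool_enable_direct_recycling(old_pool, &rxr->napi);
	my_queue_mem_free(dev, new_qmem);	/* new pool never carried traffic */
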
>
> The new helper is idempotent to make the life easier for drivers,
> which can operate in HDS mode and support zero-copy Rx.
> The driver can call the helper twice whether there are two pools
> or it has multiple references to a single pool.
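
E.g. a bnxt-style driver with a separate header pool (or two
references to one pool) can call it for both without extra checks,
sketch-wise:

	page_pool_enable_direct_recycling(rxr->head_pool, &rxr->napi);
	page_pool_enable_direct_recycling(rxr->page_pool, &rxr->napi);
	/* fine even if head_pool and page_pool are the same pool */
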
>
> Fixes: 40eca00ae605 ("bnxt_en: unlink page pool when stopping Rx queue")
> Tested-by: David Wei <dw@...idwei.uk>
> Signed-off-by: Jakub Kicinski <kuba@...nel.org>
> ---
> v2:
> - add kdoc
> - WARN_ON_ONCE -> WARN_ON
> v1: https://lore.kernel.org/20250801173011.2454447-1-kuba@kernel.org
LGTM - thanks for adjusting :-)
Acked-by: Jesper Dangaard Brouer <hawk@...nel.org>
--Jesper