Message-ID: <20251025160905.3857885-382-sashal@kernel.org>
Date: Sat, 25 Oct 2025 12:00:13 -0400
From: Sasha Levin <sashal@...nel.org>
To: patches@...ts.linux.dev,
stable@...r.kernel.org
Cc: Dragos Tatulea <dtatulea@...dia.com>,
Tariq Toukan <tariqt@...dia.com>,
Paolo Abeni <pabeni@...hat.com>,
Sasha Levin <sashal@...nel.org>,
hawk@...nel.org,
ilias.apalodimas@...aro.org,
netdev@...r.kernel.org
Subject: [PATCH AUTOSEL 6.17-5.4] page_pool: Clamp pool size to max 16K pages
From: Dragos Tatulea <dtatulea@...dia.com>
[ Upstream commit a1b501a8c6a87c9265fd03bd004035199e2e8128 ]
page_pool_init() returns E2BIG when the page_pool size goes above 32K
pages. As some drivers are configuring the page_pool size according to
the MTU and ring size, there are cases where this limit is exceeded and
the queue creation fails.
The page_pool size doesn't have to cover a full queue, especially for
larger ring sizes. So clamp the size instead of returning an error. Do
this in the core to avoid having each driver do the clamping.
The current limit was deemed too high [1], so it was reduced to 16K to avoid
page waste.
[1] https://lore.kernel.org/all/1758532715-820422-3-git-send-email-tariqt@nvidia.com/
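
For illustration only (not part of the upstream commit message): the
hypothetical sizing below shows how an MTU- and ring-derived pool_size
can exceed the old 32768 limit and what the clamp produces instead. The
ring size and pages-per-packet values are invented, and the snippet is a
self-contained user-space model rather than kernel code.

/* Illustrative only: models the sizing and the new clamp in user space. */
#include <stdio.h>

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

int main(void)
{
	unsigned int ring_entries = 16384;	/* hypothetical RX ring size */
	unsigned int pages_per_pkt = 3;		/* e.g. jumbo MTU on 4K pages */
	unsigned int pool_size = ring_entries * pages_per_pkt;	/* 49152 */

	if (pool_size > 32768)
		printf("old behaviour: page_pool_init() returns -E2BIG\n");

	/* New behaviour: clamp instead of failing, as in the diff below. */
	printf("new behaviour: ring_qsize = %u\n", min_uint(pool_size, 16384));
	return 0;
}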
Signed-off-by: Dragos Tatulea <dtatulea@...dia.com>
Reviewed-by: Tariq Toukan <tariqt@...dia.com>
Link: https://patch.msgid.link/20250926131605.2276734-2-dtatulea@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@...hat.com>
Signed-off-by: Sasha Levin <sashal@...nel.org>
---
LLM Generated explanations, may be completely bogus:
YES
- The change at `net/core/page_pool.c:213` and
`net/core/page_pool.c:214` stops rejecting large pools and instead
clamps the ptr_ring allocation to 16K entries, which keeps queue setup
from failing with `-E2BIG` while still bounding the cached page
budget. Without this patch, any driver that computes a `pool_size`
above 32768 immediately aborts queue creation; for example,
`mlx5e_alloc_rq()` sets `pp_params.pool_size = pool_size` at
`drivers/net/ethernet/mellanox/mlx5/core/en_main.c:906` and again at
`drivers/net/ethernet/mellanox/mlx5/core/en_main.c:1011`, and on error
it propagates the failure (`goto err_free_by_rq_type`), so the RX queue
never comes up. `stmmac_init_rx_buffers()` follows the same pattern at
`drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:2051-2066`, meaning
larger rings or MTU-derived pools currently make the interface
unusable. A hedged sketch of this driver-side pattern follows the list
below.
- The lower cap is safe: when the ptr_ring fills, the existing slow path
already frees excess pages (`page_pool_recycle_in_ring()` at
`net/core/page_pool.c:746` together with the fallback in
`page_pool_put_unrefed_netmem()` at `net/core/page_pool.c:873`), so a
smaller cache only increases occasional allocations but does not change
correctness. No ABI or driver interfaces are touched, and every driver
benefits automatically without per-driver clamps. A toy model of this
fallback follows the sketch below the list.
- This is a minimal, localized fix that prevents hard user-visible
failures (device queues refusing to start) on systems with large RX
rings or jumbo MTUs, making it an excellent candidate for stable
backports.
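
As referenced in the first bullet, here is a hedged sketch of the
driver-side pattern involved. The function name and sizing are
hypothetical; only page_pool_create(), struct page_pool_params and the
fields used below come from the in-tree API, and header locations differ
across the stable trees targeted here (<net/page_pool.h> on older
kernels, <net/page_pool/types.h> and <net/page_pool/helpers.h> on newer
ones).

#include <linux/dma-mapping.h>
#include <linux/numa.h>
#include <net/page_pool/types.h>
#include <net/page_pool/helpers.h>

/* Hypothetical RX queue setup, loosely following the mlx5/stmmac shape. */
static struct page_pool *example_rq_create_pool(struct device *dev,
						unsigned int ring_entries,
						unsigned int pages_per_pkt)
{
	struct page_pool_params pp = {
		.order		= 0,
		.flags		= PP_FLAG_DMA_MAP,
		/* MTU- and ring-derived size; large rings with jumbo MTUs
		 * push this past 16384 and, previously, past 32768.
		 */
		.pool_size	= ring_entries * pages_per_pkt,
		.nid		= NUMA_NO_NODE,
		.dev		= dev,
		.dma_dir	= DMA_FROM_DEVICE,
	};

	/* Before the patch this returned ERR_PTR(-E2BIG) for pool_size
	 * above 32768 and the RX queue never came up; with the clamp the
	 * core caps its internal ptr_ring at 16384 entries and creation
	 * succeeds.
	 */
	return page_pool_create(&pp);
}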
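
And a toy user-space model of the fallback the second bullet relies on:
when the clamped recycle ring is full, a page is simply released instead
of cached, so the smaller cache only costs extra page allocations. This
is a conceptual illustration, not page_pool code.

#include <stdbool.h>
#include <stdio.h>

#define RING_CAP 4	/* stands in for the clamped 16384-entry ptr_ring */

static void *ring[RING_CAP];
static unsigned int ring_used;

/* Fast path: cache the page for reuse if the ring has room. */
static bool recycle(void *page)
{
	if (ring_used < RING_CAP) {
		ring[ring_used++] = page;
		return true;
	}
	return false;	/* ring full: caller releases the page instead */
}

int main(void)
{
	int pages[6];

	for (int i = 0; i < 6; i++) {
		if (!recycle(&pages[i]))
			printf("page %d released back to the allocator\n", i);
	}
	return 0;
}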
net/core/page_pool.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index e224d2145eed9..1a5edec485f14 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -211,11 +211,7 @@ static int page_pool_init(struct page_pool *pool,
return -EINVAL;
if (pool->p.pool_size)
- ring_qsize = pool->p.pool_size;
-
- /* Sanity limit mem that can be pinned down */
- if (ring_qsize > 32768)
- return -E2BIG;
+ ring_qsize = min(pool->p.pool_size, 16384);
/* DMA direction is either DMA_FROM_DEVICE or DMA_BIDIRECTIONAL.
* DMA_BIDIRECTIONAL is for allowing page used for DMA sending,
--
2.51.0