Message-Id: <20250604005531.4178547-38-sashal@kernel.org>
Date: Tue,  3 Jun 2025 20:54:21 -0400
From: Sasha Levin <sashal@...nel.org>
To: patches@...ts.linux.dev,
	stable@...r.kernel.org
Cc: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	Jesper Dangaard Brouer <hawk@...nel.org>,
	Ilias Apalodimas <ilias.apalodimas@...aro.org>,
	Paolo Abeni <pabeni@...hat.com>,
	Sasha Levin <sashal@...nel.org>,
	clrkwllms@...nel.org,
	rostedt@...dmis.org,
	netdev@...r.kernel.org,
	linux-rt-devel@...ts.linux.dev
Subject: [PATCH AUTOSEL 6.14 038/108] net: page_pool: Don't recycle into cache on PREEMPT_RT

From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>

[ Upstream commit 32471b2f481dea8624f27669d36ffd131d24b732 ]

With preemptible softirq and no per-CPU locking in local_bh_disable() on
PREEMPT_RT the consumer can be preempted while a skb is returned.

Avoid the race by disabling the recycle into the cache on PREEMPT_RT.

Cc: Jesper Dangaard Brouer <hawk@...nel.org>
Cc: Ilias Apalodimas <ilias.apalodimas@...aro.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
Link: https://patch.msgid.link/20250512092736.229935-2-bigeasy@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@...hat.com>
Signed-off-by: Sasha Levin <sashal@...nel.org>
---

**YES** This commit should be backported to stable kernel trees. Here's
my detailed analysis:

## Bug Fix Classification

This is a **race condition fix** for PREEMPT_RT kernels that prevents
potential memory corruption and system instability. The commit addresses
a fundamental concurrency issue in the page_pool subsystem that only
manifests on real-time kernels.

## Technical Analysis of the Race Condition

The issue occurs in the `page_pool_napi_local()` function, which
determines whether pages can be recycled directly into the page pool's
fast cache (`pool->alloc.cache[]`). This lockless cache provides optimal
performance by avoiding spinlock overhead.

**On regular kernels:**

- Softirqs are non-preemptible
- Direct cache access is safe because producer and consumer cannot run
  concurrently

**On PREEMPT_RT kernels:**

- Softirqs can be preempted by higher priority tasks
- A softirq returning pages can be interrupted while another context
  allocates from the same cache
- This creates a classic race condition on the shared cache data
  structure

## Code Change Analysis

The fix is minimal and surgical:

```c
/* On PREEMPT_RT the softirq can be preempted by the consumer */
if (IS_ENABLED(CONFIG_PREEMPT_RT))
	return false;
```

This forces all page recycling on PREEMPT_RT to use the ring buffer path
(`ptr_ring`) which has proper locking, instead of the lockless direct
cache.

## Impact Assessment

**Functional Impact:** Zero - pages are still recycled correctly, just
through a different path

**Performance Impact:** Minimal - ring buffer operations are slightly
slower than direct cache access, but only affects PREEMPT_RT systems

**Risk:** Very low - the change is isolated, well-understood, and
follows established kernel patterns

## Backporting Suitability

1. **Clear bug fix:** Addresses a real race condition that could cause
   memory corruption
2. **Minimal and contained:** Single 4-line change with no dependencies
3. **High importance for affected systems:** Critical for PREEMPT_RT
   system stability
4. **Low regression risk:** Fallback to existing, proven ring buffer
   mechanism
5. **Follows established patterns:** Similar PREEMPT_RT guards exist
   throughout the networking stack

## Historical Context

Based on my analysis of the kernel repository, the page_pool
infrastructure has evolved significantly, with the direct caching
mechanism being added for performance optimization. The
`page_pool_napi_local()` function was introduced in commit 4a96a4e807c3
(Linux 6.9+) as part of the lockless caching optimization. This fix
addresses an oversight in that optimization where PREEMPT_RT preemption
semantics weren't considered.

## Conclusion

This commit represents exactly the type of fix that stable kernels
should include: a focused, low-risk correction of a race condition that
could cause system instability on specific configurations. While it only
affects PREEMPT_RT systems, the potential consequences (memory
corruption, crashes) are severe enough to warrant backporting to any
stable tree that supports PREEMPT_RT and contains the page_pool caching
infrastructure.
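
As a rough illustration of the routing described above, the following is
a simplified userspace model, not kernel code. Every name in it
(`model_pool`, `model_napi_local()`, `model_recycle()`, the `MODEL_*`
constants) is hypothetical, the `preempt_rt` field merely stands in for
`IS_ENABLED(CONFIG_PREEMPT_RT)`, and the locking that the real `ptr_ring`
path provides is not modeled. It only sketches how an early
`return false` keeps pages out of the lockless cache:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sizes, chosen only to keep the model small. */
#define MODEL_CACHE_SIZE 4
#define MODEL_RING_SIZE  8

/* Illustrative stand-in for the pool; not the kernel's struct page_pool. */
struct model_pool {
	void *cache[MODEL_CACHE_SIZE];	/* models the lockless direct cache */
	size_t cache_count;
	void *ring[MODEL_RING_SIZE];	/* models the locked ring buffer path */
	size_t ring_count;
	bool preempt_rt;		/* models IS_ENABLED(CONFIG_PREEMPT_RT) */
	bool softirq_on_napi_cpu;	/* collapses the remaining locality checks */
};

enum model_path { MODEL_PATH_CACHE, MODEL_PATH_RING, MODEL_PATH_DROPPED };

/* Mirrors the shape of the patched check: PREEMPT_RT is never "local". */
bool model_napi_local(const struct model_pool *pool)
{
	if (pool->preempt_rt)
		return false;

	return pool->softirq_on_napi_cpu;
}

/* Route a returned page to the cache only when direct access is allowed. */
enum model_path model_recycle(struct model_pool *pool, void *page)
{
	assert(pool != NULL);

	if (model_napi_local(pool) && pool->cache_count < MODEL_CACHE_SIZE) {
		pool->cache[pool->cache_count++] = page;
		return MODEL_PATH_CACHE;
	}

	if (pool->ring_count < MODEL_RING_SIZE) {
		pool->ring[pool->ring_count++] = page;
		return MODEL_PATH_RING;
	}

	return MODEL_PATH_DROPPED;
}
```

In this model, a pool with `preempt_rt` set always reports
`MODEL_PATH_RING` while the ring has room, whatever the softirq state,
whereas a pool without it can still take the `MODEL_PATH_CACHE` path.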

 net/core/page_pool.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index cca51aa2e876f..68e7962daa08f 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -801,6 +801,10 @@ static bool page_pool_napi_local(const struct page_pool *pool)
 	const struct napi_struct *napi;
 	u32 cpuid;
 
+	/* On PREEMPT_RT the softirq can be preempted by the consumer */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT))
+		return false;
+
 	if (unlikely(!in_softirq()))
 		return false;
 
-- 
2.39.5

