Message-ID: <20250127025734.3406167-3-linyunsheng@huawei.com>
Date: Mon, 27 Jan 2025 10:57:31 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: <davem@...emloft.net>, <kuba@...nel.org>, <pabeni@...hat.com>
CC: <zhangkun09@...wei.com>, <liuyonglong@...wei.com>,
<fanghaiqing@...wei.com>, Yunsheng Lin <linyunsheng@...wei.com>, Alexander
Lobakin <aleksander.lobakin@...el.com>, Xuan Zhuo
<xuanzhuo@...ux.alibaba.com>, Jesper Dangaard Brouer <hawk@...nel.org>, Ilias
Apalodimas <ilias.apalodimas@...aro.org>, Eric Dumazet <edumazet@...gle.com>,
Simon Horman <horms@...nel.org>, <netdev@...r.kernel.org>,
<linux-kernel@...r.kernel.org>
Subject: [RFC v8 2/5] page_pool: fix timing for checking and disabling napi_local

A page_pool page may be freed from skb_defer_free_flush() in softirq
context without being bound to any specific napi, which may cause a
use-after-free problem due to the time window below: CPU1 may still
access napi->list_owner after CPU0 has freed the napi memory:
           CPU 0                                   CPU1
     page_pool_destroy()                  skb_defer_free_flush()
            .                                       .
            .                        napi = READ_ONCE(pool->p.napi);
            .                                       .
page_pool_disable_direct_recycling()                .
    driver free napi memory                         .
            .                                       .
            .              napi && READ_ONCE(napi->list_owner) == cpuid
            .                                       .
Use the RCU mechanism to avoid the above problem.
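
As a rough sketch (not the actual kernel code; reader_side() and
destroy_side() are made-up names used only for illustration), the
pairing between the softirq reader and the destroy path looks like
this:

  /* Softirq reader, e.g. reached from skb_defer_free_flush(); softirq
   * context is implicitly an RCU-bh read-side critical section.
   */
  static bool reader_side(const struct page_pool *pool, unsigned int cpuid)
  {
          const struct napi_struct *napi;

          napi = READ_ONCE(pool->p.napi);

          return napi && READ_ONCE(napi->list_owner) == cpuid;
  }

  /* Destroy path run before the driver frees the napi instance. */
  static void destroy_side(struct page_pool *pool)
  {
          /* Clears pool->p.napi so that new readers see NULL. */
          page_pool_disable_direct_recycling(pool);

          /* Wait for in-flight softirq readers to leave their RCU-bh
           * read-side critical sections, so none of them can still be
           * dereferencing the old napi pointer once the driver frees
           * the napi memory.
           */
          synchronize_net();
  }

With this ordering, by the time synchronize_net() returns every reader
that could have fetched the old napi pointer has finished, so the
driver can free the napi instance safely.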
Note, the above was found during code review of how to fix the
problem in [1].

As the following IOMMU fix patch depends on the RCU synchronization
added in this patch, and the time window is so small that this does
not seem to be an urgent fix, target net-next as the IOMMU fix patch
does.

1. https://lore.kernel.org/lkml/8067f204-1380-4d37-8ffd-007fc6f26738@kernel.org/T/
Fixes: dd64b232deb8 ("page_pool: unlink from napi during destroy")
Signed-off-by: Yunsheng Lin <linyunsheng@...wei.com>
CC: Alexander Lobakin <aleksander.lobakin@...el.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
---
net/core/page_pool.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index f89cf93f6eb4..713502e8e8c9 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -810,6 +810,11 @@ static bool page_pool_napi_local(const struct page_pool *pool)
 	if (READ_ONCE(pool->cpuid) == cpuid)
 		return true;
 
+	/* The in_softirq() check above should ensure we are in an RCU-bh
+	 * read-side critical section, which is paired with the RCU sync
+	 * in page_pool_destroy().
+	 */
+	DEBUG_NET_WARN_ON_ONCE(!rcu_read_lock_bh_held());
 	napi = READ_ONCE(pool->p.napi);
 
 	return napi && READ_ONCE(napi->list_owner) == cpuid;
@@ -1126,6 +1131,12 @@ void page_pool_destroy(struct page_pool *pool)
 	if (!page_pool_release(pool))
 		return;
 
+	/* Paired with the RCU-bh read-side critical section to ensure the
+	 * clearing of pool->p.napi in page_pool_disable_direct_recycling()
+	 * is seen before returning to the driver to free the napi instance.
+	 */
+	synchronize_net();
+
 	page_pool_detached(pool);
 	pool->defer_start = jiffies;
 	pool->defer_warn = jiffies + DEFER_WARN_INTERVAL;
--
2.33.0