Message-Id: <20220516042456.3014395-2-eric.dumazet@gmail.com>
Date: Sun, 15 May 2022 21:24:53 -0700
From: Eric Dumazet <eric.dumazet@...il.com>
To: "David S . Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>
Cc: netdev <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
Eric Dumazet <eric.dumazet@...il.com>
Subject: [PATCH net-next 1/4] net: fix possible race in skb_attempt_defer_free()
From: Eric Dumazet <edumazet@...gle.com>
A cpu can observe sd->defer_count reaching 128
and call smp_call_function_single_async().

The problem is that the remote CPU can clear sd->defer_count
before the IPI is run/acknowledged.

Other cpus can then queue more packets and also decide
to call smp_call_function_single_async() while the pending
IPI has not yet been delivered.
This is a common issue with smp_call_function_single_async().
Callers must ensure correct synchronization and serialization.
I triggered this issue while experimenting with a smaller threshold.
Performing the call to smp_call_function_single_async()
under sd->defer_lock protection did not solve the problem.
Commit 5a18ceca6350 ("smp: Allow smp_call_function_single_async()
to insert locked csd") replaced an informative WARN_ON_ONCE()
with a return of -EBUSY, which is often ignored.
Testing for CSD_FLAG_LOCK presence is racy anyway.
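Instead, serialize the IPIs with a new per-softnet_data flag,
sd->defer_ipi_scheduled. A minimal sketch of the pattern (simplified
from the diff below, relying on kernel primitives, not a standalone
implementation):

	/* Sender side: only the cpu that flips the flag 0 -> 1
	 * actually sends the IPI; everyone else backs off.
	 */
	if (unlikely(kick) && !cmpxchg(&sd->defer_ipi_scheduled, 0, 1))
		smp_call_function_single_async(cpu, &sd->defer_csd);

	/* Remote cpu, from the IPI handler: raise the softirq first,
	 * then release the flag so a later sender can arm a new IPI.
	 */
	static void trigger_rx_softirq(void *data)
	{
		struct softnet_data *sd = data;

		__raise_softirq_irqoff(NET_RX_SOFTIRQ);
		smp_store_release(&sd->defer_ipi_scheduled, 0);
	}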
Fixes: 68822bdf76f1 ("net: generalize skb freeing deferral to per-cpu lists")
Signed-off-by: Eric Dumazet <edumazet@...gle.com>
---
include/linux/netdevice.h | 1 +
net/core/dev.c | 7 +++++--
net/core/skbuff.c | 5 ++---
3 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 536321691c725ebd311088f4654dd04b9abbaaef..89699a299ba1b0b544b7abb782708d827abe55ac 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3127,6 +3127,7 @@ struct softnet_data {
/* Another possibly contended cache line */
spinlock_t defer_lock ____cacheline_aligned_in_smp;
int defer_count;
+ int defer_ipi_scheduled;
struct sk_buff *defer_list;
call_single_data_t defer_csd;
};
diff --git a/net/core/dev.c b/net/core/dev.c
index a601da3b4a7c800801f763f097f00f3a3b591107..d708f95356e0b03a61e8211adcf6d272dfa322b5 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4581,9 +4581,12 @@ static void rps_trigger_softirq(void *data)
#endif /* CONFIG_RPS */
/* Called from hardirq (IPI) context */
-static void trigger_rx_softirq(void *data __always_unused)
+static void trigger_rx_softirq(void *data)
{
+ struct softnet_data *sd = data;
+
__raise_softirq_irqoff(NET_RX_SOFTIRQ);
+ smp_store_release(&sd->defer_ipi_scheduled, 0);
}
/*
@@ -11381,7 +11384,7 @@ static int __init net_dev_init(void)
INIT_CSD(&sd->csd, rps_trigger_softirq, sd);
sd->cpu = i;
#endif
- INIT_CSD(&sd->defer_csd, trigger_rx_softirq, NULL);
+ INIT_CSD(&sd->defer_csd, trigger_rx_softirq, sd);
spin_lock_init(&sd->defer_lock);
init_gro_hash(&sd->backlog);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index bd16e158b3668f496a9d2c8e8b6f3433a326314c..1e2180682f2e94c45e3f26059af6d18be2d9f9d3 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6514,8 +6514,7 @@ void skb_attempt_defer_free(struct sk_buff *skb)
sd->defer_count++;
/* kick every time queue length reaches 128.
- * This should avoid blocking in smp_call_function_single_async().
- * This condition should hardly be bit under normal conditions,
+ * This condition should hardly be hit under normal conditions,
* unless cpu suddenly stopped to receive NIC interrupts.
*/
kick = sd->defer_count == 128;
@@ -6525,6 +6524,6 @@ void skb_attempt_defer_free(struct sk_buff *skb)
/* Make sure to trigger NET_RX_SOFTIRQ on the remote CPU
* if we are unlucky enough (this seems very unlikely).
*/
- if (unlikely(kick))
+ if (unlikely(kick) && !cmpxchg(&sd->defer_ipi_scheduled, 0, 1))
smp_call_function_single_async(cpu, &sd->defer_csd);
}
--
2.36.0.550.gb090851708-goog