Message-ID: <CANn89iJuEVe72bPmEftyEJHLzzN=QNR2yueFjTxYXCEpS5S8HQ@mail.gmail.com>
Date: Thu, 20 Apr 2023 13:16:54 +0200
From: Eric Dumazet <edumazet@...gle.com>
To: Jesper Dangaard Brouer <brouer@...hat.com>
Cc: netdev@...r.kernel.org, pabeni@...hat.com, kuba@...nel.org,
hawk@...nel.org, davem@...emloft.net, lorenzo@...nel.org
Subject: Re: [PATCH net-next V1] net: flush sd->defer_list on unregister_netdevice
On Thu, Apr 20, 2023 at 12:06 PM Jesper Dangaard Brouer
<brouer@...hat.com> wrote:
>
> When removing a net_device (that use NAPI), the sd->defer_list
Why "(that use NAPI)" is relevant ?
> system can still hold on to SKBs that have a dst_entry which can
> have a netdev_hold reference.
This would be quite a bug, really. What makes you think this?
In order to validate this, please post a stack trace if you get a
warning from this debug patch.
(Make sure to build with CONFIG_DEBUG_NET=y.)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 768f9d04911fb16a40e057aec7bfa2381a40d7a7..56c79bf922ab4045984513de13cc946b84fd3675 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6828,6 +6828,7 @@ nodefer: __kfree_skb(skb);
 		return;
 	}
 
+	DEBUG_NET_WARN_ON_ONCE(skb_dst(skb));
 	sd = &per_cpu(softnet_data, cpu);
 	defer_max = READ_ONCE(sysctl_skb_defer_max);
 	if (READ_ONCE(sd->defer_count) >= defer_max)
>
> Choose simple solution of flushing the softnet_data defer_list
> system as part of unregister_netdevice flush_all_backlogs().
I do not know if adding two conditional tests in the fast path is
worth the pain.
I was thinking of simplifying the rules for defer_lock acquisition anyway [1].
If you plan a V2, I would advise blocking BH before calling
skb_defer_free_flush(); see the sketch after [1] below.
[1]
diff --git a/net/core/dev.c b/net/core/dev.c
index 3fc4dba71f9dd250c59c0a070566791f0cd27ec4..be68b8f7fe16658f3304696bf3c83df4c8af51c7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6631,11 +6631,11 @@ static void skb_defer_free_flush(struct softnet_data *sd)
 	if (!READ_ONCE(sd->defer_list))
 		return;
 
-	spin_lock_irq(&sd->defer_lock);
+	spin_lock(&sd->defer_lock);
 	skb = sd->defer_list;
 	sd->defer_list = NULL;
 	sd->defer_count = 0;
-	spin_unlock_irq(&sd->defer_lock);
+	spin_unlock(&sd->defer_lock);
 
 	while (skb != NULL) {
 		next = skb->next;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 768f9d04911fb16a40e057aec7bfa2381a40d7a7..27f6e0042dfe0cbbb362d06b540b3ca1567459af 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -6817,7 +6817,6 @@ void skb_attempt_defer_free(struct sk_buff *skb)
 {
 	int cpu = skb->alloc_cpu;
 	struct softnet_data *sd;
-	unsigned long flags;
 	unsigned int defer_max;
 	bool kick;
 
@@ -6833,7 +6832,10 @@ nodefer: __kfree_skb(skb);
 	if (READ_ONCE(sd->defer_count) >= defer_max)
 		goto nodefer;
 
-	spin_lock_irqsave(&sd->defer_lock, flags);
+	/* Strictly speaking, we do not need to block BH to acquire
+	 * this spinlock, but lockdep disagrees (unless we add classes)
+	 */
+	spin_lock_bh(&sd->defer_lock);
 	/* Send an IPI every time queue reaches half capacity. */
 	kick = sd->defer_count == (defer_max >> 1);
 	/* Paired with the READ_ONCE() few lines above */
@@ -6842,7 +6844,7 @@ nodefer: __kfree_skb(skb);
 	skb->next = sd->defer_list;
 	/* Paired with READ_ONCE() in skb_defer_free_flush() */
 	WRITE_ONCE(sd->defer_list, skb);
-	spin_unlock_irqrestore(&sd->defer_lock, flags);
+	spin_unlock_bh(&sd->defer_lock);
 
 	/* Make sure to trigger NET_RX_SOFTIRQ on the remote CPU
 	 * if we are unlucky enough (this seems very unlikely).
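
To make the BH advice concrete, a V2 could wrap the flush roughly like
this (an untested sketch on top of [1]; the helper name and its exact
placement in flush_all_backlogs() are only placeholders):

/* Hypothetical sketch only: flush one CPU's defer_list from process
 * context (e.g. from flush_all_backlogs()). Blocking BH keeps the
 * plain spin_lock() in skb_defer_free_flush() consistent with the
 * softirq users of sd->defer_lock, and gives napi_consume_skb() a
 * BH-safe context to free the skbs in.
 */
static void skb_defer_free_flush_cpu(int cpu)
{
	struct softnet_data *sd = &per_cpu(softnet_data, cpu);

	local_bh_disable();
	skb_defer_free_flush(sd);
	local_bh_enable();
}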