[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1620868260-32984-4-git-send-email-linyunsheng@huawei.com>
Date: Thu, 13 May 2021 09:11:00 +0800
From: Yunsheng Lin <linyunsheng@...wei.com>
To: <davem@...emloft.net>, <kuba@...nel.org>
CC: <olteanv@...il.com>, <ast@...nel.org>, <daniel@...earbox.net>,
<andriin@...com>, <edumazet@...gle.com>, <weiwan@...gle.com>,
<cong.wang@...edance.com>, <ap420073@...il.com>,
<netdev@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
<linuxarm@...neuler.org>, <mkl@...gutronix.de>,
<linux-can@...r.kernel.org>, <jhs@...atatu.com>,
<xiyou.wangcong@...il.com>, <jiri@...nulli.us>,
<andrii@...nel.org>, <kafai@...com>, <songliubraving@...com>,
<yhs@...com>, <john.fastabend@...il.com>, <kpsingh@...nel.org>,
<bpf@...r.kernel.org>, <jonas.bonn@...rounds.com>,
<pabeni@...hat.com>, <mzhivich@...mai.com>, <johunt@...mai.com>,
<albcamus@...il.com>, <kehuan.feng@...il.com>,
<a.fatoum@...gutronix.de>, <atenart@...nel.org>,
<alexander.duyck@...il.com>, <hdanton@...a.com>, <jgross@...e.com>,
<JKosina@...e.com>, <mkubecek@...e.cz>, <bjorn@...nel.org>,
<alobakin@...me>
Subject: [PATCH net v7 3/3] net: sched: fix tx action reschedule issue with stopped queue
The netdev qeueue might be stopped when byte queue limit has
reached or tx hw ring is full, net_tx_action() may still be
rescheduled endlessly if STATE_MISSED is set, which consumes
a lot of cpu without dequeuing and transmiting any skb because
the netdev queue is stopped, see qdisc_run_end().
This patch fixes it by checking the netdev queue state before
calling qdisc_run() and clearing STATE_MISSED if netdev queue is
stopped during qdisc_run(), the net_tx_action() is recheduled
again when netdev qeueue is restarted, see netif_tx_wake_queue().
As there is time window betewwn netif_xmit_frozen_or_stopped()
checking and STATE_MISSED clearing, between which STATE_MISSED
is set by net_tx_action() scheduled by netif_tx_wake_queue(),
so set the STATE_MISSED again if netdev queue is restarted.
Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking")
Reported-by: Michal Kubecek <mkubecek@...e.cz>
Signed-off-by: Yunsheng Lin <linyunsheng@...wei.com>
---
V7: Fix the netif_tx_wake_queue() data race noted by Jakub.
V6: Drop NET_XMIT_DROP checking for it is not really relevant
to this patch, and it may cause performance performance
regression with multi pktgen threads on dummy netdev with
pfifo_fast qdisc case.
---
net/core/dev.c | 3 ++-
net/sched/sch_generic.c | 27 ++++++++++++++++++++++++++-
2 files changed, 28 insertions(+), 2 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c
index d596cd7..ef8cf76 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3853,7 +3853,8 @@ static inline int __dev_xmit_skb(struct sk_buff *skb, struct Qdisc *q,
if (q->flags & TCQ_F_NOLOCK) {
rc = q->enqueue(skb, q, &to_free) & NET_XMIT_MASK;
- qdisc_run(q);
+ if (likely(!netif_xmit_frozen_or_stopped(txq)))
+ qdisc_run(q);
if (unlikely(to_free))
kfree_skb_list(to_free);
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index d86c4cc..b129d9b 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -35,6 +35,25 @@
const struct Qdisc_ops *default_qdisc_ops = &pfifo_fast_ops;
EXPORT_SYMBOL(default_qdisc_ops);
+static void qdisc_maybe_stop_tx(struct Qdisc *q,
+ const struct netdev_queue *txq)
+{
+ clear_bit(__QDISC_STATE_MISSED, &q->state);
+
+ /* Make sure the below netif_xmit_frozen_or_stopped()
+ * checking happens after clearing STATE_MISSED.
+ */
+ smp_mb__after_atomic();
+
+ /* Checking netif_xmit_frozen_or_stopped() again to
+ * make sure STATE_MISSED is set if the STATE_MISSED
+ * set by netif_tx_wake_queue()'s rescheduling of
+ * net_tx_action() is cleared by the above clear_bit().
+ */
+ if (!netif_xmit_frozen_or_stopped(txq))
+ set_bit(__QDISC_STATE_MISSED, &q->state);
+}
+
/* Main transmission queue. */
/* Modifications to data participating in scheduling must be protected with
@@ -74,6 +93,7 @@ static inline struct sk_buff *__skb_dequeue_bad_txq(struct Qdisc *q)
}
} else {
skb = SKB_XOFF_MAGIC;
+ qdisc_maybe_stop_tx(q, txq);
}
}
@@ -242,6 +262,7 @@ static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate,
}
} else {
skb = NULL;
+ qdisc_maybe_stop_tx(q, txq);
}
if (lock)
spin_unlock(lock);
@@ -251,8 +272,10 @@ static struct sk_buff *dequeue_skb(struct Qdisc *q, bool *validate,
*validate = true;
if ((q->flags & TCQ_F_ONETXQUEUE) &&
- netif_xmit_frozen_or_stopped(txq))
+ netif_xmit_frozen_or_stopped(txq)) {
+ qdisc_maybe_stop_tx(q, txq);
return skb;
+ }
skb = qdisc_dequeue_skb_bad_txq(q);
if (unlikely(skb)) {
@@ -311,6 +334,8 @@ bool sch_direct_xmit(struct sk_buff *skb, struct Qdisc *q,
HARD_TX_LOCK(dev, txq, smp_processor_id());
if (!netif_xmit_frozen_or_stopped(txq))
skb = dev_hard_start_xmit(skb, dev, txq, &ret);
+ else
+ qdisc_maybe_stop_tx(q, txq);
HARD_TX_UNLOCK(dev, txq);
} else {
--
2.7.4
Powered by blists - more mailing lists