Message-Id: <20220727152000.3616086-1-vladimir.oltean@nxp.com>
Date: Wed, 27 Jul 2022 18:20:00 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: netdev@...r.kernel.org
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...il.com>,
Florian Fainelli <f.fainelli@...il.com>,
Jonathan Toppins <jtoppins@...hat.com>,
Jay Vosburgh <j.vosburgh@...il.com>,
Veaceslav Falico <vfalico@...il.com>,
Hangbin Liu <liuhangbin@...il.com>,
Jamal Hadi Salim <jhs@...atatu.com>,
Cong Wang <xiyou.wangcong@...il.com>,
Jiri Pirko <jiri@...nulli.us>,
Brian Hutchinson <b.hutchman@...il.com>
Subject: [PATCH v2 net] net/sched: make dev_trans_start() have a better chance of working with stacked interfaces
Documentation/networking/bonding.rst points out that for ARP monitoring
to work, dev_trans_start() must be able to verify the latest trans_start
update of any slave_dev TX queue. However, with NETIF_F_LLTX, the TX
queue is never locked, so txq_trans_update() never refreshes trans_start
and dev_trans_start() has nothing meaningful to report.
DSA has declared NETIF_F_LLTX to be in line with other stackable
interfaces, and this has introduced a regression in the form of breaking
ARP monitoring with bonding.
There is a workaround already in place in dev_trans_start() to fix just
this kind of breakage for the specific cases of vlan and macvlan. Since
DSA doesn't export any flag which says "this interface is DSA", or "this
interface's master is this device", we need to generalize this logic by
traversing the netdev adjacency lists, so that DSA is also covered.
Link to the discussion on a previous approach:
https://patchwork.kernel.org/project/netdevbpf/patch/20220715232641.952532-1-vladimir.oltean@nxp.com/
Fixes: 2b86cb829976 ("net: dsa: declare lockless TX feature for slave ports")
Reported-by: Brian Hutchinson <b.hutchman@...il.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
---
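For readers who want to see the idea outside the kernel, the walk described
in the commit message can be modeled in plain userspace C. This is only a
sketch under stated assumptions: struct model_dev, its first_lower pointer,
model_time_after(), model_dev_trans_start() and model_demo() are hypothetical
stand-ins invented for illustration, not kernel types or APIs, and the model
ignores RCU and the real adjacency list iterators.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct net_device (not the kernel type). */
struct model_dev {
	const char *name;
	struct model_dev *first_lower;	/* first lower adjacency; NULL for a physical device */
	unsigned int num_tx_queues;
	unsigned long trans_start[4];	/* per-queue timestamp; 0 means "never transmitted" */
};

/* Wraparound-safe "a is later than b", the same idea as the kernel's time_after(). */
static int model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/* Descend to the lowest device in the stack, then report its newest trans_start. */
static unsigned long model_dev_trans_start(struct model_dev *dev)
{
	unsigned long val, res;
	unsigned int i;

	assert(dev != NULL && dev->num_tx_queues >= 1);

	/* Always follow the first lower device until there is none left. */
	while (dev->first_lower)
		dev = dev->first_lower;

	res = dev->trans_start[0];
	for (i = 1; i < dev->num_tx_queues; i++) {
		val = dev->trans_start[i];
		if (val && model_time_after(val, res))
			res = val;
	}

	return res;
}

/* Example stack: VLAN on top of a DSA port on top of a physical master. */
static unsigned long model_demo(void)
{
	struct model_dev eth0 = { "eth0", NULL, 2, { 1000, 1500 } };
	struct model_dev swp0 = { "swp0", &eth0, 1, { 0 } };
	struct model_dev vlan = { "swp0.10", &swp0, 1, { 0 } };

	return model_dev_trans_start(&vlan);
}
```

In this model, model_demo() returns 1500: both virtual devices never update
their own timestamps, so the value comes from the newest queue of eth0.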
net/sched/sch_generic.c | 37 +++++++++++++++++++++++++++++++++----
1 file changed, 33 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index cc6eabee2830..fb964e2b0436 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -427,20 +427,49 @@ void __qdisc_run(struct Qdisc *q)
unsigned long dev_trans_start(struct net_device *dev)
{
+ struct net_device *lower;
+ struct list_head *iter;
unsigned long val, res;
+ bool have_lowers;
unsigned int i;
- if (is_vlan_dev(dev))
- dev = vlan_dev_real_dev(dev);
- else if (netif_is_macvlan(dev))
- dev = macvlan_dev_real_dev(dev);
+ rcu_read_lock();
+
+ /* Stacked network interfaces usually have NETIF_F_LLTX so
+ * netdev_start_xmit() -> txq_trans_update() fails to do anything,
+ * because they don't lock the TX queue. Calling dev_trans_start() on a
+ * virtual device makes little sense, since it is a mechanism intended
+ * for the TX watchdog. That notwithstanding, layers such as the
+ * bonding ARP monitor may still use dev_trans_start() on slave
+ * interfaces, probably to see if any transmission took place in the
+ * last ARP interval. This use is antiquated, but we don't know
+ * what to replace it with. While we can't solve the general case of
+ * virtual interfaces, for stackable ones (vlan, macvlan, DSA or
+ * potentially stacked combinations), we can work around by returning
+ * the trans_start of the physical, real device backing them. In this
+ * case, walk the adjacency lists all the way down, hoping that the
+ * lower-most device won't have NETIF_F_LLTX.
+ */
+ do {
+ have_lowers = false;
+
+ netdev_for_each_lower_dev(dev, lower, iter) {
+ have_lowers = true;
+ dev = lower;
+ break;
+ }
+ } while (have_lowers);
+
res = READ_ONCE(netdev_get_tx_queue(dev, 0)->trans_start);
+
for (i = 1; i < dev->num_tx_queues; i++) {
val = READ_ONCE(netdev_get_tx_queue(dev, i)->trans_start);
if (val && time_after(val, res))
res = val;
}
+ rcu_read_unlock();
+
return res;
}
EXPORT_SYMBOL(dev_trans_start);
--
2.34.1