[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <24e1144d-2dcc-6b33-eec0-f668d51c7a80@redhat.com>
Date: Mon, 25 Jul 2022 17:39:33 -0400
From: Jonathan Toppins <jtoppins@...hat.com>
To: Vladimir Oltean <vladimir.oltean@....com>,
Jakub Kicinski <kuba@...nel.org>
Cc: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...il.com>,
Florian Fainelli <f.fainelli@...il.com>,
Jay Vosburgh <j.vosburgh@...il.com>,
Veaceslav Falico <vfalico@...il.com>,
Hangbin Liu <liuhangbin@...il.com>,
Brian Hutchinson <b.hutchman@...il.com>
Subject: Re: [PATCH net] net: dsa: fix bonding with ARP monitoring by updating
trans_start manually
On 7/25/22 16:31, Vladimir Oltean wrote:
> Hi Jakub,
>
> On Sat, Jul 16, 2022 at 04:33:38PM -0700, Jakub Kicinski wrote:
>> On Sat, 16 Jul 2022 13:30:10 +0000 Vladimir Oltean wrote:
>>> I would need some assistance from Jay or other people more familiar with
>>> bonding to do that. I'm not exactly clear which packets the bonding
>>> driver wants to check they have been transmitted in the last interval:
>>> ARP packets? any packets?
>>
>> And why - stack has queued a packet to the driver, how is that useful
>> to assess the fact that the link is up? Can bonding check instead
>> whether any queue is running? Or maybe the whole thing is just a remnant
>> of times before the centralized watchdog tx and should be dropped?
>
> Yes, indeed, why? I don't know.
>
>>> With DSA and switchdev drivers in general,
>>> they have an offloaded forwarding path as well, so expect that what you
>>> get through ndo_get_stats64 may also report packets which egressed a
>>> physical port but weren't originated by the network stack.
>>> I simply don't know what is a viable substitute for dev_trans_start()
>>> because I don't understand very well what it intends to capture.
>>
>> Looking thru the code I stumbled on the implementation of
>> dev_trans_start() - it already takes care of some upper devices.
>> Adding DSA handling there would offend my sensibilities slightly
>> less, if you want to do that. At least it's not on the fast path.
>
> How are your sensibilities feeling about this change?
>
> diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
> index cc6eabee2830..b844eb0bde7e 100644
> --- a/net/sched/sch_generic.c
> +++ b/net/sched/sch_generic.c
> @@ -427,20 +427,43 @@ void __qdisc_run(struct Qdisc *q)
>
> unsigned long dev_trans_start(struct net_device *dev)
> {
> + struct net_device *lower;
> + struct list_head *iter;
> unsigned long val, res;
> + bool have_lowers;
> unsigned int i;
>
> - if (is_vlan_dev(dev))
> - dev = vlan_dev_real_dev(dev);
> - else if (netif_is_macvlan(dev))
> - dev = macvlan_dev_real_dev(dev);
> + rcu_read_lock();
> +
> + /* Stacked network interfaces usually have NETIF_F_LLTX so
> + * netdev_start_xmit() -> txq_trans_update() fails to do anything,
> + * because they don't lock the TX queue. However, layers such as the
> + * bonding arp monitor may still use dev_trans_start() on slave
> + * interfaces such as vlan, macvlan, DSA (or potentially stacked
> + * combinations of the above). In this case, walk the adjacency lists
> + * all the way down, hoping that the lower-most device won't have
> + * NETIF_F_LLTX.
> + */
I am not sure this assumption holds for virtual devices like veth unless
the virtual interfaces are updated to modify trans_start, see
e66e257a5d83 ("veth: Add updating of trans_start")
And not all the virtual interfaces update trans_start iirc.
> + do {
> + have_lowers = false;
> +
> + netdev_for_each_lower_dev(dev, lower, iter) {
> + have_lowers = true;
> + dev = lower;
> + break;
> + }
> + } while (have_lowers);
> +
> res = READ_ONCE(netdev_get_tx_queue(dev, 0)->trans_start);
> +
> for (i = 1; i < dev->num_tx_queues; i++) {
> val = READ_ONCE(netdev_get_tx_queue(dev, i)->trans_start);
> if (val && time_after(val, res))
> res = val;
> }
>
> + rcu_read_unlock();
> +
> return res;
> }
> EXPORT_SYMBOL(dev_trans_start);
>
Powered by blists - more mailing lists