Message-ID: <20220716002612.rd6ir65njzc2g3cc@skbuf>
Date: Sat, 16 Jul 2022 00:26:13 +0000
From: Vladimir Oltean <vladimir.oltean@....com>
To: Jakub Kicinski <kuba@...nel.org>
CC: "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew@...n.ch>,
Vivien Didelot <vivien.didelot@...il.com>,
Florian Fainelli <f.fainelli@...il.com>,
Jonathan Toppins <jtoppins@...hat.com>,
Jay Vosburgh <j.vosburgh@...il.com>,
Veaceslav Falico <vfalico@...il.com>,
Hangbin Liu <liuhangbin@...il.com>,
Brian Hutchinson <b.hutchman@...il.com>
Subject: Re: [PATCH net] net: dsa: fix bonding with ARP monitoring by updating trans_start manually
On Fri, Jul 15, 2022 at 05:19:59PM -0700, Jakub Kicinski wrote:
> On Sat, 16 Jul 2022 00:14:44 +0000 Vladimir Oltean wrote:
> > On Fri, Jul 15, 2022 at 05:00:42PM -0700, Jakub Kicinski wrote:
> > > On Sat, 16 Jul 2022 02:26:41 +0300 Vladimir Oltean wrote:
> > > > Documentation/networking/bonding.rst points out that for ARP monitoring
> > > > to work, dev_trans_start() must be able to verify the latest trans_start
> > > > update of any slave_dev TX queue. However, with NETIF_F_LLTX,
> > > > netdev_start_xmit() -> txq_trans_update() fails to do anything, because
> > > > the TX queue hasn't been locked.
> > > >
> > > > Fix this by manually updating the current TX queue's trans_start for
> > > > each packet sent.
> > > >
> > > > Fixes: 2b86cb829976 ("net: dsa: declare lockless TX feature for slave ports")
> > > > Reported-by: Brian Hutchinson <b.hutchman@...il.com>
> > > > Signed-off-by: Vladimir Oltean <vladimir.oltean@....com>
> > >
> > > Did you see my discussion with Jay? Let's stop the spread of this
> > > workaround, I'm tossing..
> >
> > No, I didn't, could you summarize the alternative proposal?
>
> Make bonding not depend on a field which is only valid for HW devices
> which use the Tx watchdog. Let me find the thread...
> https://lore.kernel.org/all/20220621213823.51c51326@kernel.org/
That won't work in the general case with dsa_slave_get_stats64(), which
may take the stats either from hardware (where they are delayed) or from
dev_get_tstats64(). Besides, ARP monitoring used to work before the
commit I blamed, so this is a targeted fix for a regression.
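
To make the mechanism from the commit message concrete, here is a minimal
userspace model of it, not the patch itself. All model_* names are
hypothetical stand-ins invented for illustration; only the check on
xmit_lock_owner is meant to mirror what txq_trans_update() does. It assumes
a single TX queue and represents time as a plain counter.

```c
#include <stdbool.h>

/* Simplified stand-in for struct netdev_queue (illustration only). */
struct model_txq {
	int xmit_lock_owner;       /* -1 while nobody holds the TX lock */
	unsigned long trans_start; /* time of the last recorded transmit */
};

/* Mirrors the lock-dependent update: a no-op when the queue is unlocked. */
static void model_txq_trans_update(struct model_txq *txq, unsigned long now)
{
	if (txq->xmit_lock_owner != -1)
		txq->trans_start = now;
}

/* Unconditional update, i.e. the "manual" per-packet update being proposed. */
static void model_txq_trans_update_manual(struct model_txq *txq,
					  unsigned long now)
{
	txq->trans_start = now;
}

/*
 * Model of one transmit. For a lockless (LLTX) device the core does not
 * take the TX lock, so the lock-dependent update silently does nothing.
 */
static void model_xmit(struct model_txq *txq, unsigned long now,
		       bool lltx, bool manual)
{
	if (!lltx)
		txq->xmit_lock_owner = 0; /* pretend CPU 0 took the lock */

	model_txq_trans_update(txq, now);

	if (manual)
		model_txq_trans_update_manual(txq, now);

	if (!lltx)
		txq->xmit_lock_owner = -1;
}
```

In this model, an LLTX transmit without the manual update leaves trans_start
stale, which is the state the bonding ARP monitor would then observe.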