Message-ID: <CO2PR11MB0088E7024FA29C309D20B655973A0@CO2PR11MB0088.namprd11.prod.outlook.com>
Date: Wed, 6 Jul 2016 06:42:57 +0000
From: Yuval Mintz <Yuval.Mintz@...gic.com>
To: Saeed Mahameed <saeedm@....mellanox.co.il>,
Eric Dumazet <eric.dumazet@...il.com>
CC: Saeed Mahameed <saeedm@...lanox.com>,
David Miller <davem@...emloft.net>,
netdev <netdev@...r.kernel.org>,
Eric Dumazet <edumazet@...gle.com>,
"Tom Herbert" <tom@...bertland.com>,
Mohamad Haj Yahia <mohamad@...lanox.com>
Subject: RE: [PATCH net] net: poll tx timeout only on active tx queues
> >> > currently all the device driver call
> >> > netif_tx_start_all_queues(dev) on open to W/A this issue. which is
> >> > strange since only real_num_tx_queues are active.
> >>
> >> You could also argue that netif_tx_start_all_queues() should only
> >> enable the real_num_tx_queues.
> >> [Although that would obviously cause all drivers to reach the
> >> 'problem' you're currently fixing].
> >
> > Yep. Basically what I pointed out.
> >
> > It seems inconsistent to have loops using num_tx_queues, and others
> > using real_num_tx_queues.
> >
> > Instead of 'fixing' one of them, we should take a deeper look, even if
> > the change looks fine.
> >
> > num_tx_queues should be used in code that runs once, like
> > netdev_lockdep_set_classes(), but other loops should probably use
> > real_num_tx_queues.
> >
> > Anyway all these changes should definitely target net-next, not net
> > tree.
> >
>
> But for the long term, you have a point.
> We will consider a deeper fix for net-next as you suggested, and drop this
> temporary fix.
I think we've actually managed to hit an issue with qede [& a modified bnx2x]
due to netif_tx_start_all_queues() starting all Tx-queues:
when the number of channels on an interface is reduced, the driver reloads,
after which the xmit function receives an SKB with a too-high txq.
Investigation seems to indicate that some TCP traffic arrived during the
reload, got enqueued on the qdisc with a high txq, and then got transmitted
as-is after tx was re-enabled.
[Removing the modulo from bnx2x's select_queue() led to the same issue.]
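
To illustrate the failure mode described above, here is a minimal user-space
sketch, not driver code. struct fake_skb, clamp_txq() and txq_is_valid() are
hypothetical names made up for this example; only the idea of a modulo against
real_num_tx_queues comes from the discussion. It assumes a packet was enqueued
with a queue index chosen before the channel count was reduced.

```c
#include <stdint.h>

/* Hypothetical stand-in for a queued packet; NOT the kernel's struct sk_buff. */
struct fake_skb {
    uint16_t queue_mapping; /* txq index chosen when the packet was enqueued */
};

/*
 * Returns 1 if txq can safely index an array that the driver sized by
 * real_num_tx_queues, 0 otherwise.
 */
static int txq_is_valid(uint16_t txq, uint16_t real_num_tx_queues)
{
    return txq < real_num_tx_queues;
}

/*
 * Fold a possibly stale txq index back into the active range, in the
 * spirit of a modulo in a select_queue() callback. With no active
 * queues there is nothing sensible to map to, so fall back to 0.
 */
static uint16_t clamp_txq(uint16_t txq, uint16_t real_num_tx_queues)
{
    if (real_num_tx_queues == 0)
        return 0;
    return (uint16_t)(txq % real_num_tx_queues);
}
```

For example, a packet enqueued on txq 6 while 8 queues were active fails
txq_is_valid() once the interface is reloaded with 4 queues, and clamp_txq()
maps it to queue 2. Without such a fold, the stale index reaches the xmit path
unchanged.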