Message-ID: <9848F2DB572E5649BA045B288BE08FBE015D5D24@039-SN2MPN1-023.039d.mgd.msft.net>
Date: Mon, 29 Jul 2013 02:14:10 +0000
From: Duan Fugang-B38611 <B38611@...escale.com>
To: Ben Hutchings <bhutchings@...arflare.com>
CC: Stephen Hemminger <stephen@...workplumber.org>,
Uwe Kleine-König
<u.kleine-koenig@...gutronix.de>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
"David S. Miller" <davem@...emloft.net>,
Estevam Fabio-R49496 <r49496@...escale.com>,
Li Frank-B20596 <B20596@...escale.com>,
Shawn Guo <shawn.guo@...aro.org>,
"kernel@...gutronix.de" <kernel@...gutronix.de>,
Hector Palacios <hector.palacios@...i.com>,
Tim Sander <tim.sander@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Thomas Gleixner <tglx@...utronix.de>
Subject: RE: [PATCH] net/fec: call netif_carrier_off when not having link
On Fri, 2013-07-26 at 11:32 PM, Ben Hutchings wrote:
>netif_stop_queue() *must not* be called before netif_carrier_off(), otherwise the TX watchdog can fire immediately.
>The TX watchdog only knows when the last packet was passed to the driver, not when the queue was stopped.
>The last packet could have been added an arbitrarily long time before the link went down, so it may appear that the timeout has already expired.
>
>Although it is safe to call netif_stop_queue() after netif_carrier_off(), it is not useful.
>netif_stop_queue() should only be called from your ndo_start_xmit operation and only because the queue is full.
>Any other reason to stop should be communicated to the kernel using netif_carrier_off() or netif_device_detach().
>
>Ben.
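To make the rule above concrete, here is a minimal user-space sketch of the pattern Ben describes: the queue is stopped only from the transmit path and only because the ring is full, while a link drop is reported through the carrier state alone. It is a model, not driver code. The struct, the ring size, and the model_* function names are all hypothetical; the comments name the kernel calls each step stands in for. Waking the queue on TX completion (netif_wake_queue()) is not mentioned in the quoted text and is included only to complete the picture.

```c
#include <stdbool.h>

#define MODEL_RING_SIZE 4	/* hypothetical hardware TX ring depth */

/* Hypothetical stand-in for the per-device TX state a driver tracks. */
struct model_txq {
	unsigned int in_flight;	/* descriptors handed to hardware, not yet completed */
	bool stopped;		/* software queue stopped */
	bool carrier_ok;	/* link reported up */
};

/* Stands in for ndo_start_xmit: returns true if the packet was accepted. */
static bool model_start_xmit(struct model_txq *q)
{
	if (q->stopped || q->in_flight == MODEL_RING_SIZE)
		return false;
	q->in_flight++;
	/* The only place the queue is stopped: the ring just became full. */
	if (q->in_flight == MODEL_RING_SIZE)
		q->stopped = true;	/* netif_stop_queue() */
	return true;
}

/* Stands in for the TX completion handler. */
static void model_tx_complete(struct model_txq *q)
{
	if (q->in_flight > 0)
		q->in_flight--;
	if (q->stopped && q->in_flight < MODEL_RING_SIZE)
		q->stopped = false;	/* netif_wake_queue() */
}

/* Stands in for the link-change handler when the link drops. */
static void model_link_down(struct model_txq *q)
{
	/* Report the link state only; the queue state is left alone. */
	q->carrier_ok = false;		/* netif_carrier_off() */
}
```

In this model the link-down path never touches the stopped flag, which mirrors the point that netif_stop_queue() after netif_carrier_off() is safe but adds nothing.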
Agreed.
I remember you said earlier:
The watchdog fires when the software queue has been stopped *and* the link has been reported as up for over dev->watchdog_timeo ticks.
The software queue should be stopped if the hardware queue is full or nearly full. If the software queue remains stopped and the link is still reported up, then one of these things is happening:
1. The link went down but the driver didn't notice, or sent a transmit packet which never completes
2. TX completions are not being indicated or handled correctly
3. The hardware TX path has locked up
4. The link is stalled by excessive pause frames or collisions
5. Timeout is too low and/or low watermark is too high
(there may be other explanations)
The watchdog is primarily meant to deal with case 3, though all of cases 1-3 may be worked around by resetting the hardware.
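The firing condition and the ordering problem can be sketched in a few lines of user-space C. This is a simplified model under stated assumptions, not the kernel's watchdog code: the struct, its field names, and the model_* / link_down_* functions are hypothetical, and the timer is represented by a plain tick counter. It only encodes what is described above: the watchdog fires when the queue is stopped, the link is reported up, and more than the timeout has passed since the last packet was handed to the driver.

```c
#include <stdbool.h>

/* Hypothetical model of the state the TX watchdog looks at. */
struct model_dev {
	bool carrier_ok;		/* link reported up */
	bool queue_stopped;		/* software TX queue stopped */
	unsigned long trans_start;	/* tick when the last packet was passed to the driver */
	unsigned long watchdog_timeo;	/* timeout in ticks */
};

/* Returns true if the modelled watchdog would report a TX timeout at tick `now`. */
static bool model_watchdog_fires(const struct model_dev *dev, unsigned long now)
{
	if (!dev->carrier_ok)
		return false;	/* link down: watchdog stays quiet */
	if (!dev->queue_stopped)
		return false;	/* queue running: nothing to time out */
	/* Only the time of the last packet is known, not when the queue stopped. */
	return (now - dev->trans_start) > dev->watchdog_timeo;
}

/* Wrong order: stop the queue, then report carrier off. */
static bool link_down_stop_first(struct model_dev *dev, unsigned long now)
{
	bool fired;

	dev->queue_stopped = true;		/* netif_stop_queue() */
	fired = model_watchdog_fires(dev, now);	/* watchdog runs in the gap */
	dev->carrier_ok = false;		/* netif_carrier_off() */
	return fired;
}

/* Right order: report carrier off and leave the queue alone. */
static bool link_down_carrier_first(struct model_dev *dev, unsigned long now)
{
	dev->carrier_ok = false;		/* netif_carrier_off() */
	return model_watchdog_fires(dev, now);
}
```

With a last packet sent at tick 100, a timeout of 500 ticks, and the link dropping at tick 10000, link_down_stop_first() reports a timeout in this model while link_down_carrier_first() does not, because the idle period before the link drop is counted against the timeout as soon as the queue is marked stopped.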
Thanks,
Andy