Date:	Mon, 29 Jul 2013 02:14:10 +0000
From:	Duan Fugang-B38611 <B38611@...escale.com>
To:	Ben Hutchings <bhutchings@...arflare.com>
CC:	Stephen Hemminger <stephen@...workplumber.org>,
	Uwe Kleine-König 
	<u.kleine-koenig@...gutronix.de>,
	"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
	"David S. Miller" <davem@...emloft.net>,
	Estevam Fabio-R49496 <r49496@...escale.com>,
	Li Frank-B20596 <B20596@...escale.com>,
	Shawn Guo <shawn.guo@...aro.org>,
	"kernel@...gutronix.de" <kernel@...gutronix.de>,
	Hector Palacios <hector.palacios@...i.com>,
	Tim Sander <tim.sander@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: RE: [PATCH] net/fec: call netif_carrier_off when not having link

On Fri, 2013-07-26 at 11:32 PM, Ben Hutchings wrote:
>netif_stop_queue() *must not* be called before netif_carrier_off(), otherwise the TX watchdog can fire immediately.
>The TX watchdog only knows when the last packet was passed to the driver, not when the queue was stopped.
>The last packet could have been added an arbitrarily long time before the link went down, therefore it may appear that the timeout has already expired.
>
>Although it is safe to call netif_stop_queue() after netif_carrier_off(), it is not useful.
>netif_stop_queue() should only be called from your ndo_start_xmit operation and only because the queue is full. 
>Any other reason to stop should be communicated to the kernel using netif_carrier_off() or netif_device_detach().
>
>Ben.

Agreed.
I remember you said:
The watchdog fires when the software queue has been stopped *and* the link has been reported as up for over dev->watchdog_timeo ticks.
The software queue should be stopped if the hardware queue is full or nearly full.  If the software queue remains stopped and the link is still reported up, then one of these things is happening:

1. The link went down but the driver didn't notice, or sent a transmit packet which never completes
2. TX completions are not being indicated or handled correctly
3. The hardware TX path has locked up
4. The link is stalled by excessive pause frames or collisions
5. Timeout is too low and/or low watermark is too high

(there may be other explanations)

The watchdog is primarily meant to deal with case 3, though all of cases 1-3 may be worked around by resetting the hardware.
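
To make the condition above concrete, here is a rough user-space sketch of it. This is not kernel code: struct model_dev and the model_* helpers are names made up for illustration, and the real check in the kernel has more conditions. It only models the three inputs discussed in this thread (carrier state, queue state, and the time the last packet was passed to the driver), and shows why stopping the queue while the carrier is still reported up can make the watchdog fire at once.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state the TX watchdog looks at.
 * Field names echo the kernel's, but this struct is illustrative only. */
struct model_dev {
	bool carrier_ok;              /* link reported up */
	bool queue_stopped;           /* software TX queue stopped */
	unsigned long trans_start;    /* tick when the last packet was passed to the driver */
	unsigned long watchdog_timeo; /* timeout in ticks */
};

/* Models netif_stop_queue(): only marks the queue as stopped.
 * Note that it does not touch trans_start. */
void model_stop_queue(struct model_dev *dev)
{
	dev->queue_stopped = true;
}

/* Models netif_carrier_off(): the link is reported down. */
void model_carrier_off(struct model_dev *dev)
{
	dev->carrier_ok = false;
}

/* The watchdog fires only if the link is reported up, the queue is
 * stopped, and more than watchdog_timeo ticks have passed since the
 * last packet was handed to the driver. */
bool model_watchdog_fires(const struct model_dev *dev, unsigned long now)
{
	if (!dev->carrier_ok)
		return false;
	if (!dev->queue_stopped)
		return false;
	return now - dev->trans_start > dev->watchdog_timeo;
}

/* Walks through the ordering problem described in this thread. */
void model_self_check(void)
{
	/* Last packet was queued at tick 100; it is now tick 1000. */
	struct model_dev dev = {
		.carrier_ok = true,
		.queue_stopped = false,
		.trans_start = 100,
		.watchdog_timeo = 50,
	};

	/* Queue running: no timeout, however old trans_start is. */
	assert(!model_watchdog_fires(&dev, 1000));

	/* Link drops and the driver stops the queue first: the old
	 * trans_start makes the timeout look already expired. */
	model_stop_queue(&dev);
	assert(model_watchdog_fires(&dev, 1000));

	/* Once the carrier is reported off, the watchdog stays quiet. */
	model_carrier_off(&dev);
	assert(!model_watchdog_fires(&dev, 1000));
}
```

In this model, calling model_carrier_off() alone on link loss is enough to keep the watchdog quiet, which matches the point that stopping the queue afterwards is safe but not useful.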


Thanks,
Andy
