Message-Id: <200707191144.24434.olaf.kirch@oracle.com>
Date:	Thu, 19 Jul 2007 11:44:22 +0200
From:	Olaf Kirch <olaf.kirch@...cle.com>
To:	Ingo Molnar <mingo@...e.hu>
Cc:	Jarek Poplawski <jarkao2@...pl>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	linux-kernel@...r.kernel.org, davem@...emloft.net
Subject: Re: [patch] revert: [NET]: Fix races in net_rx_action vs netpoll

On Thursday 19 July 2007 11:09, Ingo Molnar wrote:
> the e1000 in this laptop is historically pretty robust. The only problem 
> i ever had with it were some rx/tx hw-engine latency problems [pings 
> from the outside took up to 1 second to propagate] that were quickly 
> fixed by the e1000 driver guys. Maybe that's related. (although it never 
> caused total inavailability of networking - it was only latency 
> problems)

I've been poring over this code for 3 days now, and I'm facing a blank
wall, mind-wise :-)

 -	It is pretty clear that net_rx_action is invoked only every
	once in a while. netdev watchdog timeouts are a pretty
	unmistakable sign of that.

 -	You say that netconsole output continues to trickle after
	the network gets wedged. This could be caused by the
	e1000 watchdog, which triggers a NIC interrupt "to ensure
	rx ring is cleaned". I assume that this triggers the
	regular e1000_intr, which succeeds in putting the NIC on
	the poll_list, and net_rx_action calls dev->poll once.

	If this assumption is true, this means that
	 -	once an interrupt gets through, NAPI is working
		as designed
	 -	no other interrupts are arriving (Rx, Tx-completion)
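To make the assumed sequence concrete, here is a toy model in Python. It is only a sketch of the reasoning above, not kernel code: ToyNic, interrupt() and this net_rx_action() are made-up stand-ins, and the real budget and softirq handling are left out. It shows that with packets sitting in the rx ring nothing gets polled until an interrupt puts the device on the poll list.

```python
from collections import deque


class ToyNic:
    """Hypothetical stand-in for a NAPI device; not the e1000 driver."""

    def __init__(self, name):
        self.name = name
        self.rx_ring = 0          # packets waiting in the pretend rx ring
        self.irq_enabled = True   # device may raise interrupts
        self.scheduled = False    # device is on the poll list
        self.polls = 0            # how often poll() was called

    def poll(self, budget):
        # Clean up to `budget` packets from the ring.
        self.polls += 1
        done = min(self.rx_ring, budget)
        self.rx_ring -= done
        return done


def interrupt(nic, poll_list):
    # Interrupt handler: mask further interrupts and schedule the device.
    if nic.irq_enabled and not nic.scheduled:
        nic.irq_enabled = False
        nic.scheduled = True
        poll_list.append(nic)


def net_rx_action(poll_list, budget=64):
    # Softirq side: poll every scheduled device until its ring is empty,
    # then take it off the list and re-enable its interrupt.
    while poll_list:
        nic = poll_list.popleft()
        nic.poll(budget)
        if nic.rx_ring == 0:
            nic.scheduled = False
            nic.irq_enabled = True
        else:
            poll_list.append(nic)
```

In this model, a ring holding 10 packets stays untouched for as long as interrupt() is never called. A single call, standing in for the interrupt triggered by the watchdog, leads to exactly one poll that drains the ring.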

So, can you verify whether any interrupts are arriving on the
NIC after the network gets wedged? You could also try

	ethtool -s eth0 msglevel 65535

It would be interesting to see what dmesg contains. If there's little
to no debug output from the driver, let it run for 10 seconds or so,
in order to catch the e1000 watchdog timer a few times.
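
One way to check for arriving interrupts is to compare two snapshots of /proc/interrupts taken a few seconds apart. The Python sketch below assumes the usual layout (an "N:" label, one counter per CPU, then the controller and device names) and that the NIC shows up as eth0. The helper names irq_counts, irq_delta and sample are made up for this example.

```python
import time


def irq_counts(text, device):
    """Return {irq: total count over all CPUs} for lines naming `device`."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip the CPU header and anything without an "N:" style label.
        if len(fields) < 2 or not fields[0].endswith(":"):
            continue
        # Shared interrupt lines list several names separated by ", ".
        names = [f.rstrip(",") for f in fields[1:]]
        if device not in names:
            continue
        total = 0
        for f in fields[1:]:
            if not f.isdigit():
                break             # per-CPU counters end here
            total += int(f)
        counts[fields[0].rstrip(":")] = total
    return counts


def irq_delta(before, after):
    """Per-IRQ difference between two irq_counts() results."""
    return {irq: after[irq] - before.get(irq, 0) for irq in after}


def sample(device="eth0", seconds=10, path="/proc/interrupts"):
    """Read the file twice, `seconds` apart, and return the deltas."""
    with open(path) as f:
        first = irq_counts(f.read(), device)
    time.sleep(seconds)
    with open(path) as f:
        second = irq_counts(f.read(), device)
    return irq_delta(first, second)
```

Calling sample("eth0", 10) after the network has wedged returns the number of interrupts seen on each of the NIC's IRQ lines during those 10 seconds. A delta of zero, or one that only moves at roughly the watchdog interval, would fit the theory above.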

Olaf
-- 
Olaf Kirch  |  --- o --- Nous sommes du soleil we love when we play
okir@....de |    / | \   sol.dhoop.naytheet.ah kin.ir.samse.qurax