Date:	Sat, 21 Feb 2015 11:15:12 +0100
From:	Tomas Szepe <szepe@...erecords.com>
To:	Florian Westphal <fw@...len.de>
Cc:	Francois Romieu <romieu@...zoreil.com>,
	Hayes Wang <hayeswang@...ltek.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Tom Herbert <therbert@...gle.com>,
	"David S. Miller" <davem@...emloft.net>,
	Marco Berizzi <pupilla@...mail.com>,
	linux-kernel@...r.kernel.org
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

> > > Since linux-3.18.0, r8169 is having problems driving one of my add-on
> > > PCIe NICs.  The interface is losing link for several seconds at a time,
> > > the frequency being about once a minute when the traffic is high.
> > > 
> > > The first loss of link is accompanied by the message "NETDEV WATCHDOG:
> > > eth1 (r8169): transmit queue 0 timed out" and a call trace, while
> > > subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up"
> > > (w/o the complementary "link down" message).
> > > 
> > > I've traced the culprit down to commit 1e918876, "r8169: add support
> > > for Byte Queue Limits" by Florian Westphal <fw@...len.de>.  Reverting
> > > the patch appears to fix the problem on linux-3.18.5.
> > > The same issue might already have been reported by Marco Berizzi here:
> > > http://lkml.org/lkml/2014/12/11/65
> > 
> > Thanks for reporting this!  I'm no lkml subscriber and thus did not
> > see earlier report.
> > 
> > I'll try to reproduce this but unfortunately I am currently travelling
> > and won't have access to my r8169 nic until Feb 10th.
> 
> I tried to reproduce this without success so far on my RTL8168d/8111d device.
> I've been running 40 parallel netperf TCP_STREAM tests (1gbit) for the
> last 5 hours and so far I saw no watchdog tx timeouts.
> 
> I'll keep this running for a day or so to see if it just takes more time
> to trigger.

So, how's this coming along?  Don't you think the patch should be reverted
until the problem is diagnosed/understood/fixed?

-- 
Tomas Szepe <szepe@...erecords.com>
