Date:	Wed, 11 Feb 2015 10:46:47 +0100
From:	Tomáš Szépe <szepe@...erecords.com>
To:	Florian Westphal <fw@...len.de>
Cc:	Francois Romieu <romieu@...zoreil.com>,
	Hayes Wang <hayeswang@...ltek.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Tom Herbert <therbert@...gle.com>,
	"David S. Miller" <davem@...emloft.net>,
	Marco Berizzi <pupilla@...mail.com>,
	linux-kernel@...r.kernel.org
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

Hi,

> > > Since linux-3.18.0, r8169 is having problems driving one of my add-on
> > > PCIe NICs.  The interface is losing link for several seconds at a time,
> > > the frequency being about once a minute when the traffic is high.
> > > 
> > > The first loss of link is accompanied by the message "NETDEV WATCHDOG:
> > > eth1 (r8169): transmit queue 0 timed out" and a call trace, while
> > > subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up"
> > > (w/o the complementary "link down" message).
> > > 
> > > I've traced the culprit down to commit 1e918876, "r8169: add support
> > > for Byte Queue Limits" by Florian Westphal <fw@...len.de>.  Reverting
> > > the patch appears to fix the problem on linux-3.18.5.
> > > The same issue might already have been reported by Marco Berizzi here:
> > > http://lkml.org/lkml/2014/12/11/65
> > 
> > Thanks for reporting this!  I'm no lkml subscriber and thus did not
> > see earlier report.
> > 
> > I'll try to reproduce this but unfortunately I am currently travelling
> > and won't have access to my r8169 nic until Feb 10th.
> 
> I tried to reproduce this without success so far on my RTL8168d/8111d device.

I was afraid it might come to this.  Of all the ~15 r8169 interfaces (mostly
onboard) I've got running in about 10 machines, only a single one is affected.

> I've been running 40 parallel netperf TCP_STREAM tests (1gbit) for the
> last 5 hours and so far I saw no watchdog tx timeouts.
> 
> I'll keep this running for a day or so to see if it just takes more time
> to trigger.

Ok, but if the bug were to manifest itself in the same manner as it does
over here, it wouldn't require so much pressure.

> Do you use any "special" settings (e.g. offload features off, mtu > 1500, etc)?

Nothing special.

modprobe r8169
nameif eth1 xx:xx:xx:xx:xx:xx
ip link set eth1 up
ip addr add 192.168.x.y/24 brd + dev eth1

And that's it, really.
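
For anyone comparing a working and a failing interface, here is a minimal sketch of how the per-queue BQL state could be dumped from sysfs. It assumes a kernel built with BQL, which exposes a byte_queue_limits directory under each tx queue. The function name dump_bql and its optional second argument (an alternate sysfs root) are made up for this sketch and are not part of any existing tool.

```shell
#!/bin/sh
# dump_bql is a hypothetical helper name used only in this sketch.
# Usage: dump_bql IFACE [SYSFS_ROOT]
# Prints "<queue> <attribute>=<value>" for each readable BQL attribute.
dump_bql() {
    iface=$1
    root=${2:-/sys}    # assumption: sysfs is mounted at /sys unless overridden
    for q in "$root"/class/net/"$iface"/queues/tx-*; do
        # Skip queues without BQL state (or an unmatched glob).
        [ -d "$q/byte_queue_limits" ] || continue
        for f in limit limit_min limit_max inflight hold_time; do
            [ -r "$q/byte_queue_limits/$f" ] || continue
            printf '%s %s=%s\n' "${q##*/}" "$f" "$(cat "$q/byte_queue_limits/$f")"
        done
    done
}
```

Running "dump_bql eth1" on both an affected and an unaffected machine would give something to diff. The offload and MTU settings asked about above can be listed with "ethtool -k eth1" and "ip link show eth1", provided ethtool is installed.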

Seeya,
-- 
Tomáš Szépe <szepe@...erecords.com>