lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20150203104214.GG24751@breakpoint.cc>
Date:	Tue, 3 Feb 2015 11:42:14 +0100
From:	Florian Westphal <fw@...len.de>
To:	Tomas Szepe <szepe@...erecords.com>
Cc:	Florian Westphal <fw@...len.de>,
	Francois Romieu <romieu@...zoreil.com>,
	Hayes Wang <hayeswang@...ltek.com>,
	Eric Dumazet <edumazet@...gle.com>,
	Tom Herbert <therbert@...gle.com>,
	"David S. Miller" <davem@...emloft.net>,
	Marco Berizzi <pupilla@...mail.com>,
	linux-kernel@...r.kernel.org
Subject: Re: 1e918876 breaks r8169 (linux-3.18+)

Tomas Szepe <szepe@...erecords.com> wrote:
> Hi,
> 
> Since linux-3.18.0, r8169 is having problems driving one of my add-on
> PCIe NICs.  The interface is losing link for several seconds at a time,
> the frequency being about once a minute when the traffic is high.
> 
> The first loss of link is accompanied by the message "NETDEV WATCHDOG:
> eth1 (r8169): transmit queue 0 timed out" and a call trace, while
> subsequent occurrences only report "r8169 0000:01:00.0 eth1: link up"
> (w/o the complementary "link down" message).
> 
> I've traced the culprit down to commit 1e918876, "r8169: add support
> for Byte Queue Limits" by Florian Westphal <fw@...len.de>.  Reverting
> the patch appears to fix the problem on linux-3.18.5.
> The same issue might already have been reported by Marco Berizzi here:
> http://lkml.org/lkml/2014/12/11/65

Thanks for reporting this!  I'm no lkml subscriber and thus did not
see earlier report.

I'll try to reproduce this but unfortunately I am currently travelling
and won't have access to my r8169 nic until Feb 10th.

If you're willing to experiment (not even compile tested):

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -5067,8 +5067,6 @@ static void rtl_hw_reset(struct rtl8169_private *tp)
 	RTL_W8(ChipCmd, CmdReset);
 
 	rtl_udelay_loop_wait_low(tp, &rtl_chipcmd_cond, 100, 100);
-
-	netdev_reset_queue(tp->dev);
 }
 
 static void rtl_request_uncached_firmware(struct rtl8169_private *tp)
@@ -6773,6 +6771,7 @@ static void rtl8169_tx_clear(struct rtl8169_private *tp)
 {
 	rtl8169_tx_clear_range(tp, tp->dirty_tx, NUM_TX_DESC);
 	tp->cur_tx = tp->dirty_tx = 0;
+	netdev_reset_queue(tp->dev);
 }
 
 static void rtl_reset_work(struct rtl8169_private *tp)
@@ -7231,9 +7230,9 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
 		tx_left--;
 	}
 
-	if (tp->dirty_tx != dirty_tx) {
-		netdev_completed_queue(tp->dev, pkts_compl, bytes_compl);
+	netdev_completed_queue(tp->dev, pkts_compl, bytes_compl);
 
+	if (tp->dirty_tx != dirty_tx) {
 		u64_stats_update_begin(&tp->tx_stats.syncp);
 		tp->tx_stats.packets += pkts_compl;
 		tp->tx_stats.bytes += bytes_compl;
@@ -7263,6 +7262,8 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
 
 			RTL_W8(TxPoll, NPQ);
 		}
+	} else {
+		WARN_ON_ONCE(pkts_compl);
 	}
 }
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ