netdev - Re: [bug?] r8169: hangs under heavy load


Message-ID: <1322262357.2550.12.camel@edumazet-laptop>
Date:	Sat, 26 Nov 2011 00:05:57 +0100
From:	Eric Dumazet <eric.dumazet@...il.com>
To:	Francois Romieu <romieu@...zoreil.com>
Cc:	Jonathan Nieder <jrnieder@...il.com>, netdev@...r.kernel.org,
	nic_swsd@...ltek.com, linux-kernel@...r.kernel.org,
	Armin Kazmi <armin.kazmi@...dortmund.de>,
	Gerd <booster@...ke7.net>
Subject: Re: [bug?] r8169: hangs under heavy load

On Friday, 25 November 2011 at 23:22 +0100, Francois Romieu wrote:
> Eric Dumazet <eric.dumazet@...il.com> :
> [...]
> > rtl8169_rx_interrupt(..., budget) can return budget + 1 sometimes
> > because of :
> > 
> >                 /* Work around for AMD plateform. */
> >                 if ((desc->opts2 & cpu_to_le32(0xfffe000)) &&
> >                     (tp->mac_version == RTL_GIGA_MAC_VER_05)) {
> >                         desc->opts2 = 0;
> >                         cur_rx++;
> >                 }
> 
> It needs fixing but RTL_GIGA_MAC_VER_05 is an old PCI 8169sc while
> debian's bug #642911 is about a 8168c (aka RTL_GIGA_MAC_VER_{19 .. 22}).
> 
> This path is not used.
> 

OK, then we receive an RxFIFOOver indication while the NAPI handler is
running (quite possible if the machine is under network load).

This (hard) interrupt calls rtl8169_tx_timeout()
	-> rtl8169_hw_reset()
		-> rtl_hw_reset()
			-> rtl8169_init_ring_indexes()

tp->dirty_tx = tp->dirty_rx = tp->cur_tx = tp->cur_rx = 0;

When control returns to the softirq handler (rtl8169_rx_interrupt()),
it can then see tp->cur_rx as 0 instead of the value it had at the
start of the handler.

count = cur_rx - tp->cur_rx; // too big
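
To make the arithmetic concrete, here is a minimal userspace sketch of
the index bookkeeping, not the driver code itself. struct ring_state,
init_ring_indexes(), rx_poll() and the reset_midway flag are
hypothetical stand-ins, and the NUM_RX_DESC value of 256 is an
assumption. The reset is simulated by a plain function call placed
between the loop and the count computation, which is where a hardirq
could land.

```c
#include <stdint.h>

#define NUM_RX_DESC 256u	/* assumed ring size, for illustration only */

/* Hypothetical stand-in for the ring indexes kept in the driver's private data. */
struct ring_state {
	uint32_t cur_rx;
	uint32_t dirty_rx;
	uint32_t cur_tx;
	uint32_t dirty_tx;
};

/* Models rtl8169_init_ring_indexes(): every index goes back to 0. */
static void init_ring_indexes(struct ring_state *tp)
{
	tp->dirty_tx = tp->dirty_rx = tp->cur_tx = tp->cur_rx = 0;
}

/*
 * Models only the index handling of the rx handler. When reset_midway is
 * nonzero, the indexes are reset after the local snapshot was taken,
 * which stands in for the hardirq path running during the softirq.
 */
static uint32_t rx_poll(struct ring_state *tp, uint32_t budget, int reset_midway)
{
	uint32_t cur_rx = tp->cur_rx;	/* snapshot at start of handler */
	uint32_t rx_left = NUM_RX_DESC + tp->dirty_rx - cur_rx;
	uint32_t count;

	if (rx_left > budget)
		rx_left = budget;

	for (; rx_left > 0; rx_left--, cur_rx++) {
		/* per-descriptor packet processing elided */
	}

	if (reset_midway)
		init_ring_indexes(tp);	/* simulated reset from hardirq */

	count = cur_rx - tp->cur_rx;	/* uses the reset value if it happened */
	tp->cur_rx = cur_rx;		/* stale snapshot overwrites the reset */
	tp->dirty_rx += count;
	return count;
}
```

Starting from cur_rx = dirty_rx = 1000 with a budget of 64, the
undisturbed call returns 64. With the simulated reset it returns 1064,
far above the budget, and the stale snapshot is written back over the
freshly reset indexes.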


Really, calling rtl8169_init_ring_indexes() from hardirq is killing us.


