netdev - Re: [BUG,REGRESSION?] 3.11.6+,3.12: GbE iface rate drops to few KB/s

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20131119174323.GH913@1wt.eu>
Date:	Tue, 19 Nov 2013 18:43:23 +0100
From:	Willy Tarreau <w@....eu>
To:	Eric Dumazet <eric.dumazet@...il.com>
Cc:	Arnaud Ebalard <arno@...isbad.org>,
	Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>,
	netdev@...r.kernel.org, edumazet@...gle.com,
	Cong Wang <xiyou.wangcong@...il.com>,
	linux-arm-kernel@...ts.infradead.org,
	Florian Fainelli <f.fainelli@...il.com>,
	simon.guinot@...uanux.org
Subject: Re: [BUG,REGRESSION?] 3.11.6+,3.12: GbE iface rate drops to few KB/s

Hi Eric,

On Tue, Nov 19, 2013 at 05:53:14AM -0800, Eric Dumazet wrote:
> These strange results you have tend to show that if you have a big TCP
> send window, the web server pushes a lot of bytes per system call and
> might stall the ACK clocking or TX refills.

It's tx refills which are not done in this case from what I think I
understood in the driver. IIRC, the refill is done once at the beginning
of xmit and in the tx timer callback. So if you have too large a window
that fills the descriptors during a few tx calls during which no desc was
released, you could end up having to wait for the timer since you're not
allowed to send anymore.

> > Then, I started playing w/ tcp_limit_output_bytes (default is 131072),
> > w/ TCP send window set to 256KB:
> > 
> > tcp_limit_output_bytes set to 512KB: 59.3 MB/s
> > tcp_limit_output_bytes set to 256KB: 58.5 MB/s
> > tcp_limit_output_bytes set to 128KB: 56.2 MB/s
> > tcp_limit_output_bytes set to  64KB: 32.1 MB/s
> > tcp_limit_output_bytes set to  32KB: 4.76 MB/s
> > 
> > As a side note, during the test, I sometimes gets peak for some seconds
> > at the beginning at 90MB/s which tend to confirm what WIlly wrote,
> > i.e. that the hardware can do more.
> 
> I would also check the receiver. I suspect packets drops because of a
> bad driver doing skb->truesize overshooting.

When I first observed the issue, at first I suspected my laptop's driver
when I saw this problem, so I tried with a dockstar instead and the issue
disappeared... until I increased the tcp_rmem on it to match my laptop :-)

Arnaud, you might be interested in trying checking if the following change
does something for you in mvneta.c :

- #define MVNETA_TX_DONE_TIMER_PERIOD 10
+ #define MVNETA_TX_DONE_TIMER_PERIOD (1000/HZ)

This can only have any effect if you run at 250 or 1000 Hz, but not at 100
of course. It should reduce the time to first IRQ.

Willy

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html