Message-ID: <20131121004430.GX8581@1wt.eu>
Date:	Thu, 21 Nov 2013 01:44:30 +0100
From:	Willy Tarreau <w@....eu>
To:	Arnaud Ebalard <arno@...isbad.org>
Cc:	Thomas Petazzoni <thomas.petazzoni@...e-electrons.com>,
	Florian Fainelli <f.fainelli@...il.com>,
	simon.guinot@...uanux.org, Eric Dumazet <eric.dumazet@...il.com>,
	netdev@...r.kernel.org, edumazet@...gle.com,
	Cong Wang <xiyou.wangcong@...il.com>,
	linux-arm-kernel@...ts.infradead.org
Subject: Re: [BUG,REGRESSION?] 3.11.6+,3.12: GbE iface rate drops to few KB/s

Hi Arnaud,

On Wed, Nov 20, 2013 at 10:54:35PM +0100, Willy Tarreau wrote:
> I'm currently trying to implement TX IRQ handling. I found the registers
> description in the neta driver that is provided in Marvell's LSP kernel
> that is shipped with some devices using their CPUs. This code is utterly
> broken (e.g. splice fails with -EBADF) but I think the register descriptions
> could be trusted.
> 
> I'd rather have real IRQ handling than just relying on mvneta_poll(), so
> that we can use it for asymmetric traffic/routing/whatever.

OK it paid off. And very well :-)

I wrote it in one go and it worked immediately. I generally don't like that,
because I always fear that some bug is left hidden in the code. I have
only tested it on the Mirabox, so I'll have to try it on the OpenBlocks AX3-4
and on the XP-GP board for some SMP stress tests.

I upgraded my Mirabox to the latest Linus git tree (commit 5527d151) and
compared the results with and without the patch.

  without the patch:
      - at least 12 streams are needed to reach gigabit
      - 60% idle CPU remains at 1 Gbps
      - HTTP connection rate on empty objects is 9950 connections/s
      - cumulated outgoing traffic on two ports reaches 1.3 Gbps

  with the patch:
      - a single stream easily saturates the gigabit link
      - 87% idle CPU at 1 Gbps (12 streams; 90% idle at 1 stream)
      - HTTP connection rate on empty objects is 10250 connections/s
      - I saturate the two gig ports at 99% CPU, so 2 Gbps sustained output
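
For a rough sense of scale, the figures above can be turned into relative numbers. The sketch below is only arithmetic on the values quoted in this message. The derived ratios are not stated by the author, and the helper names `busy` and `relative_gain` are made up for illustration.

```python
# Figures quoted in the message (Mirabox, Linus' tree at commit 5527d151).
idle_without, idle_with = 60.0, 87.0    # % idle CPU at 1 Gbps, 12 streams
conn_without, conn_with = 9950, 10250   # HTTP connections/s on empty objects
gbps_without, gbps_with = 1.3, 2.0      # cumulated output on two ports, Gbps


def busy(idle_pct):
    """CPU time actually consumed, as the complement of the idle percentage."""
    return 100.0 - idle_pct


def relative_gain(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# How many times less CPU is consumed at 1 Gbps with the patch.
cpu_ratio = busy(idle_without) / busy(idle_with)
conn_gain = relative_gain(conn_without, conn_with)
throughput_gain = relative_gain(gbps_without, gbps_with)

print(f"CPU busy at 1 Gbps: {busy(idle_without):.0f}% -> {busy(idle_with):.0f}% "
      f"(about {cpu_ratio:.1f}x less)")
print(f"Connection rate: +{conn_gain:.1f}%")
print(f"Two-port output: +{throughput_gain:.1f}%")
```

Read this way, the largest effect is on CPU consumed at line rate and on two-port throughput, while the connection rate on empty objects changes by only a few percent.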

BTW I must say I was impressed to see that big an improvement in CPU
usage between 3.10 and 3.13. I suspect some of the Tx queue improvements
that Eric has done in between account for this.

I split the patch into 3 parts:
   - one which reintroduces the hidden bits of the driver
   - one which replaces the timer with the IRQ
   - one which changes the default Tx coalesce from 16 to 4 packets
     (larger was preferred with the timer, but smaller is better now).
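
To illustrate the trade-off behind the Tx coalesce value, here is a toy model, not the mvneta code: it assumes a device that raises one Tx-done interrupt every `threshold` transmitted packets, and that buffers are reclaimed only when that interrupt fires. The function name and its return values are hypothetical.

```python
def simulate_tx_coalescing(num_packets, threshold):
    """Toy model of packet-count Tx coalescing.

    Returns (interrupts, max_pending, leftover) where:
      interrupts  - number of Tx-done interrupts raised
      max_pending - largest number of sent-but-unreclaimed buffers at any time
      leftover    - buffers still unreclaimed once traffic stops
    """
    interrupts = 0
    pending = 0
    max_pending = 0
    for _ in range(num_packets):
        pending += 1                      # packet handed to the hardware
        max_pending = max(max_pending, pending)
        if pending >= threshold:          # threshold reached: interrupt fires
            interrupts += 1
            pending = 0                   # handler reclaims all completed buffers
    return interrupts, max_pending, pending


for threshold in (16, 4):
    irqs, max_pending, leftover = simulate_tx_coalescing(64, threshold)
    print(f"threshold={threshold}: {irqs} interrupts, "
          f"up to {max_pending} buffers pending, {leftover} left over")
```

In this model, 64 packets cost 4 interrupts with up to 16 buffers pending at a threshold of 16, versus 16 interrupts with at most 4 buffers pending at a threshold of 4. A smaller value trades more interrupts for earlier buffer release. A non-zero `leftover` stands for packets that stay unreclaimed until more traffic arrives or some other mechanism reclaims them.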

I'm attaching them, please test them on your device.

Note that this is *not* for inclusion at the moment as it has not been
tested on the SMP CPUs.

Cheers,
Willy


View attachment "0001-net-mvneta-add-missing-bit-descriptions-for-interrup.patch" of type "text/plain" (3902 bytes)

View attachment "0002-net-mvneta-replace-Tx-timer-with-a-real-interrupt.patch" of type "text/plain" (6447 bytes)

View attachment "0003-net-mvneta-reduce-Tx-coalesce-from-16-to-4-packets.patch" of type "text/plain" (1013 bytes)
