Date:   Wed, 31 Aug 2016 19:09:13 -0700
From:   Ed Swierk <eswierk@...portsystems.com>
To:     Aaro Koskinen <aaro.koskinen@....fi>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        David Daney <ddaney@...iumnetworks.com>,
        devel@...verdev.osuosl.org
Cc:     linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 00/11] staging: octeon: multi rx group (queue) support

On 8/31/16 13:57, Aaro Koskinen wrote:
> This series implements multiple RX group support that should improve
> the networking performance on multi-core OCTEONs. Basically we register
> IRQ and NAPI for each group, and ask the HW to select the group for
> the incoming packets based on hash.
> 
> Tested on EdgeRouter Lite with a simple forwarding test using two flows
> and 16 RX groups distributed between two cores - the routing throughput
> is roughly doubled.
> 
> Also tested with EBH5600 (8 cores) and EBB6800 (16 cores) by sending
> and receiving traffic in both directions using SGMII interfaces.

With this series on 4.4.19, rx works with receive_group_order > 0.
Setting receive_group_order=4, I do see 16 Ethernet interrupts. I tried
fiddling with various smp_affinity values (e.g. setting them all to
ffffffff, or assigning a different one to each interrupt, or giving a
few to some and a few to others), as well as different values for
rps_cpus. Aggregate throughput with 10 parallel iperf threads varies between
0.5 and 1.5 Gbit/sec depending on the particular settings.
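
For reference, the "different one to each interrupt" case can be generated
with something like the sketch below. It is only an illustration: the helper
names are made up, the group count is assumed to be 2^receive_group_order
(which matches the 16 interrupts seen with receive_group_order=4), and the
round-robin spread is just one of the layouts tried, not a recommendation.

```python
# Hypothetical helpers for building per-IRQ smp_affinity masks.
# Assumption: the number of RX groups (and thus IRQs) is 2**receive_group_order.

def groups_for_order(receive_group_order):
    """Number of RX groups assumed for a given receive_group_order."""
    if receive_group_order < 0:
        raise ValueError("receive_group_order must be non-negative")
    return 1 << receive_group_order


def affinity_masks(num_irqs, num_cpus):
    """Return one hex CPU bitmask per IRQ, assigning CPUs round-robin.

    Each mask has exactly one bit set, so every IRQ is pinned to one CPU.
    """
    if num_irqs < 0 or num_cpus <= 0:
        raise ValueError("need num_irqs >= 0 and num_cpus > 0")
    return [format(1 << (i % num_cpus), "x") for i in range(num_irqs)]


if __name__ == "__main__":
    # 16 groups spread over 2 cores, as on a dual-core board.
    for index, mask in enumerate(affinity_masks(groups_for_order(4), 2)):
        print(f"rx group {index}: smp_affinity mask {mask}")
```

Each mask would then be written to /proc/irq/<n>/smp_affinity for the
corresponding interrupt; the actual IRQ numbers have to be looked up in
/proc/interrupts and are not assumed here.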

With the SDK kernel I get over 8 Gbit/sec. It seems to be achieving that
using just one interrupt (not even a separate one for tx, as far as I can
tell) pegged to CPU 0 (the default smp_affinity). I must be missing some
other major configuration tweak, perhaps specific to 10G.

Can you run a test on the EBB6800 with the interfaces in 10G mode?

--Ed
