lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 5 Jan 2018 05:04:30 -0600
From:   Ioana Radulescu <ruxandra.radulescu@....com>
To:     <gregkh@...uxfoundation.org>
CC:     <devel@...verdev.osuosl.org>, <linux-kernel@...r.kernel.org>,
        <roy.pledge@....com>, <laurentiu.tudor@....com>,
        <stuyoder@...il.com>, <bogdan.purcareata@....com>
Subject: [PATCH 0/2] staging: fsl-mc/dpio, fsl-dpaa2/eth: Use affine DPIO services

The first patch introduces a new DPIO API function allowing
the selection of a cpu-affine service. In the second patch,
the Ethernet driver uses this new API to map cpu-affine DPIO
services to channels.

This adds a significant performance boost on the frame dequeue
side (see numbers below), but, more importantly, helps us avoid
a scenario leading to the kernel running out of memory:

In an unidirectional IP forwarding test running on 8cores, with
ingress traffic @10Gbps, the rate with which Tx frames are
enqueued by the cpu on Tx queues is higher than the rate with
which cores manage to dequeue Tx confirmation frames. So these
confirmation frames increasingly accumulate on hardware queues,
eventually gobbling up all system memory.

Enforcing DPIO service affinity takes advantage of the way
our hardware was designed to be used, eliminating the bottleneck
that caused the imbalance between Rx and Tx processing paths.

Performance numbers measured on a LS2088A board with 8cores @2GHz:

Bidirectional IP forwarding, 512 flows with 64B frames:
1.68Mpps (before) vs 3.77Mpps (after)
Unidirectional IP forwarding, 512 flows with 64B frames:
1.33Mpps (and OOM after ~5-8min) vs 3.72Mpps
TCP netperf (client), 256 flows:
17.27Gbps with 100% cpu load vs 18.78Gbps with 77.6% cpu load

Ioana Radulescu (2):
  staging: fsl-mc/dpio: Add dpaa2_io_service_select() API
  staging: fsl-dpaa2/eth: Use affine DPIO services

 drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c | 24 ++++++++++++++----------
 drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.h |  1 +
 drivers/staging/fsl-mc/bus/dpio/dpio-service.c | 17 +++++++++++++++++
 drivers/staging/fsl-mc/include/dpaa2-io.h      |  2 ++
 4 files changed, 34 insertions(+), 10 deletions(-)

-- 
2.7.4

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ