Message-ID: <BYAPR11MB30957D871AF159CB7BB7F753D96C9@BYAPR11MB3095.namprd11.prod.outlook.com>
Date:   Mon, 15 Mar 2021 20:04:36 +0000
From:   "Chen, Mike Ximing" <mike.ximing.chen@...el.com>
To:     "Williams, Dan J" <dan.j.williams@...el.com>
CC:     Greg KH <gregkh@...uxfoundation.org>,
        Netdev <netdev@...r.kernel.org>,
        David Miller <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        "Arnd Bergmann" <arnd@...db.de>,
        Pierre-Louis Bossart <pierre-louis.bossart@...ux.intel.com>,
        "Brandeburg, Jesse" <jesse.brandeburg@...el.com>
Subject: RE: [PATCH v10 00/20] dlb: introduce DLB device driver

> From: Dan Williams <dan.j.williams@...el.com>
> On Fri, Mar 12, 2021 at 1:55 PM Chen, Mike Ximing <mike.ximing.chen@...el.com> wrote:
> >
> > In brief, Intel DLB is an accelerator that replaces shared memory
> > queuing systems. Large modern server-class CPUs, with local caches
> > for each core, tend to incur costly cache misses, cross-core snoops,
> > and contention. The impact becomes noticeable at high message rates
> > (messages/sec), such as those seen in high-throughput packet
> > processing and HPC applications. DLB is used in high-rate pipelines
> > that require a variety of packet distribution & synchronization
> > schemes. It can be leveraged to accelerate user space libraries, such
> > as DPDK eventdev. It could show similar benefits in frameworks such as
> > PADATA in the kernel - if the messaging rate is sufficiently high.
> 
> Where is PADATA limited by distribution and synchronization overhead?
> It's meant for parallelizable work that has minimal communication between the work units; ordering is
> about its only synchronization overhead, not messaging. It's used for ipsec crypto and page init.
> Even potential future bulk work usages that might benefit from PADATA, like md-raid, ksm, or kcopyd,
> do not have any messaging overhead.
> 
In our PADATA investigation, the improvements come primarily from offloading the ordering overhead.
Parallel scheduling is offloaded by a DLB ordered parallel queue.
Serialization (re-ordering) is offloaded by a DLB directed queue.
We see significant throughput increases in crypto tests using tcrypt. In our test configuration, preliminary results show that the DLB-accelerated case encrypts at 2.4x and decrypts at 2.6x the rate (packets/s) of the unaccelerated case.
