Date:   Wed, 5 Jul 2017 09:01:11 +0000
From:   Ilan Tayari <ilant@...lanox.com>
To:     Ilan Tayari <ilant@...lanox.com>, Florian Westphal <fw@...len.de>
CC:     Yossi Kuperman <yossiku@...lanox.com>,
        Steffen Klassert <steffen.klassert@...unet.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>
Subject: RE: [RFC net-next 9/9] xfrm: add a small xdst pcpu cache

> -----Original Message-----
> From: netdev-owner@...r.kernel.org [mailto:netdev-owner@...r.kernel.org]
> Subject: RE: [RFC net-next 9/9] xfrm: add a small xdst pcpu cache
> 
> > -----Original Message-----
> > From: netdev-owner@...r.kernel.org [mailto:netdev-owner@...r.kernel.org]
> > Subject: [RFC net-next 9/9] xfrm: add a small xdst pcpu cache
> >
> > retain last used xfrm_dst in a pcpu cache.
> > On next request, reuse this dst if the policies are the same.
> >
> > The cache doesn't help at all with strictly-RR workloads, as
> > we never get a hit.
> >
> > The cache also adds the cost of a this_cpu_xchg() to the packet path.
> > Plain this_cpu_read/write would be cheaper; however, a netdev
> > notifier can run in parallel on another cpu and write the same
> > pcpu value, so the xchg is needed to avoid the race.
> >
> > The notifier is needed so that we do not introduce long hangs when a
> > device is dismantled while some pcpu xdst still holds a reference.
> >
> > Test results using 4 network namespaces and null encryption:
> >
> > ns1           ns2          -> ns3           -> ns4
> > netperf -> xfrm/null enc   -> xfrm/null dec -> netserver
> >
> > what              TCP_STREAM   UDP_STREAM   UDP_RR
> > Flow cache:       14804.4      279.738      326213.0
> > No flow cache:    14158.3      257.458      228486.8
> > Pcpu cache:       14766.4      286.958      239433.5
> >
> > UDP tests used 64-byte packets; each test ran for one minute, and
> > each value is the average over ten iterations.
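
As I read it, the xchg-based pcpu caching described above boils down to
roughly the following. This is only a sketch with illustrative names
(pcpu_xdst, xdst_policies_match), not the actual patch code:

/*
 * One cached bundle per cpu; a netdev notifier flushes these slots
 * when a device is dismantled.
 */
static DEFINE_PER_CPU(struct xfrm_dst *, pcpu_xdst);

static struct xfrm_dst *pcpu_xdst_get(const struct xfrm_policy *pol)
{
	/*
	 * xchg rather than a plain read: the notifier may be flushing
	 * this slot from another cpu, so take exclusive ownership of
	 * the cached entry before inspecting it.
	 */
	struct xfrm_dst *xdst = this_cpu_xchg(pcpu_xdst, NULL);

	if (xdst && xdst_policies_match(xdst, pol))
		return xdst;			/* hit: reuse the bundle */

	if (xdst)
		dst_release(&xdst->u.dst);	/* stale entry: drop it */
	return NULL;				/* miss: build a new bundle */
}

static void pcpu_xdst_put(struct xfrm_dst *xdst)
{
	/* Publish the new bundle and release whatever it displaced. */
	struct xfrm_dst *old = this_cpu_xchg(pcpu_xdst, xdst);

	if (old)
		dst_release(&old->u.dst);
}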
> 
> Hi Florian,
> 
> I want to give this a go with hw-offload and see the impact on
> performance.
> It may take us a few days to do that.

Hi Florian,

We tested with and without your patchset, using a single SA with
hw-crypto offload (RFC4106), IPv4 ESP tunnel mode, and a single netperf
TCP_STREAM with a few different message sizes.

We didn't separate the pcpu cache patch from the rest of the patchset.
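
For reference, the SA was of this general shape when expressed with
iproute2 (the addresses, SPI, key, and device below are placeholders
rather than our actual configuration; the matching policy and the
reverse-direction SA are set up analogously):

  # 20-byte keymat = 16-byte AES key + 4-byte salt; 128-bit ICV
  ip xfrm state add src 192.168.10.1 dst 192.168.10.2 \
          proto esp spi 0x1000 mode tunnel \
          aead 'rfc4106(gcm(aes))' \
          0x1111111111111111111111111111111111111111 128 \
          offload dev eth0 dir out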

Here are the findings:

What         64-byte    512-byte  1024-byte  1500-byte
Flow cache   1602.89    11004.97   14634.46   14577.60
Pcpu cache   1513.38    10862.55   14246.94   14231.07

The overall degradation seems a bit larger than what you measured with
null crypto.
We used two machines and no namespaces.
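
Each data point came from a netperf run of roughly this form (the peer
address and run length here are illustrative, not our exact invocation):

  netperf -H 192.168.10.2 -t TCP_STREAM -l 60 -- -m 512

with -m varied over the message sizes above.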

Ilan.
