Message-ID: <YwRmzozIY4iqKTs2@unreal>
Date:   Tue, 23 Aug 2022 08:34:06 +0300
From:   Leon Romanovsky <leon@...nel.org>
To:     Jakub Kicinski <kuba@...nel.org>
Cc:     Steffen Klassert <steffen.klassert@...unet.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        "David S . Miller" <davem@...emloft.net>,
        Herbert Xu <herbert@...dor.apana.org.au>,
        netdev@...r.kernel.org, Raed Salem <raeds@...dia.com>,
        ipsec-devel <devel@...ux-ipsec.org>
Subject: Re: [PATCH xfrm-next v2 0/6] Extend XFRM core to allow full offload
 configuration

On Mon, Aug 22, 2022 at 09:33:04AM -0700, Jakub Kicinski wrote:
> On Mon, 22 Aug 2022 11:54:42 +0300 Leon Romanovsky wrote:
> > On Mon, Aug 22, 2022 at 10:41:05AM +0200, Steffen Klassert wrote:
> > > On Fri, Aug 19, 2022 at 10:53:56AM -0700, Jakub Kicinski wrote:  
> > > > Yup, that's what I thought you'd say. Can't argue with that use case 
> > > > if Steffen is satisfied with the technical aspects.  
> > > 
> > > Yes, everything that can help to overcome the performance problems
> > > can help and I'm interested in this type of offload. But we need to
> > > make sure the API is usable by the whole community, so I don't
> > > want an API for some special case one of the NIC vendors is
> > > interested in.  
> > 
> > BTW, we have performance data. I planned to send it as part of the cover
> > letter for v3, but it is worth sharing now.
> > 
> >  ================================================================================
> >  Performance results:
> > 
> >  TCP multi-stream, using iperf3 instance per-CPU.
> >  +----------------------+--------+--------+--------+--------+---------+---------+
> >  |                      | 1 CPU  | 2 CPUs | 4 CPUs | 8 CPUs | 16 CPUs | 32 CPUs |
> >  |                      +--------+--------+--------+--------+---------+---------+
> >  |                      |                   BW (Gbps)                           |
> >  +----------------------+--------+--------+--------+--------+---------+---------+
> >  | Baseline             | 27.9   | 59     | 93.1   | 92.8   | 93.7    | 94.4    |
> >  +----------------------+--------+--------+--------+--------+---------+---------+
> >  | Software IPsec       | 6      | 11.9   | 23.3   | 45.9   | 83.8    | 91.8    |
> >  +----------------------+--------+--------+--------+--------+---------+---------+
> >  | IPsec crypto offload | 15     | 29.7   | 58.5   | 89.6   | 90.4    | 90.8    |
> >  +----------------------+--------+--------+--------+--------+---------+---------+
> >  | IPsec full offload   | 28     | 57     | 90.7   | 91     | 91.3    | 91.9    |
> >  +----------------------+--------+--------+--------+--------+---------+---------+
> > 
> >  IPsec full offload mode behaves like the baseline and reaches line rate with
> >  the same number of CPUs.
> > 
> >  Setup details (similar for both sides):
> >  * NIC: ConnectX6-DX dual port, 100 Gbps each.
> >    Single port used in the tests.
> >  * CPU: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz
> 
> My questions about performance were more about where does 
> the performance loss originate. Is it because of loss of GRO?
> Maybe sharing perf traces could answer some of those questions?

Crypto mode doesn't scale well in terms of CPUs.

CPU load data:
 * Reminder: this is a 160-CPU machine with 2 threads per core

Baseline:
PROCESSES  TOTAL_BW  HOST_LOCAL_CPU  HOST_REMOTE_CPU
1          27.95     0.6             1.1
2          58.99     1               2
4          93.05     1.3             3.2
8          92.75     2               3.4
16         93.74     2.2             4
32         94.37     2.6             4.5

IPsec crypto:
PROCESSES  TOTAL_BW  HOST_LOCAL_CPU  HOST_REMOTE_CPU
1          15.04     0.7             1.2
2          29.68     1.2             2.1
4          58.52     2               3.9
8          89.58     2.8             5.1
16         90.42     3.1             7.1
32         90.81     3.16            6.9
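
One rough way to read the two tables side by side is to divide each IPsec crypto row by the matching baseline row. The Python sketch below does only that; it is not the tooling used to collect the numbers, and the names BASELINE, CRYPTO and compare are made up for illustration. The values are copied from the tables above, and the CPU columns are used as-is because their unit is not stated.

```python
# Rows copied from the tables above:
# (processes, total_bw, host_local_cpu, host_remote_cpu)
BASELINE = [
    (1, 27.95, 0.6, 1.1),
    (2, 58.99, 1.0, 2.0),
    (4, 93.05, 1.3, 3.2),
    (8, 92.75, 2.0, 3.4),
    (16, 93.74, 2.2, 4.0),
    (32, 94.37, 2.6, 4.5),
]

CRYPTO = [
    (1, 15.04, 0.7, 1.2),
    (2, 29.68, 1.2, 2.1),
    (4, 58.52, 2.0, 3.9),
    (8, 89.58, 2.8, 5.1),
    (16, 90.42, 3.1, 7.1),
    (32, 90.81, 3.16, 6.9),
]


def compare(baseline, crypto):
    """Return (processes, bw_ratio, local_cpu_ratio, remote_cpu_ratio) per row.

    Each ratio is the crypto value divided by the baseline value for the
    same process count, so bw_ratio < 1 means less bandwidth than baseline
    and a CPU ratio > 1 means more CPU load than baseline.
    """
    result = []
    for base_row, crypto_row in zip(baseline, crypto):
        procs = base_row[0]
        # Guard against pairing rows that belong to different process counts.
        if procs != crypto_row[0]:
            raise ValueError("rows are not aligned by process count")
        result.append((
            procs,
            crypto_row[1] / base_row[1],
            crypto_row[2] / base_row[2],
            crypto_row[3] / base_row[3],
        ))
    return result


if __name__ == "__main__":
    print("PROCESSES  BW_RATIO  LOCAL_CPU_RATIO  REMOTE_CPU_RATIO")
    for procs, bw, local, remote in compare(BASELINE, CRYPTO):
        print(f"{procs:<10} {bw:<9.2f} {local:<16.2f} {remote:.2f}")
```

A bandwidth ratio well below 1 at low process counts, combined with CPU ratios above 1, is the pattern that "doesn't scale well in terms of CPUs" refers to; the sketch only makes that comparison explicit and says nothing about where the cost originates.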
