Message-ID: <20081201071614.GP476@secunet.com>
Date:	Mon, 1 Dec 2008 08:16:14 +0100
From:	Steffen Klassert <steffen.klassert@...unet.com>
To:	netdev@...r.kernel.org
Cc:	davem@...emloft.net, herbert@...dor.apana.org.au,
	klassert@...hematik.tu-chemnitz.de
Subject: [RFC PATCH 0/5] IPsec parallelization

This is a first attempt to parallelize the expensive part of xfrm by
using a generic parallelization/serialization method. The method uses the
remote softirq invocation infrastructure for both parallelization and
serialization. With it, data objects can be processed in parallel, starting
at some given point. After the expensive operations have been done in
parallel, the objects can be serialized again, and they leave the
serialization step in the same order in which they entered the
parallelization. In the case of xfrm, this makes it possible to run the
expensive part in parallel without causing packet reordering.
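
The ordering guarantee described above can be illustrated in user space.
What follows is only a minimal Python sketch, not the kernel code from this
series: the names parallel_in_order() and expensive(), and the use of a
thread pool, are hypothetical stand-ins chosen for illustration. Each object
is tagged with a sequence number before it is handed to a worker, and a
small reorder buffer releases results strictly in sequence order.

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def expensive(obj):
    # Stand-in for the costly per-packet work (e.g. crypto); the random
    # sleep makes workers finish in an unpredictable order.
    time.sleep(random.uniform(0, 0.01))
    return obj * 2


def parallel_in_order(objects, work, workers=4):
    """Run work() on objects in parallel, yield results in input order."""
    pending = {}   # seq -> result, finished but not yet released
    next_seq = 0   # sequence number of the next result to release
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Parallelization point: tag every object with its sequence number.
        futures = {pool.submit(work, obj): seq
                   for seq, obj in enumerate(objects)}
        # Serialization point: results arrive in completion order ...
        for fut in as_completed(futures):
            pending[futures[fut]] = fut.result()
            # ... but are released only once all predecessors are out.
            while next_seq in pending:
                yield pending.pop(next_seq)
                next_seq += 1


if __name__ == "__main__":
    print(list(parallel_in_order(range(8), expensive)))
```

The cost of this scheme is that one slow object holds back every object
behind it until it completes.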
 
To use this parallelization method for xfrm, some changes to the crypto system
were necessary. First of all, async crypto transforms have to be disabled in
the parallelization case, because we can't guarantee the packet order if
packets are put on a queue during the parallel processing.
Second, there was a very highly contended lock in crypto_authenc_hash() when
the crypto system runs in parallel. To get rid of it, the struct aead is
moved to percpu data, which in turn means that we have percpu IV chains now.
However, I'm not that familiar with the crypto system, so I'm not sure whether
this is acceptable the way I did it. This needs review.
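
The trade-off of the percpu change can be shown with a rough user-space
analogy. The Python sketch below is an assumption-laden illustration, not
the crypto code: Transform, worker_transform() and the integer "IV" counter
are hypothetical, and thread-local storage stands in for percpu data. Each
worker gets its own private instance, so no lock is needed, but every worker
then advances its own independent IV chain.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Transform:
    """Hypothetical stand-in for a transform with chained IV state."""

    def __init__(self):
        self.iv = 0

    def next_iv(self):
        # Unsynchronized read-modify-write: only safe because this
        # instance is never shared between workers.
        self.iv += 1
        return self.iv


_per_worker = threading.local()


def worker_transform():
    # Lazily create one private Transform per worker thread, analogous to
    # keeping one instance per CPU instead of one shared, locked instance.
    if not hasattr(_per_worker, "tfm"):
        _per_worker.tfm = Transform()
    return _per_worker.tfm


def process(packet):
    tfm = worker_transform()
    return (threading.get_ident(), tfm.next_iv(), packet)


def run(packets, workers=4):
    # Returns one (worker id, IV, packet) tuple per packet, in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, packets))
```

Grouping the result of run() by worker id shows one separate IV sequence per
worker, which is the user-space counterpart of having percpu IV chains.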

I did forwarding tests with two quad core machines (Intel Core 2 Quad Q6600)
used as IPsec routers (xfrm tunnel between the two quad core machines) and two
T61 notebooks used as traffic generators.
With this testing environment I'm getting a throughput of up to 910 Mbit/s
(ipv4) and 880 Mbit/s (ipv6) with aes192-sha1 encryption (measured with iperf,
_one_ tcp stream). Without the parallelization, the same environment gives
about 340 Mbit/s (ipv4) and 320 Mbit/s (ipv6).
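
For reference, the figures above correspond to a speedup of roughly 2.7x
for both address families. A small Python sketch of the arithmetic follows;
the dictionary layout and the speedup() helper are illustrative only, and
the numbers are the ones quoted above.

```python
# Throughput in Mbit/s: (without parallelization, with parallelization)
THROUGHPUT = {
    "ipv4": (340, 910),
    "ipv6": (320, 880),
}


def speedup(baseline, parallel):
    # Ratio of parallelized to non-parallelized throughput.
    return parallel / baseline


if __name__ == "__main__":
    for family, (baseline, parallel) in THROUGHPUT.items():
        print(f"{family}: {speedup(baseline, parallel):.2f}x")
```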

If somebody wants to test it, the parallelization is switched off by default.
To enable it, do 'echo 1 > /proc/sys/net/core/xfrm_padata'.

Steffen
