Message-ID: <20130812160142.71737a95@vostro>
Date: Mon, 12 Aug 2013 16:01:42 +0300
From: Timo Teras <timo.teras@....fi>
To: netdev@...r.kernel.org
Subject: ipsec smp scalability and cpu use fairness (softirqs)
Hi,
I've recently been doing some ipsec benchmarking and analysing systems
that run out of cpu power. The setup is a dmvpn gateway
(gre+xfrm+opennhrp) with traffic in the forward path.
The systems I have been using are VIA Nano (Padlock aes/sha accel) and
Intel Xeon (aes-ni and ssse3 sha1) based. In both setups the crypto
happens synchronously, using either special opcodes or an assembly
implementation of the algorithm.
It seems that the combination of softirq, napi and synchronous crypto
causes two problems.
1. Single core systems that run out of cpu power are overwhelmed in an
uncontrollable manner. As softirq is doing the heavy lifting, the
userland processes are starved first. This can cause the userland IKE
daemon to starve and lose tunnels when it is unable to answer liveness
checks. The quick workaround is to set up traffic shaping for the
encrypted traffic.
2. On multicore (6-12 cores) systems, it would appear that it is not
easy to distribute the ipsec work to multiple cores, as softirq is
sticky to the cpu where it was raised. The ipsec decryption/encryption
is done synchronously in the napi poll loop, and the throughput is
limited by one cpu. If the NIC supports multiple queues and balancing
on the ESP SPI, we can use that to get some parallelism.
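
To illustrate the SPI balancing mentioned in point 2, here is a rough
userland sketch of how a receive queue could be picked from the ESP SPI.
The SPI position (first 32 bits of the ESP header, network byte order)
is from RFC 4303. The function names and the mixing step are made up
for illustration; real NICs typically use their own hash (for example
Toeplitz), so treat this as a model and not what any hardware does.

```c
#include <assert.h>
#include <stdint.h>

/* Read the 32-bit SPI from the start of an ESP header (RFC 4303).
 * The field is in network byte order, so assemble it byte by byte. */
static uint32_t esp_spi(const uint8_t *esp_hdr)
{
    return ((uint32_t)esp_hdr[0] << 24) | ((uint32_t)esp_hdr[1] << 16) |
           ((uint32_t)esp_hdr[2] << 8)  |  (uint32_t)esp_hdr[3];
}

/* Hypothetical mixing step so that nearby SPI values spread out.
 * This is an assumption for the sketch, not a NIC's real hash. */
static uint32_t mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bU;
    x ^= x >> 16;
    return x;
}

/* Map an SPI to one of nqueues receive queues. */
static unsigned int spi_to_queue(uint32_t spi, unsigned int nqueues)
{
    assert(nqueues > 0);
    return mix32(spi) % nqueues;
}
```

Note that every packet of one SA carries the same SPI and so maps to
the same queue. This kind of balancing only helps when traffic is
spread over several SAs; a single busy tunnel is still limited by one
cpu.
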
Fundamentally, both problems arise because synchronous crypto happens in
the softirq context. I'm wondering if it would make sense to execute
the synchronous crypto in a low-priority per-xfrm_state workqueue or
similar.
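
To make the idea a bit more concrete, below is a minimal
single-threaded model of the deferral. It is not kernel code and uses
no kernel APIs. The names (state_queue, sq_enqueue, sq_drain) and the
queue depth are hypothetical, and a real implementation would need
locking and an actual workqueue. The receive side only queues the
packet, and a separate worker does the expensive part with a budget.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SQ_CAP 8 /* hypothetical per-state queue depth */

struct state_queue {
    int pkts[SQ_CAP];        /* stand-in for packet pointers */
    size_t head;             /* next slot to dequeue */
    size_t count;            /* packets currently queued */
    unsigned long processed; /* packets handled by the worker */
};

/* "softirq" side: constant-time enqueue, no crypto here.
 * Returns false when the queue is full, i.e. the packet is dropped. */
static bool sq_enqueue(struct state_queue *q, int pkt)
{
    assert(q != NULL);
    if (q->count == SQ_CAP)
        return false;
    q->pkts[(q->head + q->count) % SQ_CAP] = pkt;
    q->count++;
    return true;
}

/* "worker" side: handle at most budget packets, then return so that
 * other work can run. Returns the number of packets handled. */
static size_t sq_drain(struct state_queue *q, size_t budget)
{
    size_t done = 0;

    assert(q != NULL);
    while (done < budget && q->count > 0) {
        int pkt = q->pkts[q->head];
        (void)pkt; /* the synchronous crypto would run here */
        q->head = (q->head + 1) % SQ_CAP;
        q->count--;
        q->processed++;
        done++;
    }
    return done;
}
```

In this model the bounded queue is what gives backpressure: when the
worker cannot keep up, packets are dropped at enqueue time instead of
the crypto eating all the cpu time in softirq context.
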
Any suggestions or comments?