Message-ID: <20130813092312.2493354e@vostro>
Date: Tue, 13 Aug 2013 09:23:12 +0300
From: Timo Teras <timo.teras@....fi>
To: Andrew Collins <bsderandrew@...il.com>
Cc: netdev@...r.kernel.org
Subject: Re: ipsec smp scalability and cpu use fairness (softirqs)
On Mon, 12 Aug 2013 15:58:41 -0600
Andrew Collins <bsderandrew@...il.com> wrote:
> On Mon, Aug 12, 2013 at 7:01 AM, Timo Teras <timo.teras@....fi> wrote:
> > 1. Single-core systems that are running out of cpu power are
> > overwhelmed in an uncontrollable manner. As softirq is doing the
> > heavy lifting, the userland processes are starved first. This can
> > cause the userland IKE daemon to starve and lose tunnels when it
> > is unable to answer liveness checks. The quick workaround is to
> > set up traffic shaping for the encrypted traffic.
>
> Which kernel version are you on? I've found I've had better behavior
> since:
>
> commit c10d73671ad30f54692f7f69f0e09e75d3a8926a
> Author: Eric Dumazet <edumazet@...gle.com>
> Date: Thu Jan 10 15:26:34 2013 -0800
>
> softirq: reduce latencies
>
> as it bails from lengthy softirq processing much earlier, along with
> tuning "netdev_budget" to avoid cycling for too long in the NAPI poll.

The user process starvation observations are originally from 3.3/3.4
kernels, and I have not retested properly yet with newer ones. I am
currently starting upgrades to 3.10. That commit looks like it should
directly fix most of the single-core starvation issues.

I think netdev_budget mostly affects latencies for other softirqs,
since the rx softirq will practically always be active during the
stress. It can also still cause problems that encrypted and
non-encrypted packets go through the same queues: when we run out of
CPU, we can start dropping even non-encrypted packets early.
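For reference, the knob lives at /proc/sys/net/core/netdev_budget. A
trivial way to poke it from C, equivalent to
"sysctl -w net.core.netdev_budget=N" (the value below is just the
current default, not a recommendation):

#include <stdio.h>

/* Set net.core.netdev_budget, the maximum number of packets processed
 * per NAPI softirq round, by writing the procfs file directly. */
static int set_netdev_budget(int budget)
{
	FILE *f = fopen("/proc/sys/net/core/netdev_budget", "w");

	if (!f)
		return -1;
	fprintf(f, "%d\n", budget);
	return fclose(f);
}

int main(void)
{
	return set_netdev_budget(300) ? 1 : 0;
}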
>
> > 2. On multicore (6-12 core) systems, it would appear that it is
> > not easy to distribute the ipsec work to multiple cores, as the
> > softirq is sticky to the cpu where it was raised. The ipsec
> > decryption/encryption is done synchronously in the napi poll loop,
> > so the throughput is limited by one cpu. If the NIC supports
> > multiple queues and balancing on the ESP SPI, we can use that to
> > get some parallelism.
>
> Although it's highly use-case dependent, I've had good luck using
> RPS. I'm testing as an ipsec router, however, not with an endpoint
> on the host itself, so it processes nearly all ipsec traffic in
> receive context.
Yes, RPS will help in many scenarios, but not all. The flow dissector
knows only IP/TCP/UDP/GRE, not ESP. So as long as traffic is spread
between different IP addresses, it gets distributed. But if I have a
lot of traffic between two nodes, either with different ESP SPIs
(different gatewayed subnets) or even with the same SPI, then it does
not. In my scenario it will usually even be the same SPI. So even if
the flow dissector learned ESP and used the SPI in the hash, I'd need
a way to balance traffic across multiple SAs.
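To illustrate the flow dissector half of that, here is a minimal
userspace sketch (not the actual kernel dissector; the function names
and the mixing helper are made up for illustration) of keying ESP
traffic by (saddr, daddr, SPI). Per RFC 4303 the SPI is the first 32
bits of the ESP header:

#include <arpa/inet.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy 32-bit mixer standing in for the kernel's jhash. */
static uint32_t mix(uint32_t a, uint32_t b, uint32_t c)
{
	uint32_t h = a ^ (b * 0x9e3779b9u) ^ (c * 0x85ebca6bu);

	h ^= h >> 16;
	return h * 0x7feb352du;
}

/* Flow hash for a raw IPv4 packet: ESP (protocol 50) is keyed by
 * (saddr, daddr, SPI), so different SAs between the same two hosts
 * can land on different CPUs. */
uint32_t esp_flow_hash(const uint8_t *pkt, size_t len)
{
	size_t ihl;
	uint32_t saddr, daddr, spi;

	if (len < 20)
		return 0;
	ihl = (size_t)(pkt[0] & 0x0f) * 4;
	memcpy(&saddr, pkt + 12, 4);
	memcpy(&daddr, pkt + 16, 4);

	if (pkt[9] == 50 && len >= ihl + 8) {
		memcpy(&spi, pkt + ihl, 4);	/* SPI comes first in ESP */
		return mix(saddr, daddr, ntohl(spi));
	}
	/* Non-ESP fallback: hash addresses and protocol only. */
	return mix(saddr, daddr, pkt[9]);
}

Note that even with this, all same-SPI traffic still hashes to a
single CPU, which is exactly why balancing across multiple SAs would
also be needed.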
I guess the place where I'd want to see the distribution to cores is
the crypto_aead_*() calls. In fact, it seems the infrastructure for
this already exists: crypto/cryptd.c. It appears it needs to be
configured manually, and only a few places, e.g. the aesni gcm parts,
use it.

I'm wondering if it would make sense to patch net/xfrm/xfrm_algo.c to
use cryptd? Or at least to have a Kconfig or sysctl option to make it
do so.
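As a concrete (untested) sketch of that idea: cryptd is a crypto API
template, so a caller can ask for a cryptd-wrapped AEAD by name and
have requests deferred to cryptd's workqueue instead of running
synchronously in the NAPI softirq. Whether "cryptd(gcm(aes))"
instantiates cleanly on every backend is an assumption on my part, and
xfrm would of course build the name from the SA's algorithm instead of
hardcoding it:

#include <linux/crypto.h>
#include <linux/err.h>
#include <linux/module.h>

static struct crypto_aead *tfm;

static int __init cryptd_aead_demo_init(void)
{
	/* Request gcm(aes) wrapped by the cryptd template. */
	tfm = crypto_alloc_aead("cryptd(gcm(aes))", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	pr_info("got driver %s\n",
		crypto_tfm_alg_driver_name(crypto_aead_tfm(tfm)));
	return 0;
}

static void __exit cryptd_aead_demo_exit(void)
{
	crypto_free_aead(tfm);
}

module_init(cryptd_aead_demo_init);
module_exit(cryptd_aead_demo_exit);
MODULE_LICENSE("GPL");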
- Timo