Date: Mon, 11 Jul 2022 14:38:39 -0600
From: James Yonan <james@...nvpn.net>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, therbert@...gle.com, stephen@...workplumber.org
Subject: Re: [PATCH net-next v2] rfs: added /proc/sys/net/core/rps_allow_ooo flag to tweak flow alg

On 6/28/22 17:49, James Yonan wrote:
> On 6/28/22 11:03, Jakub Kicinski wrote:
>> On Mon, 27 Jun 2022 23:17:54 -0600 James Yonan wrote:
>>> rps_allow_ooo (0|1, default=0) -- if set to 1, allow RFS (receive flow
>>> steering) to move a flow to a new CPU even if the old CPU queue has
>>> pending packets.  Note that this can result in packets being delivered
>>> out-of-order.  If set to 0 (the default), the previous behavior is
>>> retained, where flows will not be moved as long as pending packets
>>> remain.
>>>
>>> The motivation for this patch is that while it's good to prevent
>>> out-of-order packets, the current RFS logic requires that all previous
>>> packets for the flow have been dequeued before an RFS CPU switch is
>>> made, so as to preserve in-order delivery.  In some cases, on links
>>> with heavy VPN traffic, we have observed that this requirement is too
>>> onerous, and that it prevents an RFS CPU switch from occurring within
>>> a reasonable time frame if heavy traffic causes the old CPU queue to
>>> never fully drain.
>>>
>>> So rps_allow_ooo allows the user to select the tradeoff between a more
>>> aggressive RFS steering policy that may reorder packets on a CPU switch
>>> event (rps_allow_ooo=1) vs. one that prioritizes in-order delivery
>>> (rps_allow_ooo=0).
>>
>> Can you give a practical example where someone would enable this?
>> What is the traffic being served here that does not care about getting
>> severely chopped up?  Also why are you using RPS, it's 2022, don't all
>> devices of note have multi-queue support?
>
> So the problem with VPN transport is that you have encryption overhead
> that can be CPU intensive.  Suppose I can get 10 Gbps throughput per
> core.  Now suppose I have 4 different 10 Gbps sessions on my 4 core
> machine.  In a perfect world, each of those sessions would migrate to
> a different core and you would achieve the full parallelism of your
> hardware.  RFS helps to make this work, but the existing RFS algorithm
> sometimes gets stuck with multiple sessions on one core, while other
> cores are idle.  I found that this often occurs because RFS puts a
> high priority on maintaining in-order delivery, so once the queues are
> operating at full speed, it's very difficult to find an opportunity to
> switch CPUs without some packet reordering.  But the cost of being
> strict about packet reordering is that you end up with multiple
> sessions stuck on the same core, alongside idle cores.  This is solved
> by setting rps_allow_ooo to 1.  You might get a few reordered packets
> on the CPU switch event, but once the queues stabilize, you get
> significantly higher throughput because you can actively load balance
> the sessions across CPUs.
>
> Re: why are we still using RPS/RFS in 2022?  It's very useful for load
> balancing L4 sessions across multiple CPUs (not only across multiple
> net device queues).

Any further questions/comments about this patch?

The v2 patch incorporates all feedback received so far, including
refactoring a large conditional in the original code to make it more
readable and maintainable.

James
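
For readers skimming the archive, here is a minimal userspace model of
the queue-drain decision this knob relaxes.  The drain test mirrors the
flow-switch check in get_rps_cpu() (net/core/dev.c); the rps_allow_ooo
branch is reconstructed from the patch description in the thread, not
quoted from the v2 patch, and rfs_may_switch_cpu(), struct cpu_queue and
struct flow are illustrative simplifications, not kernel code.

/*
 * Userspace sketch of the RFS CPU-switch decision discussed above.
 * Assumptions are noted inline; this is not the actual patch.
 */
#include <stdbool.h>
#include <stdio.h>

struct cpu_queue {
	unsigned int input_queue_head;	/* packets dequeued so far */
};

struct flow {
	unsigned int last_qtail;	/* queue position just past the
					   flow's last enqueued packet */
};

static int rps_allow_ooo;		/* models the proposed sysctl knob */

static bool rfs_may_switch_cpu(const struct cpu_queue *old_cpu,
			       const struct flow *fl)
{
	/* Proposed behavior: move the flow immediately, accepting that
	 * packets still queued on the old CPU may be delivered after
	 * packets steered to the new CPU. */
	if (rps_allow_ooo)
		return true;

	/* Default behavior: move only once the old CPU has dequeued
	 * everything this flow enqueued, preserving in-order delivery.
	 * The signed cast handles wraparound of the unsigned counters. */
	return (int)(old_cpu->input_queue_head - fl->last_qtail) >= 0;
}

int main(void)
{
	struct cpu_queue busy = { .input_queue_head = 90 };
	struct flow fl = { .last_qtail = 100 };	/* 10 packets pending */

	rps_allow_ooo = 0;
	printf("rps_allow_ooo=0: may switch? %d\n",
	       rfs_may_switch_cpu(&busy, &fl));

	rps_allow_ooo = 1;
	printf("rps_allow_ooo=1: may switch? %d\n",
	       rfs_may_switch_cpu(&busy, &fl));
	return 0;
}

With the patch applied, an administrator would flip the tradeoff at
runtime by writing 1 to /proc/sys/net/core/rps_allow_ooo, the path named
in the subject line; the default of 0 keeps the existing strict in-order
behavior.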