Message-ID: <5ceef56b-9f7b-df36-17e4-1542d3306267@openvpn.net>
Date: Tue, 28 Jun 2022 17:49:08 -0600
From: James Yonan <james@...nvpn.net>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, therbert@...gle.com,
stephen@...workplumber.org
Subject: Re: [PATCH net-next v2] rfs: added /proc/sys/net/core/rps_allow_ooo
flag to tweak flow alg
On 6/28/22 11:03, Jakub Kicinski wrote:
> On Mon, 27 Jun 2022 23:17:54 -0600 James Yonan wrote:
>> rps_allow_ooo (0|1, default=0) -- if set to 1, allow RFS (receive flow
>> steering) to move a flow to a new CPU even if the old CPU queue has
>> pending packets. Note that this can result in packets being delivered
>> out-of-order. If set to 0 (the default), the previous behavior is
>> retained, where flows will not be moved as long as pending packets remain.
>>
>> The motivation for this patch is that while it's good to prevent
>> out-of-order packets, the current RFS logic requires that all previous
>> packets for the flow have been dequeued before an RFS CPU switch is made,
>> so as to preserve in-order delivery. In some cases, on links with heavy
>> VPN traffic, we have observed that this requirement is too onerous, and
>> that it prevents an RFS CPU switch from occurring within a reasonable time
>> frame if heavy traffic causes the old CPU queue to never fully drain.
>>
>> So rps_allow_ooo allows the user to select the tradeoff between a more
>> aggressive RFS steering policy that may reorder packets on a CPU switch
>> event (rps_allow_ooo=1) vs. one that prioritizes in-order delivery
>> (rps_allow_ooo=0).
> Can you give a practical example where someone would enable this?
> What is the traffic being served here that does not care about getting
> severely chopped up? Also why are you using RPS, it's 2022, don't all
> devices of note have multi-queue support?
So the problem with VPN transport is that you have encryption overhead
that can be CPU-intensive. Suppose I can get 10 Gbps of throughput per
core. Now suppose I have 4 different 10 Gbps sessions on my 4-core
machine. In a perfect world, each of those sessions would migrate to a
different core, and you would achieve the full parallelism of your
hardware. RFS helps make this work, but the existing RFS algorithm
sometimes gets stuck with multiple sessions on one core while other
cores sit idle. I found that this often happens because RFS puts a high
priority on maintaining in-order delivery: once the queues are
operating at full speed, it's very difficult to find an opportunity to
switch CPUs without some packet reordering.

The cost of being strict about preventing reordering is that sessions
end up stuck together on one core alongside idle cores. Setting
rps_allow_ooo to 1 solves this: you might get a few reordered packets
at the CPU switch event, but once the queues stabilize, you get
significantly higher throughput because you can actively load balance
the sessions across CPUs.
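
For reference, since the diff isn't quoted above: the in-order
constraint lives in the flow-move check in get_rps_cpu() in
net/core/dev.c, which refuses to move a flow to next_cpu until the old
CPU's backlog has dequeued everything the flow queued there. Gating
that check on the new sysctl would look roughly like the sketch below;
the placement of rps_allow_ooo here is my paraphrase of the changelog,
not the submitted patch:

	/* Existing flow-move check in get_rps_cpu(); the last clause
	 * is the drain check: the flow may only move once
	 * input_queue_head on the old CPU has passed the tail we
	 * recorded when this flow last enqueued there.  A
	 * hypothetical rps_allow_ooo knob simply bypasses it. */
	if (unlikely(tcpu != next_cpu) &&
	    (tcpu >= nr_cpu_ids || !cpu_online(tcpu) ||
	     rps_allow_ooo ||	/* new: move even if not drained */
	     ((int)(per_cpu(softnet_data, tcpu).input_queue_head -
		    rflow->last_qtail)) >= 0)) {
		tcpu = next_cpu;
		rflow = set_rps_cpu(dev, skb, rflow, next_cpu);
	}

Once merged, the knob would presumably be flipped with
"sysctl -w net.core.rps_allow_ooo=1" on top of the usual
rps_sock_flow_entries / rps_flow_cnt RFS setup.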
Re: why are we still using RPS/RFS in 2022? It's very useful for load
balancing L4 sessions across multiple CPUs (not only across multiple net
device queues).
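
To make that concrete: RSS/multi-queue only spreads flows across
hardware queues by hash, whereas RFS learns which CPU the consuming
application is actually running on. Every socket receive path calls
rps_record_sock_flow(), which stamps the flow's slot in the global
rps_sock_flow_table with the current CPU, and get_rps_cpu() then
steers packets for that hash toward it. Lightly simplified from
include/linux/netdevice.h (a sketch of the mechanism, not an exact
copy of any one kernel version):

	static inline void rps_record_sock_flow(struct rps_sock_flow_table *table,
						u32 hash)
	{
		if (table && hash) {
			unsigned int index = hash & table->mask;
			u32 val = hash & ~rps_cpu_mask;

			/* Only a hint, since preemption can move the
			 * task to another CPU under us. */
			val |= raw_smp_processor_id();

			if (table->ents[index] != val)
				table->ents[index] = val;
		}
	}

Because the steering follows the application thread rather than the
NIC's hash, this still helps on VPN-style virtual devices even when
the physical NICs underneath are multi-queue.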
James