Message-ID: <b84863e7-5acc-697c-0e08-af88b691e678@openvpn.net>
Date: Mon, 11 Jul 2022 14:38:39 -0600
From: James Yonan <james@...nvpn.net>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, therbert@...gle.com,
stephen@...workplumber.org
Subject: Re: [PATCH net-next v2] rfs: added /proc/sys/net/core/rps_allow_ooo flag to tweak flow alg
On 6/28/22 17:49, James Yonan wrote:
> On 6/28/22 11:03, Jakub Kicinski wrote:
>> On Mon, 27 Jun 2022 23:17:54 -0600 James Yonan wrote:
>>> rps_allow_ooo (0|1, default=0) -- if set to 1, allow RFS (receive flow
>>> steering) to move a flow to a new CPU even if the old CPU queue has
>>> pending packets. Note that this can result in packets being delivered
>>> out-of-order. If set to 0 (the default), the previous behavior is
>>> retained, where flows will not be moved as long as pending packets
>>> remain.
>>>
>>> The motivation for this patch is that while it's good to prevent
>>> out-of-order packets, the current RFS logic requires that all
>>> previous packets for the flow have been dequeued before an RFS CPU
>>> switch is made, so as to preserve in-order delivery. In some cases,
>>> on links with heavy VPN traffic, we have observed that this
>>> requirement is too onerous, and that it prevents an RFS CPU switch
>>> from occurring within a reasonable time frame if heavy traffic
>>> causes the old CPU queue to never fully drain.
>>>
>>> So rps_allow_ooo allows the user to select the tradeoff between a more
>>> aggressive RFS steering policy that may reorder packets on a CPU switch
>>> event (rps_allow_ooo=1) vs. one that prioritizes in-order delivery
>>> (rps_allow_ooo=0).
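
For reference, the drain requirement described above corresponds to a
check in get_rps_cpu() in net/core/dev.c; paraphrased (exact code differs
across kernel versions), the flow is only steered to next_cpu when the old
CPU is unset, offline, or its backlog has advanced past the last packet
enqueued for this flow. The new knob itself is toggled by writing 0 or 1
to /proc/sys/net/core/rps_allow_ooo.

        /* Sketch of the existing CPU-switch check (paraphrased from
         * mainline, not the patched code).
         */
        if (unlikely(tcpu != next_cpu) &&
            (tcpu >= nr_cpu_ids || !cpu_online(tcpu) ||
             ((int)(per_cpu(softnet_data, tcpu).input_queue_head -
                    rflow->last_qtail)) >= 0)) {
                /* Everything previously enqueued for this flow on tcpu
                 * has been dequeued (or tcpu is unusable), so moving the
                 * flow cannot reorder its packets.
                 */
                tcpu = next_cpu;
                rflow = set_rps_cpu(dev, skb, rflow, next_cpu);
        }
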
>> Can you give a practical example where someone would enable this?
>> What is the traffic being served here that does not care about getting
>> severely chopped up? Also why are you using RPS, it's 2022, don't all
>> devices of note have multi-queue support?
>
> So the problem with VPN transport is that you have encryption overhead
> that can be CPU intensive. Suppose I can get 10 Gbps throughput per
> core. Now suppose I have 4 different 10 Gbps sessions on my 4-core
> machine. In a perfect world, each of those sessions would migrate to
> a different core and you would achieve the full parallelism of your
> hardware. RFS helps to make this work, but the existing RFS algorithm
> sometimes gets stuck with multiple sessions on one core, while other
> cores are idle. I found that this often occurs because RFS puts a
> high priority on maintaining in-order delivery, so once the queues are
> operating at full speed, it's very difficult to find an opportunity to
> switch CPUs without some packet reordering. But the cost of being
> strict about preventing packet reordering is that you end up with multiple
> sessions stuck on the same core, alongside idle cores. This is solved
> by setting rps_allow_ooo to 1. You might get a few reordered packets
> on the CPU switch event, but once the queues stabilize, you get
> significantly higher throughput because you can actively load balance
> the sessions across CPUs.
>
> Re: why are we still using RPS/RFS in 2022? It's very useful for load
> balancing L4 sessions across multiple CPUs (not only across multiple
> net device queues).
Any further questions/comments about this patch? The v2 patch
incorporates all feedback received so far, including refactoring a large
conditional in the original code to make it more readable and maintainable.
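
For readers without the patch in front of them, the refactor plus the new
knob might look roughly like the sketch below. The helper name, its exact
structure, and the rps_allow_ooo variable wiring are illustrative only,
not taken from the patch itself:

        /* Illustrative helper: would the flow reorder if moved off tcpu?
         * rps_allow_ooo stands in for however the sysctl value is
         * exposed inside the kernel.
         */
        static bool rps_may_switch_cpu(const struct rps_dev_flow *rflow,
                                       u32 tcpu)
        {
                /* No valid or online CPU currently assigned to the flow. */
                if (tcpu >= nr_cpu_ids || !cpu_online(tcpu))
                        return true;

                /* The old CPU's backlog has advanced past the last packet
                 * enqueued for this flow, so a switch cannot reorder it.
                 */
                if ((int)(per_cpu(softnet_data, tcpu).input_queue_head -
                          rflow->last_qtail) >= 0)
                        return true;

                /* Otherwise switch only if the administrator opted in to
                 * possible out-of-order delivery (rps_allow_ooo=1).
                 */
                return rps_allow_ooo;
        }

The call site in get_rps_cpu() would then reduce to something like:

        if (unlikely(tcpu != next_cpu) && rps_may_switch_cpu(rflow, tcpu)) {
                tcpu = next_cpu;
                rflow = set_rps_cpu(dev, skb, rflow, next_cpu);
        }
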
James