Message-ID: <d36c51ae-1832-effa-ee5c-fdebdeec5c5c@redhat.com>
Date: Thu, 9 Sep 2021 13:49:55 +0200
From: Jesper Dangaard Brouer <jbrouer@...hat.com>
To: Íñigo Huguet <ihuguet@...hat.com>,
ecree.xilinx@...il.com, habetsm.xilinx@...il.com,
davem@...emloft.net, kuba@...nel.org,
Ivan Babrou <ivan@...udflare.com>,
Marek Majkowski <marek@...udflare.com>,
Jakub Sitnicki <jakub@...udflare.com>,
Toke Hoiland Jorgensen <toke@...hat.com>,
Freysteinn Alfredsson <Freysteinn.Alfredsson@....se>
Cc: brouer@...hat.com, ast@...nel.org, daniel@...earbox.net,
hawk@...nel.org, john.fastabend@...il.com, netdev@...r.kernel.org,
linux-kernel@...r.kernel.org, bpf@...r.kernel.org
Subject: Re: [PATCH net 0/2] sfc: fallback for lack of xdp tx queues
Great work, Huguet. The patches LGTM; I would have ACKed them, but they
have already been applied:
Here is the summary with links:
- [net,1/2] sfc: fallback for lack of xdp tx queues
https://git.kernel.org/netdev/net/c/415446185b93
- [net,2/2] sfc: last resort fallback for lack of xdp tx queues
https://git.kernel.org/netdev/net/c/6215b608a8c4
Heads-up to Cloudflare (Cc'ed) about these improvements.
And heads-up to Toke and Frey on patch 2/2, as it creates push-back via
TX queue stop/restart logic (see kernel API netif_tx_queue_stopped).
XDP currently doesn't handle this well, but I hope to see XDP queueing
work from your side to improve the situation ;-)
On 09/09/2021 11.28, Íñigo Huguet wrote:
> If there are not enough hardware resources to allocate one tx queue per
> CPU for XDP, the XDP_TX and XDP_REDIRECT actions were unavailable, and
> using them resulted each time in the packet being dropped and this
> message in the logs: XDP TX failed (-22)
>
> These patches implement 2 fallback solutions for 2 different situations
> that might happen:
> 1. There are not enough free resources for all the tx queues, but there
> are some free resources available
> 2. There are not enough free resources at all for tx queues.
>
> Both solutions are based on sharing tx queues, using __netif_tx_lock for
> synchronization. In the second case, as there are no XDP TX queues to
> share, network stack queues are used instead, but since we're taking
> __netif_tx_lock, concurrent access to the queues is correctly protected.
>
> The solution for this second case might affect the performance of both
> XDP traffic and normal traffic due to lock contention if both are used
> intensively. That's why I call it a "last resort" fallback: it's not a
> desirable situation, but at least we have XDP TX working.
>
> Some tests have shown good results and indicate that the non-fallback
> case is not harmed by these changes. They are also promising for the
> fallback cases. This is the test:
> 1. From another machine, send a high volume of packets with pktgen,
> using the script samples/pktgen/pktgen_sample04_many_flows.sh
> 2. In the tested machine, run samples/bpf/xdp_rxq_info with arguments
> "-a XDP_TX --swapmac" and see the results
> 3. In the tested machine, also run pktgen_sample04 to generate heavy
> normal TX traffic, and see how the xdp_rxq_info results vary
>
> Note that this test doesn't check the worst situations for the fallback
> solutions, because XDP_TX will only be executed from the CPUs that
> process sfc's queues, and not from every CPU in the system, so the
> performance drop due to the highest locking contention doesn't happen.
> I'd like to test that, as well, but I don't have access right now to a
> proper environment.
>
> Test results:
>
> Without doing TX:
> Before changes: ~2,900,000 pps
> After changes, 1 queue/core: ~2,900,000 pps
> After changes, 2 queues/core: ~2,900,000 pps
> After changes, 8 queues/core: ~2,900,000 pps
> After changes, borrowing from network stack: ~2,900,000 pps
>
> With multiflow TX at the same time:
> Before changes: ~1,700,000 - 2,900,000 pps
> After changes, 1 queue/core: ~1,700,000 - 2,900,000 pps
> After changes, 2 queues/core: ~1,700,000 pps
> After changes, 8 queues/core: ~1,700,000 pps
> After changes, borrowing from network stack: 1,150,000 pps
>
> Sporadic "XDP TX failed (-5)" warnings are shown when running the xdp
> program and pktgen simultaneously. This was expected, because XDP
> doesn't have any buffering system to absorb bursts if the NIC is under
> very high pressure. Thousands of these warnings are shown in the case
> of borrowing net stack queues. As I said before, this was also expected.
>
>
> Íñigo Huguet (2):
> sfc: fallback for lack of xdp tx queues
> sfc: last resort fallback for lack of xdp tx queues
>
> drivers/net/ethernet/sfc/efx_channels.c | 98 ++++++++++++++++++-------
> drivers/net/ethernet/sfc/net_driver.h | 8 ++
> drivers/net/ethernet/sfc/tx.c | 29 ++++++--
> 3 files changed, 99 insertions(+), 36 deletions(-)
>