Message-ID: <20211120083107.z2cm7tkl2rsri2v7@gmail.com>
Date: Sat, 20 Nov 2021 08:31:07 +0000
From: Martin Habets <habetsm.xilinx@...il.com>
To: Íñigo Huguet <ihuguet@...hat.com>
Cc: Edward Cree <ecree.xilinx@...il.com>, netdev@...r.kernel.org,
Dinan Gunawardena <dinang@...inx.com>
Subject: Re: Bad performance in RX with sfc 40G
On Thu, Nov 18, 2021 at 04:14:35PM +0100, Íñigo Huguet wrote:
> Hello,
>
> Doing some tests a few weeks ago, I noticed very low RX performance
> with 40G Solarflare NICs. In tests with iperf3 I got more than
> 30Gbps in TX, but only around 15Gbps in RX. NICs from other
> vendors could both send and receive at over 30Gbps.
>
> I was doing the tests with multiple threads in iperf3 (-P 8).
>
> The models used are SFC9140 and SFC9220.
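[Editor's note: the test setup described above reduces to roughly the
following iperf3 invocations. The address is a placeholder; the exact
command line used was not given in the thread.]

```shell
# On the receiver under test:
iperf3 -s

# On the peer: 8 parallel streams (-P 8) measures TX from the client's
# point of view; adding -R reverses the direction so the same run
# measures RX on the machine being tested.
iperf3 -c 192.0.2.1 -P 8
iperf3 -c 192.0.2.1 -P 8 -R
```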
>
> Perf showed that most of the time was being spent in
> `native_queued_spin_lock_slowpath`. Tracing the calls to it with
> bpftrace, I found that most of them came from __napi_poll > efx_poll
> > efx_fast_push_rx_descriptors > __alloc_pages >
> get_page_from_freelist > ...
>
> Please can you help me investigate the issue? At first sight, it
> looks like a suboptimal memory allocation strategy, or perhaps a
> failure in the page recycling strategy...
Apologies for the late reply. I'm having trouble digging out a
machine to test this.
If you're testing without the IOMMU enabled, I suspect the recycle ring
size may be too small. Can you try the patch below?
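[Editor's note: a quick way to check which case applies is to look for
registered IOMMUs in sysfs; a sketch, using standard Linux paths.]

```shell
# If /sys/class/iommu has entries, an IOMMU is active and the driver
# uses the large (4096-page) recycle ring; otherwise it falls back to
# the small no-IOMMU size that this patch enlarges.
if [ -d /sys/class/iommu ] && [ -n "$(ls -A /sys/class/iommu 2>/dev/null)" ]; then
    echo "IOMMU: enabled"
else
    echo "IOMMU: disabled or not present"
fi
```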
Martin
diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c
index 68fc7d317693..e37702bf3380 100644
--- a/drivers/net/ethernet/sfc/rx_common.c
+++ b/drivers/net/ethernet/sfc/rx_common.c
@@ -28,7 +28,7 @@ MODULE_PARM_DESC(rx_refill_threshold,
* the number of pages to store in the RX page recycle ring.
*/
#define EFX_RECYCLE_RING_SIZE_IOMMU 4096
-#define EFX_RECYCLE_RING_SIZE_NOIOMMU (2 * EFX_RX_PREFERRED_BATCH)
+#define EFX_RECYCLE_RING_SIZE_NOIOMMU 1024
/* RX maximum head room required.
*
>
> This is the output of bpftrace, the 2 call chains that repeat more
> times, both from sfc
>
> @[
> native_queued_spin_lock_slowpath+1
> _raw_spin_lock+26
> rmqueue_bulk+76
> get_page_from_freelist+2295
> __alloc_pages+214
> efx_fast_push_rx_descriptors+640
> efx_poll+660
> __napi_poll+42
> net_rx_action+547
> __softirqentry_text_start+208
> __irq_exit_rcu+179
> common_interrupt+131
> asm_common_interrupt+30
> cpuidle_enter_state+199
> cpuidle_enter+41
> do_idle+462
> cpu_startup_entry+25
> start_kernel+2465
> secondary_startup_64_no_verify+194
> ]: 2650
> @[
> native_queued_spin_lock_slowpath+1
> _raw_spin_lock+26
> rmqueue_bulk+76
> get_page_from_freelist+2295
> __alloc_pages+214
> efx_fast_push_rx_descriptors+640
> efx_poll+660
> __napi_poll+42
> net_rx_action+547
> __softirqentry_text_start+208
> __irq_exit_rcu+179
> common_interrupt+131
> asm_common_interrupt+30
> cpuidle_enter_state+199
> cpuidle_enter+41
> do_idle+462
> cpu_startup_entry+25
> secondary_startup_64_no_verify+194
> ]: 17119
>
> --
> Íñigo Huguet