Message-ID: <20220510185639.1c6d6c8a@kernel.org>
Date:   Tue, 10 May 2022 18:56:39 -0700
From:   Jakub Kicinski <kuba@...nel.org>
To:     Robert Hancock <robert.hancock@...ian.com>
Cc:     netdev@...r.kernel.org, radhey.shyam.pandey@...inx.com,
        davem@...emloft.net, edumazet@...gle.com, pabeni@...hat.com,
        michal.simek@...inx.com, linux-arm-kernel@...ts.infradead.org
Subject: Re: [PATCH net-next v5] net: axienet: Use NAPI for TX completion path

On Mon,  9 May 2022 11:30:39 -0600 Robert Hancock wrote:
> This driver was using the TX IRQ handler to perform all TX completion
> tasks. Under heavy TX network load, this can cause significant irqs-off
> latencies (found to be in the hundreds of microseconds using ftrace).
> This can cause other issues, such as overrunning serial UART FIFOs when
> using high baud rates with limited UART FIFO sizes.
> 
> Switch to using a NAPI poll handler to perform the TX completion work
> to get this out of hard IRQ context and avoid the IRQ latency impact.
> A separate poll handler is used for TX and RX since they have separate
> IRQs on this controller, so that the completion work for each of them
> stays on the same CPU as the interrupt.
> 
> Testing on a Xilinx MPSoC ZU9EG platform using iperf3 from a Linux PC
> through a switch at 1G link speed showed no significant change in TX or
> RX throughput, with approximately 941 Mbps before and after. Hard IRQ
> time in the TX throughput test was significantly reduced from 12% to
> below 1% on the CPU handling TX interrupts, with total hard+soft IRQ CPU
> usage dropping from about 56% down to 48%.
> 
> Signed-off-by: Robert Hancock <robert.hancock@...ian.com>
> ---
> 
> Changed since v4: Added locking to protect TX ring tail pointer against
> concurrent access by TX transmit and TX poll paths.

Hi, sorry for the late reply; there are just too many patches to look at
lately.

The lock is slightly concerning: the driver follows the usual wake-up
scheme based on memory barriers. If we add the lock, we should probably
take the barriers out.
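
(For reference, the scheme in question is the stock netdev stop/wake
pattern; the sketch below is schematic, reconstructed from that pattern
rather than copied from the driver. Per the hunk further down,
axienet_check_tx_bd_space() returns true when there is *not* enough
room.)

	/* xmit side: stop the queue, then re-check under a full barrier
	 * so a concurrent completion cannot free space between the check
	 * and the stop.
	 */
	if (axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1)) {
		netif_stop_queue(ndev);
		smp_mb();	/* pairs with smp_mb() in the TX poll path */
		if (axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1))
			return NETDEV_TX_BUSY;
		netif_start_queue(ndev);
	}

	/* TX NAPI poll side: free descriptors first, then wake */
	smp_mb();	/* make freed slots visible before the stopped check */
	if (netif_queue_stopped(ndev) &&
	    !axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1))
		netif_wake_queue(ndev);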

We can also try to avoid the lock and drill into what the actual issue
is. At a quick look, it seems a barrier is missing between the setup of
the descriptors and kicking the transfer off:

diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
index d6fc3f7acdf0..9e244b73a0ca 100644
--- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
+++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c
@@ -878,10 +878,11 @@ axienet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
 	cur_p->skb = skb;
 
 	tail_p = lp->tx_bd_p + sizeof(*lp->tx_bd_v) * lp->tx_bd_tail;
-	/* Start the transfer */
-	axienet_dma_out_addr(lp, XAXIDMA_TX_TDESC_OFFSET, tail_p);
 	if (++lp->tx_bd_tail >= lp->tx_bd_num)
 		lp->tx_bd_tail = 0;
+	wmb(); /* possibly dma_wmb() */
+	/* Start the transfer */
+	axienet_dma_out_addr(lp, XAXIDMA_TX_TDESC_OFFSET, tail_p);
 
 	/* Stop queue if next transmit may not have space */
 	if (axienet_check_tx_bd_space(lp, MAX_SKB_FRAGS + 1)) {
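
(Editorial aside on the hedge in that comment, drawn from
Documentation/memory-barriers.txt rather than from this thread:
dma_wmb() only orders the CPU's writes to the coherent descriptor
memory against each other, e.g. buffer fields before the control word,
while ordering those writes before the MMIO doorbell takes wmb() or
the barrier that a non-relaxed writel() already implies. On the
consumer side the same descriptors are typically read like the sketch
below; the field and mask names are recalled from the driver and
should be double-checked:)

	/* TX completion: check the hardware-written status word first,
	 * and read the rest of the descriptor only after dma_rmb(), so
	 * we never consume fields the DMA engine is still writing back.
	 */
	status = cur_p->status;
	if (!(status & XAXIDMA_BD_STS_COMPLETE_MASK))
		break;
	dma_rmb();
	dev_consume_skb_any(cur_p->skb);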
