netdev - Re: [PATCH net-next] net: axienet: Use NAPI for TX completion path

Date:   Thu, 5 May 2022 20:15:26 +0000
From:   Robert Hancock <robert.hancock@...ian.com>
To:     "kuba@...nel.org" <kuba@...nel.org>
CC:     "harinik@...inx.com" <harinik@...inx.com>,
        "michals@...inx.com" <michals@...inx.com>,
        "pabeni@...hat.com" <pabeni@...hat.com>,
        "edumazet@...gle.com" <edumazet@...gle.com>,
        "linux-arm-kernel@...ts.infradead.org" 
        <linux-arm-kernel@...ts.infradead.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "radheys@...inx.com" <radheys@...inx.com>,
        "davem@...emloft.net" <davem@...emloft.net>
Subject: Re: [PATCH net-next] net: axienet: Use NAPI for TX completion path

On Thu, 2022-05-05 at 12:56 -0600, Robert Hancock wrote:
> On Thu, 2022-05-05 at 11:08 -0700, Jakub Kicinski wrote:
> > On Thu, 5 May 2022 17:33:39 +0000 Robert Hancock wrote:
> > > On Wed, 2022-05-04 at 19:20 -0700, Jakub Kicinski wrote:
> > > > On Mon, 2 May 2022 19:30:51 +0000 Radhey Shyam Pandey wrote:  
> > > > > Thanks for the patch. I assume for simulating heavy network load we
> > > > > are using netperf/iperf. Do we have some details on the benchmark
> > > > > before and after adding TX NAPI? I want to see the impact on
> > > > > throughput.  
> > > > 
> > > > Seems like a reasonable ask, let's get the patch reposted 
> > > > with the numbers in the commit message.  
> > > 
> > > > Didn't mean to ignore that request, looks like I didn't get Radhey's
> > > > email directly, odd.
> > > 
> > > > I did a test with iperf3 from the board (Xilinx MPSoC ZU9EG platform)
> > > > connected to a Linux PC via a switch at 1G link speed. With TX NAPI in
> > > > place I saw about 942 Mbps for TX rate, with the previous code I saw
> > > > 941 Mbps. RX speed was also unchanged at 941 Mbps. So no real
> > > > significant change either way. I can spin another version of the patch
> > > > that includes these numbers.
> > 
> > Sounds like line rate, is there a difference in CPU utilization?
> 
> Some measurements on that from the TX load case - in both cases the RX and TX
> IRQs ended up being split across CPU0 and CPU3 due to irqbalance:
> 
> Before:
> 
> CPU0 (RX): 1% hard IRQ, 13% soft IRQ
> CPU3 (TX): 12% hard IRQ, 30% soft IRQ
> 
> After:
> 
> CPU0 (RX): <1% hard IRQ, 29% soft IRQ
> CPU3 (TX): <1% hard IRQ, 21% soft IRQ
> 
> The hard IRQ time is definitely lower, and the total CPU usage is lower as
> well (56% down to 50%). It's interesting that so much of the CPU load ended
> up on the CPU with the RX IRQ though, presumably because the RX and TX IRQs
> are triggering the same NAPI poll operation. Since they're separate IRQs
> that can be on separate CPUs, it might be a win to use separate NAPI poll
> structures for RX and TX so that both CPUs aren't trying to hit the same
> rings (TX and RX)?

Indeed, it appears that separate RX and TX NAPI polling lowers the CPU usage
overall by a few percent as well as keeping the TX work on the same CPU as the
TX IRQ. I'll submit a v3 with these changes and will include the softirq
numbers in the commit text.
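
For reference, the 56% and 50% totals quoted above can be reproduced by
summing the per-CPU figures, as in the rough sketch below. It only uses the
percentages from the quoted measurements; counting each "<1%" entry as 0 is
an assumption made purely for the arithmetic, and the names before, after
and totals are illustrative.

```python
# Per-CPU hard/soft IRQ percentages copied from the quoted measurements.
# ASSUMPTION: entries reported as "<1%" are counted as 0 here.
before = {
    "CPU0 (RX)": {"hard": 1, "soft": 13},
    "CPU3 (TX)": {"hard": 12, "soft": 30},
}
after = {
    "CPU0 (RX)": {"hard": 0, "soft": 29},
    "CPU3 (TX)": {"hard": 0, "soft": 21},
}


def totals(sample):
    """Return (hard IRQ %, soft IRQ %, combined %) summed over both CPUs."""
    hard = sum(cpu["hard"] for cpu in sample.values())
    soft = sum(cpu["soft"] for cpu in sample.values())
    return hard, soft, hard + soft


for label, sample in (("Before", before), ("After", after)):
    hard, soft, combined = totals(sample)
    print(f"{label}: hard IRQ {hard}%, soft IRQ {soft}%, total {combined}%")
```

With those inputs, hard IRQ time goes from 13% to roughly 0% while soft IRQ
time rises from 43% to 50%, so the net saving comes from the hard IRQ side.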

-- 
Robert Hancock
Senior Hardware Designer, Calian Advanced Technologies
www.calian.com