Date: Tue, 8 Aug 2023 13:33:02 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Wei Fang <wei.fang@....com>, Jesper Dangaard Brouer <hawk@...nel.org>,
 Jesper Dangaard Brouer <jbrouer@...hat.com>, Jakub Kicinski <kuba@...nel.org>
Cc: "davem@...emloft.net" <davem@...emloft.net>,
 "edumazet@...gle.com" <edumazet@...gle.com>,
 "pabeni@...hat.com" <pabeni@...hat.com>, Shenwei Wang
 <shenwei.wang@....com>, Clark Wang <xiaoning.wang@....com>,
 "ast@...nel.org" <ast@...nel.org>,
 "daniel@...earbox.net" <daniel@...earbox.net>,
 "john.fastabend@...il.com" <john.fastabend@...il.com>,
 "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
 dl-linux-imx <linux-imx@....com>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "bpf@...r.kernel.org" <bpf@...r.kernel.org>, Andrew Lunn <andrew@...n.ch>
Subject: Re: [PATCH V3 net-next] net: fec: add XDP_TX feature support



On 08/08/2023 07.02, Wei Fang wrote:
>>> For XDP_REDIRECT, the performance is shown below.
>>> root@...8mpevk:~# ./xdp_redirect eth1 eth0 Redirecting from eth1
>>> (ifindex 3; driver st_gmac) to eth0 (ifindex 2; driver fec)
>>
>> This is not exactly the same as the XDP_TX setup, as here you chose to
>> redirect from eth1 (driver st_gmac) to eth0 (driver fec).
>>
>> I would like to see eth0 to eth0 XDP_REDIRECT, so we can compare to
>> XDP_TX performance.
>> Sorry for all the requests, but can you provide those numbers?
>>
> 
> Oh, sorry, I thought what you wanted were XDP_REDIRECT results for different
> NICs. Below is the result of XDP_REDIRECT on the same NIC.
> root@...8mpevk:~# ./xdp_redirect eth0 eth0
> Redirecting from eth0 (ifindex 2; driver fec) to eth0 (ifindex 2; driver fec)
> Summary        232,302 rx/s        0 err,drop/s      232,344 xmit/s
> Summary        234,579 rx/s        0 err,drop/s      234,577 xmit/s
> Summary        235,548 rx/s        0 err,drop/s      235,549 xmit/s
> Summary        234,704 rx/s        0 err,drop/s      234,703 xmit/s
> Summary        235,504 rx/s        0 err,drop/s      235,504 xmit/s
> Summary        235,223 rx/s        0 err,drop/s      235,224 xmit/s
> Summary        234,509 rx/s        0 err,drop/s      234,507 xmit/s
> Summary        235,481 rx/s        0 err,drop/s      235,482 xmit/s
> Summary        234,684 rx/s        0 err,drop/s      234,683 xmit/s
> Summary        235,520 rx/s        0 err,drop/s      235,520 xmit/s
> Summary        235,461 rx/s        0 err,drop/s      235,461 xmit/s
> Summary        234,627 rx/s        0 err,drop/s      234,627 xmit/s
> Summary        235,611 rx/s        0 err,drop/s      235,611 xmit/s
>    Packets received    : 3,053,753
>    Average packets/s   : 234,904
>    Packets transmitted : 3,053,792
>    Average transmit/s  : 234,907
>>
>> I'm puzzled that moving the MMIO write doesn't change performance.
>>
>> Can you please verify that the packet generator machine is sending more
>> frames than the system can handle?
>>
>> (meaning: is the pktgen_sample03_burst_single_flow.sh script fast enough?)
>>
> 
> Thanks very much!
> That reminds me: in the previous tests I always started the pktgen script
> first and then ran the xdp2 program. So when I stopped the script, the
> transmit speed it reported for the generator was always greater than the
> speed of XDP_TX. But actually, the real-time transmit speed of the generator
> had degraded to roughly the speed of XDP_TX.
> 

Good that we finally found the root cause. That explains why it seemed
our code changes didn't have any effect.  The generator gets affected
and slowed down by the traffic that is bounced back to it. (I tried
to hint at this earlier with the Ethernet Flow-Control settings.)

> So I turned off the rx function of the generator, to keep the traffic returned
> from xdp2 from increasing the CPU load on the generator.

How did you turn off the rx function of the generator?
(I have a couple of tricks I use)

> And I tested
> the performance again. Below are the results.
> 
> Result 1: current method
> root@...8mpevk:~# ./xdp2 eth0
> proto 17:     326539 pkt/s
> proto 17:     326464 pkt/s
> proto 17:     326528 pkt/s
> proto 17:     326465 pkt/s
> proto 17:     326550 pkt/s
> 
> Result 2: sync_dma_len method
> root@...8mpevk:~# ./xdp2 eth0
> proto 17:     353918 pkt/s
> proto 17:     352923 pkt/s
> proto 17:     353900 pkt/s
> proto 17:     352672 pkt/s
> proto 17:     353912 pkt/s
> 

This looks more promising:
  ((353912/326550)-1)*100 = 8.38% faster.

Or gaining/saving approx 237 nanosec per packet ((1/326550-1/353912)*10^9).
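
The arithmetic above can be reproduced with a few lines of Python. This is
only a sketch of the calculation: the function and variable names are made up
for illustration, and the inputs are the last pkt/s sample from Result 1 and
Result 2.

```python
# Packet rates taken from the last sample of each xdp2 run quoted above.
baseline_pps = 326550   # Result 1: current method
improved_pps = 353912   # Result 2: sync_dma_len method


def speedup_percent(old_pps, new_pps):
    """Relative throughput gain of new_pps over old_pps, in percent."""
    return (new_pps / old_pps - 1) * 100


def saved_ns_per_packet(old_pps, new_pps):
    """Per-packet time saved in nanoseconds (1/pps is seconds per packet)."""
    return (1 / old_pps - 1 / new_pps) * 1e9


print(f"{speedup_percent(baseline_pps, improved_pps):.2f}% faster")
print(f"{saved_ns_per_packet(baseline_pps, improved_pps):.1f} ns saved per packet")
```

Converting to time per packet is useful because a fixed cost removed from the
per-packet path shows up as a constant number of nanoseconds, whatever the
baseline rate is.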

> Note: the speed of the generator is about 935397pps.
> 
> Comparing result 1 with result 2, the "sync_dma_len" method actually improves
> the performance of XDP_TX, so the conclusion from the previous tests is *incorrect*.
> I'm so sorry for that. :(
> 

I'm happy that we finally found the root cause.
Thanks for doing all the tests I asked for.

> In addition, I also tried the "dma_sync_len" + not use xdp_convert_buff_to_frame()
> method, and the performance improved further. Below is the result.
> 
> Result 3: sync_dma_len + not use xdp_convert_buff_to_frame() method
> root@...8mpevk:~# ./xdp2 eth0
> proto 17:     369261 pkt/s
> proto 17:     369267 pkt/s
> proto 17:     369206 pkt/s
> proto 17:     369214 pkt/s
> proto 17:     369126 pkt/s
> 
> Therefore, I intend to use the "dma_sync_len" + not use xdp_convert_buff_to_frame()
> method in the V5 patch. Thank you again, Jesper and Jakub. You really helped me a lot. :)
> 

I suggest that the V5 patch still uses xdp_convert_buff_to_frame(), and that
you then send a followup patch (or a 2/2 patch) that removes the use of
xdp_convert_buff_to_frame() for XDP_TX.  This way it is easier to keep
track of the changes and improvements.
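
To see what each of the two changes contributes on its own, the three quoted
runs can be averaged and compared step by step. A rough sketch; the labels
and the helper name are only illustrative, and the samples are copied from
the xdp2 output quoted in this mail.

```python
from statistics import mean

# pkt/s samples copied from the three xdp2 runs quoted in this mail.
results = {
    "current method": [326539, 326464, 326528, 326465, 326550],
    "sync_dma_len": [353918, 352923, 353900, 352672, 353912],
    "sync_dma_len, no xdp_convert_buff_to_frame()":
        [369261, 369267, 369206, 369214, 369126],
}


def step_gains(runs):
    """Return (name, mean pps, gain in % over the previous run) per run."""
    rows = []
    prev = None
    for name, samples in runs.items():
        avg = mean(samples)
        # The first run has no predecessor, so it has no gain figure.
        gain = None if prev is None else (avg / prev - 1) * 100
        rows.append((name, avg, gain))
        prev = avg
    return rows


for name, avg, gain in step_gains(results):
    note = "baseline" if gain is None else f"+{gain:.2f}% vs previous step"
    print(f"{name:45s} {avg:10.1f} pkt/s  {note}")
```

Each row is compared against the row before it, which mirrors having one
patch per change: the gain of each step can be attributed to that patch alone.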

I would be very interested in knowing whether the MMIO test results change
after this correction to the testlab/generator.

--Jesper
