Message-ID: <51c6e099-b915-4597-9f5a-3c51b1a4e2c6@intel.com>
Date: Thu, 5 Dec 2024 11:38:11 +0100
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: Daniel Xu <dxu@...uu.xyz>, Jakub Kicinski <kuba@...nel.org>
CC: Lorenzo Bianconi <lorenzo.bianconi@...hat.com>, Lorenzo Bianconi
<lorenzo@...nel.org>, "bpf@...r.kernel.org" <bpf@...r.kernel.org>, "Alexei
Starovoitov" <ast@...nel.org>, Daniel Borkmann <daniel@...earbox.net>,
"Andrii Nakryiko" <andrii@...nel.org>, John Fastabend
<john.fastabend@...il.com>, Jesper Dangaard Brouer <hawk@...nel.org>, Martin
KaFai Lau <martin.lau@...ux.dev>, David Miller <davem@...emloft.net>, Eric
Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
<netdev@...r.kernel.org>
Subject: Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase
From: Daniel Xu <dxu@...uu.xyz>
Date: Wed, 04 Dec 2024 13:51:08 -0800
>
>
> On Wed, Dec 4, 2024, at 8:42 AM, Alexander Lobakin wrote:
>> From: Jakub Kicinski <kuba@...nel.org>
>> Date: Tue, 3 Dec 2024 16:51:57 -0800
>>
>>> On Tue, 3 Dec 2024 12:01:16 +0100 Alexander Lobakin wrote:
>>>>>> @ Jakub,
>>>>>
>>>>> Context? What doesn't work and why?
>>>>
>>>> My tests show the same perf as with Lorenzo's series, but I test with a
>>>> UDP trafficgen. Daniel tests TCP, and his results are much worse than
>>>> with Lorenzo's implementation.
>>>> I suspect this is related to how NAPI performs flushes / decides
>>>> whether to repoll again or exit, vs. how the kthread does it (even
>>>> though I also try to flush only every 64 frames or when the ring is
>>>> empty). Or maybe to the fact that the kthread part runs in process
>>>> context outside any softirq, while with NAPI the whole loop runs
>>>> inside the RX softirq.
>>>>
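For context, the batching mentioned above looks roughly like this in the
kthread loop. This is a heavily simplified sketch, not the actual patch:
cpu_map_dequeue() and CPUMAP_GRO_FLUSH_THRESH are made-up names, and it
assumes the map entry carries a napi_struct purely to hold GRO state. The
only point is *when* the flush happens, since the kthread has no NAPI
budget / repoll boundary to piggyback on:

/* Simplified sketch of a cpumap kthread RX loop with GRO batching, as it
 * would sit inside kernel/bpf/cpumap.c. This only illustrates the
 * "flush every 64 frames or when the ring drains" policy.
 */
#define CPUMAP_GRO_FLUSH_THRESH	64	/* illustrative value */

static void cpu_map_kthread_rx(struct bpf_cpu_map_entry *rcpu)
{
	unsigned int since_flush = 0;
	struct sk_buff *skb;

	/* cpu_map_dequeue() is a hypothetical helper returning the next
	 * skb built from the entry's ring.
	 */
	while ((skb = cpu_map_dequeue(rcpu))) {
		napi_gro_receive(&rcpu->napi, skb);

		/* No softirq budget boundary here, so flush explicitly
		 * once enough frames have been aggregated...
		 */
		if (++since_flush >= CPUMAP_GRO_FLUSH_THRESH) {
			napi_gro_flush(&rcpu->napi, false);
			since_flush = 0;
		}
	}

	/* ...or when the ring is empty and the kthread is about to sleep. */
	napi_gro_flush(&rcpu->napi, false);
}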
>>>> Jesper said that he'd like to see cpumap still using its own kthread,
>>>> so that its priority can be boosted separately from the backlog. That's
>>>> why we asked you whether it would be fine to have cpumap as threaded
>>>> NAPI with regard to all this :D
>>>
>>> Certainly not without a clear understanding of what the problem with
>>> a kthread is.
>>
>> Yes, sure thing.
>>
>> The bad thing is that I can't reproduce Daniel's problem >_< Previously,
>> I was testing with the UDP trafficgen and got up to an 80% improvement
>> over the baseline. Now I've tested TCP and got up to a 70% improvement,
>> with no regressions whatsoever =\
>>
>> I don't know where the regression on Daniel's setup comes from. Is it a
>> multi-thread or a single-thread test?
>
> 8 threads with 16 flows over them (-T8 -F16)
>
>> What app do you use: iperf, netperf,
>> neper, Microsoft's app (forgot the name)?
>
> neper, tcp_stream.
Let me recheck with neper -T8 -F16; I'll post my results soon.
>
>> Do you have multiple NUMA nodes on your system? Are you sure you didn't
>> cross the node when redirecting with the GRO patches, and that no other
>> NUMA mismatches happened?
>
> Single node. Technically EPYC with NPS=1, so there are some NUMA
> characteristics, but I think the interconnect is supposed to hide them
> fairly efficiently.
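Good. Just to spell out what I meant by "crossing the node": the concern is
the XDP program picking a cpumap target CPU that sits on a different NUMA
node than the NIC's RX queue, so every redirected frame pays the cross-node
cost. A minimal cpumap redirect program along these lines (purely
illustrative, not your setup; the CPU selection is the part that matters):

// SPDX-License-Identifier: GPL-2.0
/* Minimal XDP cpumap redirect sketch, only to show where the target CPU
 * (and hence its NUMA node) gets chosen.
 */
#include <linux/bpf.h>
#include <bpf/bpf_helpers.h>

struct {
	__uint(type, BPF_MAP_TYPE_CPUMAP);
	__uint(max_entries, 64);
	__type(key, __u32);
	__type(value, struct bpf_cpumap_val);
} cpu_map SEC(".maps");

SEC("xdp")
int xdp_redirect_cpu(struct xdp_md *ctx)
{
	/* If this index maps to a CPU on another node than the RX queue,
	 * all redirected traffic crosses the interconnect.
	 */
	__u32 cpu = ctx->rx_queue_index % 64;

	return bpf_redirect_map(&cpu_map, cpu, 0);
}

char _license[] SEC("license") = "GPL";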
>
>> Some other random stuff like RSS hash key, which affects flow steering?
>
> Whatever the default is - I'd be willing to bet Kuba set up the
> configuration at one point or another, so it's probably sane. And with 5
> runs it seems unlikely the hashing would get unlucky and cause an
> imbalance.
>
>>
>> Thanks,
>> Olek
>
> Since I've got the setup handy and am motivated to see this work land,
> do you have any other pointers for things I should look for? I'll spend
> some time looking at profiles to see if I can identify any hot spots
> compared to softirq-based GRO.
>
> Thanks,
> Daniel
Thanks for helping with this!
Olek