Message-Id: <b319014e-519c-4c2d-8b6d-1632357e66cd@app.fastmail.com>
Date: Wed, 13 Nov 2024 15:39:13 -0800
From: "Daniel Xu" <dxu@...uu.xyz>
To: "Alexander Lobakin" <aleksander.lobakin@...el.com>
Cc: "Lorenzo Bianconi" <lorenzo@...nel.org>,
 "bpf@...r.kernel.org" <bpf@...r.kernel.org>,
 "Jakub Kicinski" <kuba@...nel.org>, "Alexei Starovoitov" <ast@...nel.org>,
 "Daniel Borkmann" <daniel@...earbox.net>,
 "Andrii Nakryiko" <andrii@...nel.org>,
 "John Fastabend" <john.fastabend@...il.com>,
 "Jesper Dangaard Brouer" <hawk@...nel.org>,
 "Martin KaFai Lau" <martin.lau@...ux.dev>,
 "David Miller" <davem@...emloft.net>, "Eric Dumazet" <edumazet@...gle.com>,
 "Paolo Abeni" <pabeni@...hat.com>, netdev@...r.kernel.org,
 "Lorenzo Bianconi" <lorenzo.bianconi@...hat.com>
Subject: Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase



On Tue, Nov 12, 2024, at 9:43 AM, Alexander Lobakin wrote:
> From: Alexander Lobakin <aleksander.lobakin@...el.com>
> Date: Tue, 22 Oct 2024 17:51:43 +0200
>
>> From: Alexander Lobakin <aleksander.lobakin@...el.com>
>> Date: Wed, 9 Oct 2024 14:50:42 +0200
>> 
>>> From: Lorenzo Bianconi <lorenzo@...nel.org>
>>> Date: Wed, 9 Oct 2024 14:47:58 +0200
>>>
>>>>> From: Lorenzo Bianconi <lorenzo@...nel.org>
>>>>> Date: Wed, 9 Oct 2024 12:46:00 +0200
>>>>>
>>>>>>> Hi Lorenzo,
>>>>>>>
>>>>>>> On Mon, Sep 16, 2024 at 12:13:42PM GMT, Lorenzo Bianconi wrote:
>>>>>>>> Add GRO support to cpumap codebase moving the cpu_map_entry kthread to a
>>>>>>>> NAPI-kthread pinned on the selected cpu.
>>>>>>>>
>>>>>>>> Changes in rfc v2:
>>>>>>>> - get rid of dummy netdev dependency
>>>>>>>>
>>>>>>>> Lorenzo Bianconi (3):
>>>>>>>>   net: Add napi_init_for_gro routine
>>>>>>>>   net: add napi_threaded_poll to netdevice.h
>>>>>>>>   bpf: cpumap: Add gro support
>>>>>>>>
>>>>>>>>  include/linux/netdevice.h |   3 +
>>>>>>>>  kernel/bpf/cpumap.c       | 123 ++++++++++++++++----------------------
>>>>>>>>  net/core/dev.c            |  27 ++++++---
>>>>>>>>  3 files changed, 73 insertions(+), 80 deletions(-)
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> 2.46.0
>>>>>>>>
>>>>>>>
>>>>>>> Sorry about the long delay - finally caught up to everything after
>>>>>>> conferences.
>>>>>>>
>>>>>>> I re-ran my synthetic tests (including baseline). v2 is somehow showing
>>>>>>> 2x bigger gains than v1 (~30% vs ~14%) for tcp_stream. Again, the only
>>>>>>> variable I changed is kernel version - steering prog is active for both.
>>>>>>>
>>>>>>>
>>>>>>> Baseline (again)
>>>>>>>
>>>>>>> ./tcp_rr -c -H $TASK_IP -p 50,90,99 -T4 -F8 -l30                          ./tcp_stream -c -H $TASK_IP -T8 -F16 -l30
>>>>>>>
>>>>>>>          Transactions  Latency P50 (s)  Latency P90 (s)  Latency P99 (s)           Throughput (Mbit/s)
>>>>>>> Run 1    2560252       0.00009087       0.00010495       0.00011647       Run 1    15479.31
>>>>>>> Run 2    2665517       0.00008575       0.00010239       0.00013311       Run 2    15162.48
>>>>>>> Run 3    2755939       0.00008191       0.00010367       0.00012287       Run 3    14709.04
>>>>>>> Run 4    2595680       0.00008575       0.00011263       0.00012671       Run 4    15373.06
>>>>>>> Run 5    2841865       0.00007999       0.00009471       0.00012799       Run 5    15234.91
>>>>>>> Average  2683850.6     0.000084854      0.00010367       0.00012543       Average  15191.76
>>>>>>>
>>>>>>> cpumap NAPI patches v2
>>>>>>>
>>>>>>>          Transactions  Latency P50 (s)  Latency P90 (s)  Latency P99 (s)           Throughput (Mbit/s)
>>>>>>> Run 1    2577838       0.00008575       0.00012031       0.00013695       Run 1    19914.56
>>>>>>> Run 2    2729237       0.00007551       0.00013311       0.00017663       Run 2    20140.92
>>>>>>> Run 3    2689442       0.00008319       0.00010495       0.00013311       Run 3    19887.48
>>>>>>> Run 4    2862366       0.00008127       0.00009471       0.00010623       Run 4    19374.49
>>>>>>> Run 5    2700538       0.00008319       0.00010367       0.00012799       Run 5    19784.49
>>>>>>> Average  2711884.2     0.000081782      0.00011135       0.000136182      Average  19820.388
>>>>>>> Delta    1.04%         -3.62%           7.41%            8.57%                     30.47%
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Daniel
>>>>>>
>>>>>> Hi Daniel,
>>>>>>
>>>>>> cool, thx for testing it.
>>>>>>
>>>>>> @Olek: how do we want to proceed on it? Are you still working on it or do you want me
>>>>>> to send a regular patch for it?
>>>>>
>>>>> Hi,
>>>>>
>>>>> I had a small vacation, sorry. I'm starting working on it again today.
>>>>
>>>> ack, no worries. Are you going to rebase the other patches on top of it
>>>> or are you going to try a different approach?
>>>
>>> I'll try the approach without NAPI as Kuba asks and let Daniel test it,
>>> then we'll see.
>> 
>> For now, I have the same results without NAPI as with your series, so
>> I'll push it soon and let Daniel test.
>> 
>> (I simply decoupled GRO and NAPI and used the former in cpumap, but the
>>  kthread logic didn't change)
>> 
>>>
>>> BTW I'm curious how he got this boost on v2, from what I see you didn't
>>> change the implementation that much?
>
> Hi Daniel,
>
> Sorry for the delay. Please test [0].
>
> [0] https://github.com/alobakin/linux/commits/cpumap-old
>
> Thanks,
> Olek

Ack. Will do probably early next week.
