Message-ID: <01dcfecc-ab8e-43b8-b20c-96cc476a826d@intel.com>
Date: Tue, 12 Nov 2024 18:43:22 +0100
From: Alexander Lobakin <aleksander.lobakin@...el.com>
To: Daniel Xu <dxu@...uu.xyz>
CC: Lorenzo Bianconi <lorenzo@...nel.org>, <bpf@...r.kernel.org>,
<kuba@...nel.org>, <ast@...nel.org>, <daniel@...earbox.net>,
<andrii@...nel.org>, <john.fastabend@...il.com>, <hawk@...nel.org>,
<martin.lau@...ux.dev>, <davem@...emloft.net>, <edumazet@...gle.com>,
<pabeni@...hat.com>, <netdev@...r.kernel.org>, <lorenzo.bianconi@...hat.com>
Subject: Re: [RFC/RFT v2 0/3] Introduce GRO support to cpumap codebase
From: Alexander Lobakin <aleksander.lobakin@...el.com>
Date: Tue, 22 Oct 2024 17:51:43 +0200
> From: Alexander Lobakin <aleksander.lobakin@...el.com>
> Date: Wed, 9 Oct 2024 14:50:42 +0200
>
>> From: Lorenzo Bianconi <lorenzo@...nel.org>
>> Date: Wed, 9 Oct 2024 14:47:58 +0200
>>
>>>> From: Lorenzo Bianconi <lorenzo@...nel.org>
>>>> Date: Wed, 9 Oct 2024 12:46:00 +0200
>>>>
>>>>>> Hi Lorenzo,
>>>>>>
>>>>>> On Mon, Sep 16, 2024 at 12:13:42PM GMT, Lorenzo Bianconi wrote:
>>>>>>> Add GRO support to cpumap codebase moving the cpu_map_entry kthread to a
>>>>>>> NAPI-kthread pinned on the selected cpu.
>>>>>>>
>>>>>>> Changes in rfc v2:
>>>>>>> - get rid of dummy netdev dependency
>>>>>>>
>>>>>>> Lorenzo Bianconi (3):
>>>>>>> net: Add napi_init_for_gro routine
>>>>>>> net: add napi_threaded_poll to netdevice.h
>>>>>>> bpf: cpumap: Add gro support
>>>>>>>
>>>>>>> include/linux/netdevice.h | 3 +
>>>>>>> kernel/bpf/cpumap.c | 123 ++++++++++++++++----------------------
>>>>>>> net/core/dev.c | 27 ++++++---
>>>>>>> 3 files changed, 73 insertions(+), 80 deletions(-)
>>>>>>>
>>>>>>> --
>>>>>>> 2.46.0
>>>>>>>
>>>>>>
>>>>>> Sorry about the long delay - finally caught up to everything after
>>>>>> conferences.
>>>>>>
>>>>>> I re-ran my synthetic tests (including baseline). v2 is somehow showing
>>>>>> 2x bigger gains than v1 (~30% vs ~14%) for tcp_stream. Again, the only
>>>>>> variable I changed is kernel version - steering prog is active for both.
>>>>>>
>>>>>>
>>>>>> Baseline (again)
>>>>>>
>>>>>> ./tcp_rr -c -H $TASK_IP -p 50,90,99 -T4 -F8 -l30
>>>>>>
>>>>>>          Transactions  Latency P50 (s)  Latency P90 (s)  Latency P99 (s)
>>>>>> Run 1    2560252       0.00009087       0.00010495       0.00011647
>>>>>> Run 2    2665517       0.00008575       0.00010239       0.00013311
>>>>>> Run 3    2755939       0.00008191       0.00010367       0.00012287
>>>>>> Run 4    2595680       0.00008575       0.00011263       0.00012671
>>>>>> Run 5    2841865       0.00007999       0.00009471       0.00012799
>>>>>> Average  2683850.6     0.000084854      0.00010367       0.00012543
>>>>>>
>>>>>> ./tcp_stream -c -H $TASK_IP -T8 -F16 -l30
>>>>>>
>>>>>>          Throughput (Mbit/s)
>>>>>> Run 1    15479.31
>>>>>> Run 2    15162.48
>>>>>> Run 3    14709.04
>>>>>> Run 4    15373.06
>>>>>> Run 5    15234.91
>>>>>> Average  15191.76
>>>>>>
>>>>>> cpumap NAPI patches v2
>>>>>>
>>>>>> tcp_rr:
>>>>>>
>>>>>>          Transactions  Latency P50 (s)  Latency P90 (s)  Latency P99 (s)
>>>>>> Run 1    2577838       0.00008575       0.00012031       0.00013695
>>>>>> Run 2    2729237       0.00007551       0.00013311       0.00017663
>>>>>> Run 3    2689442       0.00008319       0.00010495       0.00013311
>>>>>> Run 4    2862366       0.00008127       0.00009471       0.00010623
>>>>>> Run 5    2700538       0.00008319       0.00010367       0.00012799
>>>>>> Average  2711884.2     0.000081782      0.00011135       0.000136182
>>>>>> Delta    1.04%         -3.62%           7.41%            8.57%
>>>>>>
>>>>>> tcp_stream:
>>>>>>
>>>>>>          Throughput (Mbit/s)
>>>>>> Run 1    19914.56
>>>>>> Run 2    20140.92
>>>>>> Run 3    19887.48
>>>>>> Run 4    19374.49
>>>>>> Run 5    19784.49
>>>>>> Average  19820.388
>>>>>> Delta    30.47%
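
For anyone re-checking the tcp_stream numbers quoted above, the Average and
Delta rows can be reproduced with a few lines of Python. This is only a
sketch: the per-run throughput values are copied from the tables in this
message, while the variable names and the percent_delta() helper are
made up for illustration and are not part of any tool used in the thread.

```python
from statistics import mean

# Per-run tcp_stream throughput in Mbit/s, copied from the quoted tables.
baseline_mbit = [15479.31, 15162.48, 14709.04, 15373.06, 15234.91]
patched_mbit = [19914.56, 20140.92, 19887.48, 19374.49, 19784.49]


def percent_delta(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent (hypothetical helper)."""
    return (after - before) / before * 100.0


baseline_avg = mean(baseline_mbit)
patched_avg = mean(patched_mbit)
delta = percent_delta(baseline_avg, patched_avg)

print(f"baseline average: {baseline_avg:.2f} Mbit/s")
print(f"patched average:  {patched_avg:.3f} Mbit/s")
print(f"delta:            {delta:.2f}%")
```

The delta is taken between the two averages rather than averaged per run,
which appears to match how the Delta row in the table was derived.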
>>>>>>
>>>>>> Thanks,
>>>>>> Daniel
>>>>>
>>>>> Hi Daniel,
>>>>>
>>>>> cool, thx for testing it.
>>>>>
>>>>> @Olek: how do we want to proceed on it? Are you still working on it or do you want me
>>>>> to send a regular patch for it?
>>>>
>>>> Hi,
>>>>
>>>> I had a small vacation, sorry. I'm starting to work on it again today.
>>>
>>> ack, no worries. Are you going to rebase the other patches on top of it
>>> or are you going to try a different approach?
>>
>> I'll try the approach without NAPI as Kuba asks and let Daniel test it,
>> then we'll see.
>
> For now, I have the same results without NAPI as with your series, so
> I'll push it soon and let Daniel test.
>
> (I simply decoupled GRO and NAPI and used the former in cpumap, but the
> kthread logic didn't change)
>
>>
>> BTW I'm curious how he got this boost on v2, from what I see you didn't
>> change the implementation that much?

Hi Daniel,

Sorry for the delay. Please test [0].

[0] https://github.com/alobakin/linux/commits/cpumap-old

Thanks,
Olek