Message-ID: <30ddb66a-aeea-480d-bf79-38fc06ea45b0@uwaterloo.ca>
Date: Wed, 4 Sep 2024 08:46:10 -0400
From: Martin Karsten <mkarsten@...terloo.ca>
To: Naman Gulati <namangulati@...gle.com>, Joe Damato <jdamato@...tly.com>,
Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
netdev@...r.kernel.org, Stanislav Fomichev <sdf@...ichev.me>,
linux-kernel@...r.kernel.org, skhawaja@...gle.com,
Willem de Bruijn <willemdebruijn.kernel@...il.com>
Subject: Re: [PATCH] Add provision to busyloop for events in ep_poll.

On 2024-09-04 01:52, Naman Gulati wrote:
> Thanks all for the comments, and apologies for the delay in replying.
> Stan and Joe, I’ve addressed some of the common concerns below.
>
> On Thu, Aug 29, 2024 at 3:40 AM Joe Damato <jdamato@...tly.com> wrote:
>>
>> On Wed, Aug 28, 2024 at 06:10:11PM +0000, Naman Gulati wrote:
>>> NAPI busy polling in ep_busy_loop loops on napi_poll and checks for new
>>> epoll events after every napi poll. Checking just for epoll events in a
>>> tight loop in kernel context delivers latency gains to applications
>>> that are not interested in NAPI busy polling with epoll.
>>>
>>> This patch adds an option to loop just for new events inside
>>> ep_busy_loop, guarded by the EPIOCSPARAMS ioctl that controls epoll
>>> NAPI busy polling.
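
For reference, EPIOCSPARAMS is the existing ioctl for configuring NAPI
busy polling on an epoll file descriptor. A minimal userspace sketch,
assuming the struct epoll_params / EPIOCSPARAMS definitions from the
uapi header <linux/eventpoll.h> (added in Linux 6.9) and purely
illustrative values; the extra "loop only for events" knob proposed by
this patch is not shown, since its exact shape is specific to the patch:

/* Sketch, not part of this patch: enable NAPI busy polling on an epoll fd. */
#include <stdio.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/eventpoll.h>	/* struct epoll_params, EPIOCSPARAMS */

static int enable_epoll_busy_poll(int epfd)
{
	struct epoll_params params;

	memset(&params, 0, sizeof(params));	/* __pad must be zero */
	params.busy_poll_usecs  = 64;	/* spin up to 64us inside epoll_wait() */
	params.busy_poll_budget = 16;	/* packets handled per napi_poll() pass */
	params.prefer_busy_poll = 1;	/* keep device IRQs deferred while polling */

	if (ioctl(epfd, EPIOCSPARAMS, &params) == -1) {
		perror("ioctl(EPIOCSPARAMS)");
		return -1;
	}
	return 0;
}
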
>>
>> This makes an API change, so I think that linux-api@...r.kernel.org
>> needs to be CC'd?
>>
>>> A comparison with neper tcp_rr shows that busylooping for events in
>>> epoll_wait boosted throughput by ~3-7% and reduced median latency by
>>> ~10%.
>>>
>>> To demonstrate the latency and throughput improvements, a comparison was
>>> made of neper tcp_rr running with:
>>> 1. (baseline) No busylooping
>>
>> Is there NAPI-based steering to threads via SO_INCOMING_NAPI_ID in
>> this case? More details, please, on locality. If there is no
>> NAPI-based flow steering in this case, perhaps the improvements you
>> are seeing are a result of both syscall overhead avoidance and data
>> locality?
>>
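
For reference, the NAPI-based steering Joe refers to is usually done by
reading SO_INCOMING_NAPI_ID on each accepted socket and keeping sockets
that share a NAPI ID on the same epoll instance/thread. A hedged sketch
of just the query; the per-worker mapping is an application-level
detail and is omitted here:

/* Sketch: read the NAPI ID recorded on an accepted/connected socket
 * (SO_INCOMING_NAPI_ID, kernel 4.12+). Sockets sharing a NAPI ID
 * arrived on the same RX queue and can be handled by the same thread. */
#include <stdio.h>
#include <sys/socket.h>

static int napi_id_of(int fd)
{
	unsigned int napi_id = 0;
	socklen_t len = sizeof(napi_id);

	if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &napi_id, &len) == -1) {
		perror("getsockopt(SO_INCOMING_NAPI_ID)");
		return -1;
	}
	return (int)napi_id;	/* 0 means no NAPI ID has been recorded yet */
}
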
>
> The benchmarks were run with no NAPI steering.
>
> Regarding syscall overhead, I reproduced the above experiment with
> mitigations=off and found similar results, which points to the gains
> coming from more than just avoiding syscall overhead.

I suppose the natural follow-up questions are:
1) Where do the gains come from?
2) Would they materialize with a realistic application?

System calls have some overhead even with mitigations=off. In fact, I
understand that on modern CPUs security mitigations are not that
expensive to begin with? In a micro-benchmark that does nothing but
bounce packets back and forth, this overhead might look more significant
than it would in a realistic application?

It seems your change does not eliminate any processing from each
packet's path, but instead eliminates processing in between packet
arrivals? This might lead to a small latency improvement, which might
turn into a small throughput improvement in these micro-benchmarks, but
that might quickly evaporate when an application has actual work to do
in between packet arrivals.
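
To make the syscall-overhead part concrete: an application can already
busy-wait from userspace by calling epoll_wait() with a zero timeout in
a loop, paying one syscall boundary per empty iteration, whereas the
patch keeps an equivalent loop inside ep_busy_loop(). A rough sketch of
the userspace variant, for comparison only:

/* Sketch: userspace spin over epoll_wait() with a zero timeout.
 * Every empty iteration still crosses the syscall boundary; the patch
 * under discussion instead repeats the event check inside ep_busy_loop(). */
#include <sys/epoll.h>

static int spin_wait(int epfd, struct epoll_event *events, int maxevents)
{
	int n;

	do {
		n = epoll_wait(epfd, events, maxevents, 0);	/* returns immediately */
	} while (n == 0);

	return n;	/* number of ready events, or -1 with errno set */
}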

It would be good to know a little more about your experiments. You are
referring to 5 threads, but does that mean 5 cores were busy on both
client and server during the experiment? Which of the client or the
server is the bottleneck? In your baseline experiment, are all 5 server
cores busy? How many RX queues are in play, and how is interrupt
routing configured?

Thanks,
Martin