Message-ID: <00918946-253e-43c9-a635-c91d870407b7@gmail.com>
Date: Tue, 30 Jul 2024 21:25:58 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: Olivier Langlois <olivier@...llion01.com>, io-uring@...r.kernel.org
Cc: netdev@...r.kernel.org
Subject: Re: io_uring NAPI busy poll RCU is causing 50 context switches/second
 to my sqpoll thread

On 7/30/24 21:05, Olivier Langlois wrote:
> if you are interested into all the details,
> 
> they are all here:
> https://github.com/axboe/liburing/issues/1190
> 
> it seems like I like to write a lot when I am investigating a problem.
> Pavel has been a great help, assisting me in understanding what was
> happening.
> 
> Next, I came to question where the RCU integration came from, and I
> found this:
> https://lore.kernel.org/all/89ef84bf-48c2-594c-cc9c-f796adcab5e8@kernel.dk/
> 
> I guess that in some use cases, being able to automatically manage
> hundreds of NAPI devices that can suddenly all be swept away during a
> device reconfiguration is something desirable to have for some...
> 
> but in my case, this is an excessively high price to pay for
> flexibility that I do not need at all.

Removing an entry or two once every minute is definitely not
going to take 50% CPU. The RCU machinery runs in the background
regardless of whether io_uring uses it or not, and it's pretty
cheap considering amortisation.

If anything, from your explanation it sounds more like the
scheduler makes a wrong decision and schedules out the sqpoll
thread even though it could continue to run, but that needs
confirmation. Does the CPU your SQPOLL thread is pinned to stay
100% utilised?


> I have a single NAPI device. Once I know what it is, it will practically
> remain immutable until termination.
> 
> For that reason, I am thinking that offering some sort of polymorphic
> NAPI device tracking strategy customization would be desirable.
> 
> The current one, the RCU-based one, I would call
> 
> dynamic_napi_tracking (rcu could be peppered into the name somewhere so
> people know what the strategy is up to)
> 
> whereas the new one that I am imagining would be called
> 
> static_napi_tracking.
> 
> NAPI devices would be added/removed by the user manually through an
> extended registration function.
> 
> For the sake of convenience, a clear_list operation could even be
> offered.
> 
> The benefits of this new static tracking strategy would be numerous:
> - this removes the need to invoke the heavy-duty RCU cavalry
> - no need to scan the list to remove stale devices
> - no need to search the list at each SQE submission to update the
> device timeout value
> 
> So is this a good idea in your opinion?

I believe that's a good thing. I've been prototyping a similar
if not the same approach just today, i.e. the user [un]registers a
napi instance by the id you can get with SO_INCOMING_NAPI_ID.
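
For illustration, here is a minimal userspace sketch of the
SO_INCOMING_NAPI_ID lookup such a registration would rely on. The
io_uring_register_napi_id() helper mentioned at the end is hypothetical
and only stands in for whatever registration interface ends up being
exposed:

#include <stdio.h>
#include <sys/socket.h>

#ifndef SO_INCOMING_NAPI_ID
#define SO_INCOMING_NAPI_ID 56
#endif

/* Query the NAPI id of the RX queue that delivered the last packet
 * received on this socket. Returns 0 if nothing has been received yet. */
static unsigned int get_incoming_napi_id(int sockfd)
{
	unsigned int napi_id = 0;
	socklen_t len = sizeof(napi_id);

	if (getsockopt(sockfd, SOL_SOCKET, SO_INCOMING_NAPI_ID,
		       &napi_id, &len) < 0) {
		perror("getsockopt(SO_INCOMING_NAPI_ID)");
		return 0;
	}
	return napi_id;
}

/* Hypothetical usage with a static tracking registration:
 *
 *	unsigned int id = get_incoming_napi_id(sockfd);
 *	if (id)
 *		io_uring_register_napi_id(ring, id);   // not a real API
 */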

-- 
Pavel Begunkov
