Message-ID: <00918946-253e-43c9-a635-c91d870407b7@gmail.com>
Date: Tue, 30 Jul 2024 21:25:58 +0100
From: Pavel Begunkov <asml.silence@...il.com>
To: Olivier Langlois <olivier@...llion01.com>, io-uring@...r.kernel.org
Cc: netdev@...r.kernel.org
Subject: Re: io_uring NAPI busy poll RCU is causing 50 context switches/second
to my sqpoll thread
On 7/30/24 21:05, Olivier Langlois wrote:
> if you are interested in all the details,
>
> they are all here:
> https://github.com/axboe/liburing/issues/1190
>
> it seems like I like to write a lot when I am investigating a problem.
> Pavel has been a great help in assisting me in understanding what was
> happening.
>
> Next, I came to question where the RCU integration came from, and I
> found this:
> https://lore.kernel.org/all/89ef84bf-48c2-594c-cc9c-f796adcab5e8@kernel.dk/
>
> I guess that in some use cases, being able to automatically manage
> hundreds of NAPI devices that can suddenly all be swept away during a
> device reconfiguration is something desirable to have for some...
>
> but in my case, this is an excessively high price to pay for
> flexibility that I do not need at all.
Removing an entry or two once every minute is definitely not
going to take 50% CPU. The RCU machinery is running in the
background regardless of whether io_uring uses it or not, and
it's pretty cheap considering amortisation.
If anything, from your explanation it sounds more like the
scheduler makes a wrong decision and schedules out the sqpoll
thread even though it could continue to run, but that needs
confirmation. Does the CPU your SQPOLL thread is pinned to stay
100% utilised?
> I have a single NAPI device. Once I know what it is, it will practically
> remain immutable until termination.
>
> For that reason, I am thinking that offering some sort of polymorphic
> NAPI device tracking strategy customization would be desirable.
>
> The current one, the RCU one, I would call it
>
> dynamic_napi_tracking (rcu could be peppered in the name somewhere so
> people know what the strategy is up to)
>
> whereas the new one that I am imagining would be called
>
> static_napi_tracking.
>
> NAPI devices would be added/removed by the user manually through an
> extended registration function.
>
> for the sake of convenience, a clear_list operation could even be
> offered.
>
> The benefits of this new static tracking strategy would be numerous:
> - this removes the need to invoke the heavy-duty RCU cavalry
> - no need to scan the list to remove stale devices
> - no need to search the list at each SQE submission to update the
> device timeout value
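
As a purely illustrative sketch, an interface along the lines proposed
above could look like this (io_uring_register_napi_id(),
io_uring_unregister_napi_id(), io_uring_clear_napi_list() and the
struct are made-up names, not an existing liburing API):

#include <liburing.h>
#include <linux/types.h>

/* hypothetical descriptor for one statically tracked NAPI instance */
struct io_uring_napi_reg {
        __u32 napi_id;          /* id obtained via SO_INCOMING_NAPI_ID */
        __u32 busy_poll_to;     /* busy poll timeout, usec */
        __u8  prefer_busy_poll;
        __u8  pad[3];
        __u64 resv;
};

/* add one NAPI id to the ring's static tracking list */
int io_uring_register_napi_id(struct io_uring *ring,
                              struct io_uring_napi_reg *reg);

/* remove one NAPI id from the list */
int io_uring_unregister_napi_id(struct io_uring *ring, __u32 napi_id);

/* the clear_list operation: drop every tracked id at once */
int io_uring_clear_napi_list(struct io_uring *ring);
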
>
> So is this a good idea in your opinion?
I believe that's a good thing. I've been prototyping a similar
if not the same approach just today, i.e. the user [un]registers
a napi instance by the id you can get with SO_INCOMING_NAPI_ID.
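
For reference, SO_INCOMING_NAPI_ID is a regular SOL_SOCKET option, so
fetching the id to feed into such a registration call is cheap to do
from userspace; a minimal sketch (only the fallback #define is taken
from the uapi headers, everything else is standard socket API):

#include <sys/socket.h>

#ifndef SO_INCOMING_NAPI_ID
#define SO_INCOMING_NAPI_ID 56  /* include/uapi/asm-generic/socket.h */
#endif

/*
 * Fetch the NAPI id of the device/queue that delivered the last packet
 * received on sockfd; 0 means no id has been recorded yet.
 */
static int get_incoming_napi_id(int sockfd, unsigned int *napi_id)
{
        socklen_t len = sizeof(*napi_id);

        *napi_id = 0;
        return getsockopt(sockfd, SOL_SOCKET, SO_INCOMING_NAPI_ID,
                          napi_id, &len);
}

Note the id only becomes valid after at least one packet has been
received on the socket, so it has to be queried after the first recv.
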
--
Pavel Begunkov