Message-ID: <a2aa8de0-a2d0-3381-3415-4b523c2b66a5@kernel.dk>
Date: Mon, 20 Jul 2020 09:49:02 -0600
From: Jens Axboe <axboe@...nel.dk>
To: Pavel Begunkov <asml.silence@...il.com>, io-uring@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/2] task_put batching
On 7/20/20 9:22 AM, Pavel Begunkov wrote:
> On 18/07/2020 17:37, Jens Axboe wrote:
>> On 7/18/20 2:32 AM, Pavel Begunkov wrote:
>>> For my somewhat exaggerated test case, perf continues to show high CPU
>>> consumption by io_dismantle(), and so by its caller, io_iopoll_complete().
>>> The patch doesn't yield a throughput increase for my setup, probably
>>> because the effect is hidden behind polling, but it definitely improves
>>> the relative percentages. And the difference should only grow with an
>>> increasing number of CPUs. Another reason to have this is that atomics
>>> may affect other parallel tasks (e.g. ones that don't use io_uring).
>>>
>>> before:
>>> io_iopoll_complete: 5.29%
>>> io_dismantle_req: 2.16%
>>>
>>> after:
>>> io_iopoll_complete: 3.39%
>>> io_dismantle_req: 0.465%
>>
>> Still not seeing a win here, but it's clean and it _should_ work. For
>> some reason, what I save in the task ref put ends up offset by growth in
>> fput_many(). Which doesn't (on the surface) make a lot of sense, but
>> may just mean that we have some weird side effects.
>
> It grows because the patch is garbage, the second condition is always false.
> See the diff. Could you please drop both patches?

Hah, indeed. With this on top, it looks like it should in terms of
performance and profiles.

I can just fold this into the existing one, if you'd like.

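For anyone following along, the batching idea under discussion can be
illustrated with a rough userspace sketch. This is not the patch itself:
struct task, struct put_batch, batch_put() and batch_flush() are all
made-up names, and the atomic_ops counter exists only to show how many
atomic updates were issued. The assumption is that consecutive completions
usually belong to the same task, so their reference drops can be merged
into one atomic subtraction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted task; not the kernel's task_struct. */
struct task {
    atomic_int usage;   /* reference count */
    int atomic_ops;     /* number of atomic updates issued, illustration only */
};

/* Accumulates pending puts for one task so they can be dropped at once. */
struct put_batch {
    struct task *task;
    int pending;
};

/* Drop all pending references with a single atomic subtraction. */
static void batch_flush(struct put_batch *b)
{
    if (b->task && b->pending > 0) {
        int old = atomic_fetch_sub(&b->task->usage, b->pending);
        assert(old >= b->pending);  /* never drop more refs than are held */
        b->task->atomic_ops++;
    }
    b->task = NULL;
    b->pending = 0;
}

/* Queue one put; flush first if the request belongs to a different task. */
static void batch_put(struct put_batch *b, struct task *t)
{
    if (b->task != t) {
        batch_flush(b);
        b->task = t;
    }
    b->pending++;
}
```

Calling batch_put() eight times for the same task and then batch_flush()
once touches the shared counter a single time instead of eight, which is
where the reduced cacheline contention would be expected to come from.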
--
Jens Axboe