linux-kernel - Re: [PATCH RFC 0/9] nxt propagation + locking optimisation

Message-ID: <cab8e903-fb6f-eae5-68a6-2a467160997e@gmail.com>
Date:   Sun, 1 Mar 2020 23:33:05 +0300
From:   Pavel Begunkov <asml.silence@...il.com>
To:     Jens Axboe <axboe@...nel.dk>, io-uring@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC 0/9] nxt propagation + locking optimisation

On 01/03/2020 22:14, Jens Axboe wrote:
> On 3/1/20 9:18 AM, Pavel Begunkov wrote:
>> There are several independent parts in the patchset, but bundled
>> to make a point.
>> 1-2: random stuff that is implicitly used later.
>> 3-5: restore @nxt propagation
>> 6-8: optimise locking in io_worker_handle_work()
>> 9: optimise io_uring refcounting
>>
>> The next propagation bits are done similarly as it was before, but
>> - nxt stealing is now done at the top level, not hidden in handlers
>> - ensure there is no with REQ_F_DONT_STEAL_NEXT
>>
>> [6-8] is the reason to dismiss the previous @nxt propagation approach;
>> I didn't find a good way to do the same with it, even though it looked
>> clearer and needed no new flag.
>>
>> I performance-tested it with link-of-nops + IOSQE_ASYNC:
>>
>> link size: 100
>> orig:  501 (ns per nop)
>> 0-8:   446
>> 0-9:   416
>>
>> link size: 10
>> orig:  826
>> 0-8:   776
>> 0-9:   756
> 
> This looks nice, I'll take a closer look tomorrow or later today. Seems
> that at least patch 2 should go into 5.6 however, so may make sense to
> order the series like that.

It's the first patch in the series that modifies io-wq.c, so it should be
fine to pick it from the middle as is.

> 
> BTW, Andres brought up a good point about how hashed file write work behaves.
> Generally they complete super fast (just copying into the page cache),
> which means that that worker will be hammering the wq lock a lot. Since
> work N+1 can't make any progress before N completes (since that's how
> hashed work works), we should pull a bigger batch of these work items
> instead of just one at a time. I think that'd potentially make a huge
> difference for the performance of buffered writes.

I had the same thought. It should be easy enough for hashed requests. General
batching, though, would make us think about fairness, work stealing, etc.
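
To make the hashed case concrete, here is a rough userspace sketch, not the
actual io-wq code. struct work, struct work_queue and wq_pop_hashed_batch()
are made-up names, and a pthread mutex stands in for the wq lock. It assumes
a single FIFO list in which items with an equal hash must run serially, and
it pulls the head plus every later item with the same hash (up to a cap)
under one lock acquisition:

```c
#include <assert.h>
#include <stddef.h>
#include <pthread.h>

/* Hypothetical work item; the layout is illustrative, not io-wq's. */
struct work {
    struct work *next;
    unsigned int hash;  /* items with an equal hash must run in order */
    int id;
};

/* Hypothetical queue: one FIFO list protected by one lock. */
struct work_queue {
    pthread_mutex_t lock;
    struct work *head;
    struct work *tail;
};

static void wq_init(struct work_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = q->tail = NULL;
}

static void wq_push(struct work_queue *q, struct work *w)
{
    w->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = w;
    else
        q->head = w;
    q->tail = w;
    pthread_mutex_unlock(&q->lock);
}

/*
 * Detach the first queued item and up to max - 1 later items that share
 * its hash, preserving queue order, while taking the lock only once.
 * Returns the batch as a NULL-terminated list (NULL if the queue is
 * empty) and stores the number of detached items in *n.
 */
static struct work *wq_pop_hashed_batch(struct work_queue *q, size_t max,
                                        size_t *n)
{
    struct work *batch = NULL, **btail = &batch;
    struct work *last_kept = NULL;
    struct work **pp;
    unsigned int hash = 0;
    size_t taken = 0;

    assert(max > 0);

    pthread_mutex_lock(&q->lock);
    if (q->head)
        hash = q->head->hash;
    pp = &q->head;
    while (*pp && taken < max) {
        struct work *w = *pp;

        if (w->hash == hash) {
            *pp = w->next;      /* unlink from the queue */
            w->next = NULL;
            *btail = w;         /* append to the batch */
            btail = &w->next;
            taken++;
        } else {
            last_kept = w;      /* stays queued for another worker */
            pp = &w->next;
        }
    }
    if (!*pp)                   /* walked to the end: the tail may have moved */
        q->tail = last_kept;
    pthread_mutex_unlock(&q->lock);

    *n = taken;
    return batch;
}
```

A worker would then run the returned list back to back without touching the
lock in between. The cap (max) is the crude fairness knob here; anything
smarter is exactly the general-batching question above.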

BTW, what's the point of hashing only the head of a link? It sounds like that
can lead to the very write-write collisions that hashing tries to avoid.

> 
> Just throwing it out there, since you're working in that space anyway
> and the rewards will be much larger.

I will take a look, but I'm not sure when; I still have some hunches of my own.

-- 
Pavel Begunkov