Message-ID: <c15b2d54-c722-8fb4-266f-b589c1a21aa5@gmail.com>
Date:   Mon, 23 Sep 2019 19:21:51 +0300
From:   Pavel Begunkov <asml.silence@...il.com>
To:     Ingo Molnar <mingo@...nel.org>, Jens Axboe <axboe@...nel.dk>
Cc:     Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/2] Optimise io_uring completion waiting

Hi, and thanks for the feedback.

It could indeed be done with @cond; that is how it works for now.
However, this patchset addresses performance issues only.

The problem with wait_event_*() is that, if we have a counter and
try to wake up tasks after each increment, each waiting task gets
scheduled O(threshold) times just to spuriously check @cond and go
back to sleep. All that overhead (memory barriers, register
save/restore, accounting, etc.) turned out to be enough to slow down
the system for some workloads.
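
To illustrate the cost described above, here is a small user-space model
in Python. It is only a sketch, not kernel code: the function names
count_wakeups_generic() and count_wakeups_threshold() are made up for this
example, and it assumes one wake-up attempt per counter increment.

```python
def count_wakeups_generic(thresholds, total):
    """Model wait_event_*()-style waiting (assumed behaviour): every
    increment schedules every sleeping waiter, which re-checks its
    condition and goes back to sleep if it does not hold yet."""
    wakeups = [0] * len(thresholds)
    done = [False] * len(thresholds)
    for counter in range(1, total + 1):
        for i, thr in enumerate(thresholds):
            if done[i]:
                continue
            wakeups[i] += 1      # task is scheduled just to check @cond
            if counter >= thr:
                done[i] = True   # condition holds, waiter stops waiting
    return wakeups


def count_wakeups_threshold(thresholds, total):
    """Model the threshold scheme: the waker evaluates the condition
    and schedules a waiter only once its threshold has been reached."""
    wakeups = [0] * len(thresholds)
    done = [False] * len(thresholds)
    for counter in range(1, total + 1):
        for i, thr in enumerate(thresholds):
            if not done[i] and counter >= thr:
                wakeups[i] += 1  # the only time this task is scheduled
                done[i] = True
    return wakeups
```

In this model a waiter with threshold T is scheduled T times under the
generic scheme and at most once under the threshold scheme. The model
ignores the per-wake-up costs themselves and only counts scheduling events.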

With this specialisation it still traverses a wait list and makes
indirect calls to the checker callback, but the list is supposedly
fairly small, so performance there shouldn't be a problem, at least for
now.

Regarding semantics: it should wake a task when the value passed to
wake_up_threshold() is greater than or equal to that task's threshold,
which is specified individually for each task in wait_threshold_*().

In pseudo code:
```
def wake_up_threshold(n, wait_queue):
	for waiter in wait_queue:
		waiter.wake_up_if(n >= waiter.threshold);
```
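
A runnable version of the pseudocode above might look like the following
Python sketch. The Waiter class and its fields are hypothetical, and
removing woken waiters from the queue is an assumption made for the
example, not something the patch description states.

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    # Hypothetical stand-in for a wait queue entry.
    name: str
    threshold: int       # per-task threshold, set when the task starts waiting
    woken: bool = False


def wake_up_threshold(n, wait_queue):
    """Wake every waiter whose threshold is <= n.

    Returns the waiters that are still sleeping (assumption: a woken
    waiter is dropped from the queue)."""
    remaining = []
    for waiter in wait_queue:
        if n >= waiter.threshold:
            waiter.woken = True       # stands in for actually waking the task
        else:
            remaining.append(waiter)  # threshold not reached, keep sleeping
    return remaining
```

For example, with waiters at thresholds 2 and 5, calling
wake_up_threshold(3, queue) would wake only the first one and leave the
second in the returned queue.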

Any thoughts on how to do it better? Ideas are very welcome.

BTW, this monster is mostly a copy-paste from wait_event_*() and
wait_bit_*(). We could try to extract some common parts from these
three, but that's another topic.


On 23/09/2019 11:35, Ingo Molnar wrote:
> 
> * Jens Axboe <axboe@...nel.dk> wrote:
> 
>> On 9/22/19 2:08 AM, Pavel Begunkov (Silence) wrote:
>>> From: Pavel Begunkov <asml.silence@...il.com>
>>>
>>> There could be a lot of overhead within generic wait_event_*() used for
>>> waiting for large number of completions. The patchset removes much of
>>> it by using custom wait event (wait_threshold).
>>>
>>> Synthetic test showed ~40% performance boost. (see patch 2)
>>
>> I'm fine with the io_uring side of things, but to queue this up we
>> really need Peter or Ingo to sign off on the core wakeup bits...
>>
>> Peter?
> 
> I'm not sure an extension is needed for such a special interface, why not 
> just put a ->threshold value next to the ctx->wait field and use either 
> the regular wait_event() APIs with the proper condition, or 
> wait_event_cmd() style APIs if you absolutely need something more complex 
> to happen inside?
> 
> Should result in a much lower linecount and no scheduler changes. :-)
> 
> Thanks,
> 
> 	Ingo
> 

-- 
Yours sincerely,
Pavel Begunkov


