lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <82f95b0f-8dd5-2fb5-7e17-b77072e86093@kernel.dk>
Date:   Tue, 24 Sep 2019 19:46:21 +0200
From:   Jens Axboe <axboe@...nel.dk>
To:     Pavel Begunkov <asml.silence@...il.com>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     Ingo Molnar <mingo@...nel.org>, Ingo Molnar <mingo@...hat.com>,
        linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 0/2] Optimise io_uring completion waiting

On 9/24/19 11:33 AM, Pavel Begunkov wrote:
> On 24/09/2019 16:13, Jens Axboe wrote:
>> On 9/24/19 5:23 AM, Pavel Begunkov wrote:
>>>> Yep that should do it, and saves 8 bytes of stack as well.
>>>>
>>>> BTW, did you test my patch, this one or the previous? Just curious if it
>>>> worked for you.
>>>>
>>> Not yet, going to do that tonight
>>
>> Thanks! For reference, the final version is below. There was still a
>> signal mishap in there, now it should all be correct afaict.
>>
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 9b84232e5cc4..d2a86164d520 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -2768,6 +2768,38 @@ static int io_ring_submit(struct io_ring_ctx *ctx, unsigned int to_submit,
>>   	return submit;
>>   }
>>   
>> +struct io_wait_queue {
>> +	struct wait_queue_entry wq;
>> +	struct io_ring_ctx *ctx;
>> +	unsigned to_wait;
>> +	unsigned nr_timeouts;
>> +};
>> +
>> +static inline bool io_should_wake(struct io_wait_queue *iowq)
>> +{
>> +	struct io_ring_ctx *ctx = iowq->ctx;
>> +
>> +	/*
>> +	 * Wake up if we have enough events, or if a timeout occured since we
>> +	 * started waiting. For timeouts, we always want to return to userspace,
>> +	 * regardless of event count.
>> +	 */
>> +	return io_cqring_events(ctx->rings) >= iowq->to_wait ||
>> +			atomic_read(&ctx->cq_timeouts) != iowq->nr_timeouts;
>> +}
>> +
>> +static int io_wake_function(struct wait_queue_entry *curr, unsigned int mode,
>> +			    int wake_flags, void *key)
>> +{
>> +	struct io_wait_queue *iowq = container_of(curr, struct io_wait_queue,
>> +							wq);
>> +
>> +	if (!io_should_wake(iowq))
>> +		return -1;
> 
> It would try to schedule only the first task in the wait list. Is that the
> semantic you want?
> E.g. for waiters=[32,8] and nr_events == 8, io_wake_function() returns
> after @32, and won't wake up the second one.

Right, those are the semantics I want. We keep the list ordered by using
the exclusive wait addition. Which means that for the case you list,
waiters=32 came first, and we should not wake others before that task
gets the completions it wants. Otherwise we could potentially starve
higher count waiters, if we always keep going and new waiters come in.

-- 
Jens Axboe

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ