[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <47a5641b-af81-0edf-1d71-4e3063ce8517@gmail.com>
Date: Sat, 14 Dec 2019 00:32:33 +0300
From: Pavel Begunkov <asml.silence@...il.com>
To: Jens Axboe <axboe@...nel.dk>, io-uring@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/1] io_uring: don't wait when under-submitting
On 13/12/2019 21:32, Jens Axboe wrote:
> On 12/13/19 11:22 AM, Jens Axboe wrote:
>> On 12/13/19 12:51 AM, Pavel Begunkov wrote:
>>> There is no reliable way to submit and wait in a single syscall, as
>>> io_submit_sqes() may under-consume sqes (in case of an early error).
>>> Then it will wait for not-yet-submitted requests, deadlocking the user
>>> in most cases.
>>
>> Why not just cap the wait_nr? If someone does to_submit = 8, wait_nr = 8,
>> and we only submit 4, just wait for 4? Ala:
>>
Is it worth entangling the code? I don't expect anyone trying to recover,
maybe except full reset/restart. So, failing ASAP seemed to me as the
right thing to do. It may also mean nothing to the user if e.g.
submit(1), submit(1), ..., submit_and_wait(1, n)
Anyway, this shouldn't even happen in a not buggy code, so I'm fine with
any version as long as it doesn't lock up. I'll resend if you still prefer
to cap it.
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 81219a631a6d..4a76ccbb7856 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -5272,6 +5272,10 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
>> submitted = io_submit_sqes(ctx, to_submit, f.file, fd,
>> &cur_mm, false);
>> mutex_unlock(&ctx->uring_lock);
>> + if (submitted <= 0)
>> + goto done;
>> + if (submitted != to_submit && min_complete > submitted)
>> + min_complete = submitted;
>> }
>> if (flags & IORING_ENTER_GETEVENTS) {
>> unsigned nr_events = 0;
>> @@ -5284,7 +5288,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
>> ret = io_cqring_wait(ctx, min_complete, sig, sigsz);
>> }
>> }
>> -
>> +done:
>> percpu_ref_put(&ctx->refs);
>> out_fput:
>> fdput(f);
>>
>
> This is probably a bit cleaner, since it only adjusts if we're going to
> wait.
>
>
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 81219a631a6d..e262549a2601 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -5272,11 +5272,15 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
> submitted = io_submit_sqes(ctx, to_submit, f.file, fd,
> &cur_mm, false);
> mutex_unlock(&ctx->uring_lock);
> + if (submitted <= 0)
> + goto done;
> }
> if (flags & IORING_ENTER_GETEVENTS) {
> unsigned nr_events = 0;
>
> min_complete = min(min_complete, ctx->cq_entries);
> + if (submitted != to_submit && min_complete > submitted)
> + min_complete = submitted;
>
> if (ctx->flags & IORING_SETUP_IOPOLL) {
> ret = io_iopoll_check(ctx, &nr_events, min_complete);
> @@ -5284,7 +5288,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
> ret = io_cqring_wait(ctx, min_complete, sig, sigsz);
> }
> }
> -
> +done:
> percpu_ref_put(&ctx->refs);
> out_fput:
> fdput(f);
>
--
Pavel Begunkov
Powered by blists - more mailing lists