[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1f9a5869-845a-f7ca-7530-49e407602023@kernel.dk>
Date: Sun, 26 Jan 2020 10:10:20 -0700
From: Jens Axboe <axboe@...nel.dk>
To: Andres Freund <andres@...razel.de>
Cc: linux-block@...r.kernel.org, io-uring <io-uring@...r.kernel.org>,
davem@...emloft.net, netdev@...r.kernel.org, jannh@...gle.com
Subject: Re: [PATCH 2/4] io_uring: io_uring: add support for async work
inheriting files
On 1/26/20 3:12 AM, Andres Freund wrote:
> Hi,
>
> On 2019-10-25 11:30:35 -0600, Jens Axboe wrote:
>> This is in preparation for adding opcodes that need to add new files
>> in a process file table, system calls like open(2) or accept4(2).
>>
>> If an opcode needs this, it must set IO_WQ_WORK_NEEDS_FILES in the work
>> item. If work that needs to get punted to async context have this
>> set, the async worker will assume the original task file table before
>> executing the work.
>>
>> Note that opcodes that need access to the current files of an
>> application cannot be done through IORING_SETUP_SQPOLL.
>
>
> Unfortunately this partially breaks sharing a uring across with forked
> off processes, even though it initially appears to work:
>
>
>> +static int io_uring_flush(struct file *file, void *data)
>> +{
>> + struct io_ring_ctx *ctx = file->private_data;
>> +
>> + io_uring_cancel_files(ctx, data);
>> + if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
>> + io_wq_cancel_all(ctx->io_wq);
>> + return 0;
>> +}
>
> Once one process having the uring fd open (even if it were just a fork
> never touching the uring, I believe) exits, this prevents the uring from
> being usable for any async tasks. The process exiting closes the fd,
> which triggers flush. io_wq_cancel_all() sets IO_WQ_BIT_CANCEL, which
> never gets unset, which causes all future async sqes to be be
> immediately returned as -ECANCELLED by the worker, via io_req_cancelled.
>
> It's not clear to me why a close() should cancel the the wq (nor clear
> the entire backlog, after 1d7bb1d50fb4)? Couldn't that even just be a
> dup()ed fd? Or a fork that immediately exec()s?
>
> After rudely ifdefing out the above if, and reverting 44d282796f81, my
> WIP io_uring using version of postgres appears to pass its tests - which
> are very sparse at this point - again with 5.5-rc7.
We need to cancel work items using the files from this process if it
exits, but I think we should be fine not canceling all work. Especially
since thet setting of IO_WQ_BIT_CANCEL is a one way street... I'm assuming
the below works for you?
diff --git a/fs/io_uring.c b/fs/io_uring.c
index e5b502091804..e3ac2a6ff195 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5044,10 +5044,8 @@ static int io_uring_flush(struct file *file, void *data)
struct io_ring_ctx *ctx = file->private_data;
io_uring_cancel_files(ctx, data);
- if (fatal_signal_pending(current) || (current->flags & PF_EXITING)) {
+ if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
io_cqring_overflow_flush(ctx, true);
- io_wq_cancel_all(ctx->io_wq);
- }
return 0;
}
--
Jens Axboe
Powered by blists - more mailing lists