Message-ID: <20200126101207.oqovstqfr4iddc3p@alap3.anarazel.de>
Date:   Sun, 26 Jan 2020 02:12:07 -0800
From:   Andres Freund <andres@...razel.de>
To:     Jens Axboe <axboe@...nel.dk>
Cc:     linux-block@...r.kernel.org, io-uring <io-uring@...r.kernel.org>,
        davem@...emloft.net, netdev@...r.kernel.org, jannh@...gle.com
Subject: Re: [PATCH 2/4] io_uring: io_uring: add support for async work
 inheriting files

Hi,

On 2019-10-25 11:30:35 -0600, Jens Axboe wrote:
> This is in preparation for adding opcodes that need to add new files
> in a process file table, system calls like open(2) or accept4(2).
> 
> If an opcode needs this, it must set IO_WQ_WORK_NEEDS_FILES in the work
> item. If work that needs to get punted to async context have this
> set, the async worker will assume the original task file table before
> executing the work.
> 
> Note that opcodes that need access to the current files of an
> application cannot be done through IORING_SETUP_SQPOLL.

Unfortunately this partially breaks sharing a uring with forked-off
processes, even though it initially appears to work:

> +static int io_uring_flush(struct file *file, void *data)
> +{
> +	struct io_ring_ctx *ctx = file->private_data;
> +
> +	io_uring_cancel_files(ctx, data);
> +	if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
> +		io_wq_cancel_all(ctx->io_wq);
> +	return 0;
> +}

Once one process that has the uring fd open (even if it were just a fork
never touching the uring, I believe) exits, this prevents the uring from
being usable for any async tasks. The process exiting closes the fd,
which triggers flush. io_wq_cancel_all() sets IO_WQ_BIT_CANCEL, which
never gets unset, which causes all future async sqes to be
immediately returned as -ECANCELED by the worker, via io_req_cancelled.

It's not clear to me why a close() should cancel the wq (nor clear
the entire backlog, after 1d7bb1d50fb4)? Couldn't that even just be a
dup()ed fd? Or a fork that immediately exec()s?

After rudely ifdefing out the above if, and reverting 44d282796f81, my
WIP io_uring-using version of postgres appears to pass its tests (which
are very sparse at this point) again with 5.5-rc7.

Greetings,

Andres Freund