linux-kernel - Re: [PATCH v3 5/7] iomap: Support restarting direct I/O requests after user copy failures

Message-ID: <20210726171940.GM20621@quack2.suse.cz>
Date:   Mon, 26 Jul 2021 19:19:40 +0200
From:   Jan Kara <jack@...e.cz>
To:     Andreas Gruenbacher <agruenba@...hat.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Christoph Hellwig <hch@...radead.org>,
        "Darrick J. Wong" <djwong@...nel.org>, Jan Kara <jack@...e.cz>,
        Matthew Wilcox <willy@...radead.org>, cluster-devel@...hat.com,
        linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
        ocfs2-devel@....oracle.com
Subject: Re: [PATCH v3 5/7] iomap: Support restarting direct I/O requests
 after user copy failures

On Fri 23-07-21 22:58:38, Andreas Gruenbacher wrote:
> In __iomap_dio_rw, when iomap_apply returns an -EFAULT error, complete the
> request synchronously and reset the iterator to the start position.  This
> allows callers to deal with the failure and retry the operation.
> 
> In gfs2, we need to disable page faults while we're holding glocks to prevent
> deadlocks.  This patch is the minimum solution I could find to make
> iomap_dio_rw work with page faults disabled.  It's still expensive because any
> I/O that was carried out before hitting -EFAULT needs to be retried.
> 
> A possible improvement would be to add an IOMAP_DIO_FAULT_RETRY or similar flag
> that would allow iomap_dio_rw to return a short result when hitting -EFAULT.
> Callers could then retry only the rest of the request after dealing with the
> page fault.
> 
> Asynchronous requests turn into synchronous requests up to the point of the
> page fault in any case, but they could be retried asynchronously after dealing
> with the page fault.  To make that work, the completion notification would have
> to include the bytes read or written before the page fault(s) as well, and we'd
> need an additional iomap_dio_rw argument for that.
> 
> Signed-off-by: Andreas Gruenbacher <agruenba@...hat.com>
> ---
>  fs/iomap/direct-io.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index cc0b4bc8861b..b0a494211bb4 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -561,6 +561,15 @@ __iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter,
>  		ret = iomap_apply(inode, pos, count, iomap_flags, ops, dio,
>  				iomap_dio_actor);
>  		if (ret <= 0) {
> +			if (ret == -EFAULT) {
> +				/*
> +				 * To allow retrying the request, fail
> +				 * synchronously and reset the iterator.
> +				 */
> +				wait_for_completion = true;
> +				iov_iter_revert(dio->submit.iter, dio->size);
> +			}
> +

Hum, OK, but this means that if userspace submits a large enough write, GFS2
will livelock trying to complete it? While other filesystems can just
submit multiple smaller bios constructed in iomap_apply() (paging in
different parts of the buffer) and thus complete the write?

								Honza
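
To make the concern above concrete, here is a toy userspace model, not kernel
code. It assumes memory pressure keeps only `window` pages of the user buffer
resident at a time. Every name in it (user_buf, dio_attempt, fault_in,
run_restart, run_short) is hypothetical and exists only for illustration.
run_restart models the patch's behaviour of discarding progress on -EFAULT;
run_short models the short-result alternative mentioned in the patch
description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PAGES 64

/* Toy user buffer: resident[i] says whether page i is currently in memory. */
struct user_buf {
	size_t npages;
	bool resident[MAX_PAGES];
};

/*
 * One I/O attempt with page faults disabled, starting at page `from`.
 * Returns how many pages were transferred before the first missing page.
 */
size_t dio_attempt(const struct user_buf *buf, size_t from)
{
	size_t done = 0;

	for (size_t i = from; i < buf->npages && buf->resident[i]; i++)
		done++;
	return done;
}

/*
 * Fault in pages [from, from + window) and evict everything else.
 * The eviction is the modelling assumption that stands in for memory pressure.
 */
void fault_in(struct user_buf *buf, size_t from, size_t window)
{
	assert(buf->npages <= MAX_PAGES);
	assert(window > 0);
	for (size_t i = 0; i < buf->npages; i++)
		buf->resident[i] = (i >= from && i < from + window);
}

/*
 * Restart-from-scratch: any shortfall throws away all progress and the
 * iterator goes back to the start. Returns the attempt count, or -1 if
 * max_tries attempts were not enough.
 */
int run_restart(struct user_buf *buf, size_t window, int max_tries)
{
	for (int tries = 1; tries <= max_tries; tries++) {
		fault_in(buf, 0, window);
		if (dio_attempt(buf, 0) == buf->npages)
			return tries;
	}
	return -1;
}

/*
 * Short-result variant: keep what was transferred and retry only the rest.
 * Returns the attempt count, or -1 if max_tries attempts were not enough.
 */
int run_short(struct user_buf *buf, size_t window, int max_tries)
{
	size_t pos = 0;

	for (int tries = 1; tries <= max_tries; tries++) {
		fault_in(buf, pos, window);
		pos += dio_attempt(buf, pos);
		if (pos == buf->npages)
			return tries;
	}
	return -1;
}
```

Under these assumptions, a 16-page buffer with a 4-page window never completes
in run_restart no matter how many attempts are allowed, while run_short
finishes in four attempts. Whether real memory pressure behaves this way for a
given write size is exactly the open question.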

>  			/* magic error code to fall back to buffered I/O */
>  			if (ret == -ENOTBLK) {
>  				wait_for_completion = true;
> -- 
> 2.26.3
> 
-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR