Date: Sun, 18 Feb 2024 11:13:30 +0800
From: Tong Tiangen <tongtiangen@...wei.com>
To: David Howells <dhowells@...hat.com>, Jens Axboe <axboe@...nel.dk>
CC: Al Viro <viro@...iv.linux.org.uk>, Linus Torvalds
	<torvalds@...ux-foundation.org>, Christoph Hellwig <hch@....de>, Christian
 Brauner <christian@...uner.io>, David Laight <David.Laight@...LAB.COM>,
	Matthew Wilcox <willy@...radead.org>, Jeff Layton <jlayton@...nel.org>,
	<linux-fsdevel@...r.kernel.org>, <linux-block@...r.kernel.org>,
	<linux-mm@...ck.org>, <netdev@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, Kefeng Wang <wangkefeng.wang@...wei.com>
Subject: [bug report] dead loop in generic_perform_write() //Re: [PATCH v7
 07/12] iov_iter: Convert iterate*() to inline funcs

Hi David, Jens:

Recently, I tested the x86 coredump function of a user process on
mainline (6.8-rc1) and found a dead loop (infinite loop) related to this
patch.

Let's discuss it.

1. Test steps:
----------------------------
   a. Start a user process.
   b. Use EINJ to inject a hardware memory error into a page of
      this user process.
   c. Send SIGBUS to this user process.
   d. On receiving the signal, the process dumps core; the coredump
      file is configured to be written to tmpfs.

2. Root cause:
----------------------------
The dead loop occurs in generic_perform_write(). The call path is:

elf_core_dump()
   -> dump_user_range()
     -> dump_emit_page()
       -> iov_iter_bvec()  //iter type set to BVEC
         -> iov_iter_set_copy_mc(&iter);  //support copy mc
           -> __kernel_write_iter()
             -> shmem_file_write_iter()
               -> generic_perform_write()

ssize_t generic_perform_write(...)
{
	[...]
	do {
		[...]
	again:
		//[4]
		if (unlikely(fault_in_iov_iter_readable(i, bytes) == bytes)) {
			status = -EFAULT;
			break;
		}
		//[5]
		if (fatal_signal_pending(current)) {
			status = -EINTR;
			break;
		}

		[...]

		//[1]
		copied = copy_page_from_iter_atomic(page, offset, bytes, i);
		[...]

		//[2]
		status = a_ops->write_end(...);
		if (unlikely(status != copied)) {
			iov_iter_revert(i, copied - max(status, 0L));
			if (unlikely(status < 0))
				break;
		}
		cond_resched();

		if (unlikely(status == 0)) {
			/*
			 * A short copy made ->write_end() reject the
			 * thing entirely.  Might be memory poisoning
			 * halfway through, might be a race with munmap,
			 * might be severe memory pressure.
			 */
			if (copied)
				bytes = copied;
			//[3]
			goto again;
		}
		[...]
	} while (iov_iter_count(i));
	[...]
}

[1] Before this patch:
   copy_page_from_iter_atomic()
     -> iterate_and_advance()
       -> __iterate_and_advance(..., ((void)(K),0))
         -> iterate_bvec macro
           -> left = ((void)(K),0)

With CONFIG_ARCH_HAS_COPY_MC, K() is copy_mc_to_kernel(), which returns
the number of bytes not copied.

Because the macro evaluates K() as ((void)(K),0), that return value is
discarded, so "left" is always 0 even when a memory error occurs during
K(). Therefore, the value of "copied" returned by
copy_page_from_iter_atomic() is not 0, and the loop in
generic_perform_write() ends normally.
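
To illustrate the pre-patch behaviour, here is a minimal userspace
sketch (not kernel code). The names fake_copy_mc() and old_style_copy()
are hypothetical stand-ins for copy_mc_to_kernel() and the old
iterate_bvec macro body; the "poisoned" flag is an assumption used to
simulate a memory error. It only shows how the comma expression throws
away the "bytes not copied" result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for copy_mc_to_kernel(): returns the number of
 * bytes NOT copied. "poisoned" simulates a memory error on the first byte.
 */
static size_t fake_copy_mc(void *dst, const void *src, size_t len, int poisoned)
{
	assert(dst != NULL && src != NULL);
	if (poisoned)
		return len;	/* nothing was copied */
	memcpy(dst, src, len);
	return 0;
}

/*
 * Model of the old macro: the step is wrapped as ((void)(K), 0), so the
 * comma operator discards its result and "left" is always 0.
 */
static size_t old_style_copy(void *dst, const void *src, size_t len, int poisoned)
{
	size_t left = ((void)fake_copy_mc(dst, src, len, poisoned), 0);

	return len - left;	/* always reports len bytes copied */
}
```

In this model, old_style_copy() reports the full length as copied even
when the simulated copy failed, which matches the non-zero "copied"
value described above.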


After this patch:
   copy_page_from_iter_atomic()
     -> iterate_and_advance2()
       -> iterate_bvec()
         -> remain = step()

With CONFIG_ARCH_HAS_COPY_MC, step() is copy_mc_to_kernel(), which
returns the number of bytes not copied.

When a memory error occurs during step(), the value of "remain" is equal
to the value of "part" (not a single byte is copied successfully). In
this case, iterate_bvec() returns 0, and copy_page_from_iter_atomic()
also returns 0. The callback shmem_write_end() [2] also returns 0.
Finally, generic_perform_write() reaches "goto again" [3] and the loop
restarts. Neither [4] nor [5] is triggered to exit the loop, so the dead
loop occurs.
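
The retry behaviour can be modelled in userspace roughly as follows.
This is only a sketch under stated assumptions: model_copy(),
model_write_end() and model_perform_write() are hypothetical names, the
"poisoned" flag simulates a copy that fails on the first byte, and the
max_retries cap exists only so that the model terminates (the real loop
has no such cap).

```c
#include <assert.h>
#include <stddef.h>

/* Model of copy_page_from_iter_atomic() after the patch: bytes copied. */
static size_t model_copy(size_t bytes, int poisoned)
{
	size_t remain = poisoned ? bytes : 0;	/* step(): bytes NOT copied */

	return bytes - remain;
}

/* Model of ->write_end(): accepts exactly what was copied. */
static size_t model_write_end(size_t copied)
{
	return copied;
}

/*
 * Model of the retry logic in generic_perform_write(). Returns how many
 * times "goto again" was taken before the model-only cap was hit.
 */
static unsigned int model_perform_write(size_t bytes, int poisoned,
					unsigned int max_retries)
{
	unsigned int retries = 0;
	size_t count = bytes;		/* models iov_iter_count(i) */

	assert(bytes > 0 && max_retries > 0);
	do {
		size_t copied, status;
again:
		copied = model_copy(bytes, poisoned);
		status = model_write_end(copied);
		if (status == 0) {
			if (copied)
				bytes = copied;
			if (++retries >= max_retries)
				return retries;	/* cap: model only */
			goto again;		/* no progress was made */
		}
		count -= status;
	} while (count);

	return retries;
}
```

With poisoned set, every pass copies 0 bytes, so the model keeps
retrying until it hits the cap; with poisoned clear, it finishes without
any retry.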

Thanks.
Tong


On 2023/9/25 20:03, David Howells wrote:
> Convert the iov_iter iteration macros to inline functions to make the code
> easier to follow.
> 
> The functions are marked __always_inline as we don't want to end up with
> indirect calls in the code.  This, however, leaves dealing with ->copy_mc
> in an awkward situation since the step function (memcpy_from_iter_mc())
> needs to test the flag in the iterator, but isn't passed the iterator.
> This will be dealt with in a follow-up patch.
> 
> The variable names in the per-type iterator functions have been harmonised
> as much as possible and made clearer as to the variable purpose.
> 
> The iterator functions are also moved to a header file so that other
> operations that need to scan over an iterator can be added.  For instance,
> the rbd driver could use this to scan a buffer to see if it is all zeros
> and libceph could use this to generate a crc.
> 
> Signed-off-by: David Howells <dhowells@...hat.com>
> cc: Alexander Viro <viro@...iv.linux.org.uk>
> cc: Jens Axboe <axboe@...nel.dk>
> cc: Christoph Hellwig <hch@....de>
> cc: Christian Brauner <christian@...uner.io>
> cc: Matthew Wilcox <willy@...radead.org>
> cc: Linus Torvalds <torvalds@...ux-foundation.org>
> cc: David Laight <David.Laight@...LAB.COM>
> cc: linux-block@...r.kernel.org
> cc: linux-fsdevel@...r.kernel.org
> cc: linux-mm@...ck.org
> Link: https://lore.kernel.org/r/3710261.1691764329@warthog.procyon.org.uk/ # v1
> Link: https://lore.kernel.org/r/855.1692047347@warthog.procyon.org.uk/ # v2
> Link: https://lore.kernel.org/r/20230816120741.534415-1-dhowells@redhat.com/ # v3
> ---
> 
> Notes:
>      Changes
>      =======
>      ver #5)
>       - Merge in patch to move iteration framework to a header file.
>       - Move "iter->count - progress" into individual iteration subfunctions.
> 
>   include/linux/iov_iter.h | 274 ++++++++++++++++++++++++++
>   lib/iov_iter.c           | 416 ++++++++++++++++-----------------------
>   2 files changed, 449 insertions(+), 241 deletions(-)
>   create mode 100644 include/linux/iov_iter.h
> 
> diff --git a/include/linux/iov_iter.h b/include/linux/iov_iter.h
> new file mode 100644
> index 000000000000..270454a6703d
> --- /dev/null
> +++ b/include/linux/iov_iter.h
> @@ -0,0 +1,274 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/* I/O iterator iteration building functions.
> + *
> + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
> + * Written by David Howells (dhowells@...hat.com)
> + */
> +
> +#ifndef _LINUX_IOV_ITER_H
> +#define _LINUX_IOV_ITER_H
> +
> +#include <linux/uio.h>
> +#include <linux/bvec.h>
> +
> +typedef size_t (*iov_step_f)(void *iter_base, size_t progress, size_t len,
> +			     void *priv, void *priv2);
> +typedef size_t (*iov_ustep_f)(void __user *iter_base, size_t progress, size_t len,
> +			      void *priv, void *priv2);
> +
> +/*
> + * Handle ITER_UBUF.
> + */
> +static __always_inline
> +size_t iterate_ubuf(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> +		    iov_ustep_f step)
> +{
> +	void __user *base = iter->ubuf;
> +	size_t progress = 0, remain;
> +
> +	remain = step(base + iter->iov_offset, 0, len, priv, priv2);
> +	progress = len - remain;
> +	iter->iov_offset += progress;
> +	iter->count -= progress;
> +	return progress;
> +}
> +
> +/*
> + * Handle ITER_IOVEC.
> + */
> +static __always_inline
> +size_t iterate_iovec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> +		     iov_ustep_f step)
> +{
> +	const struct iovec *p = iter->__iov;
> +	size_t progress = 0, skip = iter->iov_offset;
> +
> +	do {
> +		size_t remain, consumed;
> +		size_t part = min(len, p->iov_len - skip);
> +
> +		if (likely(part)) {
> +			remain = step(p->iov_base + skip, progress, part, priv, priv2);
> +			consumed = part - remain;
> +			progress += consumed;
> +			skip += consumed;
> +			len -= consumed;
> +			if (skip < p->iov_len)
> +				break;
> +		}
> +		p++;
> +		skip = 0;
> +	} while (len);
> +
> +	iter->nr_segs -= p - iter->__iov;
> +	iter->__iov = p;
> +	iter->iov_offset = skip;
> +	iter->count -= progress;
> +	return progress;
> +}
> +
> +/*
> + * Handle ITER_KVEC.
> + */
> +static __always_inline
> +size_t iterate_kvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> +		    iov_step_f step)
> +{
> +	const struct kvec *p = iter->kvec;
> +	size_t progress = 0, skip = iter->iov_offset;
> +
> +	do {
> +		size_t remain, consumed;
> +		size_t part = min(len, p->iov_len - skip);
> +
> +		if (likely(part)) {
> +			remain = step(p->iov_base + skip, progress, part, priv, priv2);
> +			consumed = part - remain;
> +			progress += consumed;
> +			skip += consumed;
> +			len -= consumed;
> +			if (skip < p->iov_len)
> +				break;
> +		}
> +		p++;
> +		skip = 0;
> +	} while (len);
> +
> +	iter->nr_segs -= p - iter->kvec;
> +	iter->kvec = p;
> +	iter->iov_offset = skip;
> +	iter->count -= progress;
> +	return progress;
> +}
> +
> +/*
> + * Handle ITER_BVEC.
> + */
> +static __always_inline
> +size_t iterate_bvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> +		    iov_step_f step)
> +{
> +	const struct bio_vec *p = iter->bvec;
> +	size_t progress = 0, skip = iter->iov_offset;
> +
> +	do {
> +		size_t remain, consumed;
> +		size_t offset = p->bv_offset + skip, part;
> +		void *kaddr = kmap_local_page(p->bv_page + offset / PAGE_SIZE);
> +
> +		part = min3(len,
> +			   (size_t)(p->bv_len - skip),
> +			   (size_t)(PAGE_SIZE - offset % PAGE_SIZE));
> +		remain = step(kaddr + offset % PAGE_SIZE, progress, part, priv, priv2);
> +		kunmap_local(kaddr);
> +		consumed = part - remain;
> +		len -= consumed;
> +		progress += consumed;
> +		skip += consumed;
> +		if (skip >= p->bv_len) {
> +			skip = 0;
> +			p++;
> +		}
> +		if (remain)
> +			break;
> +	} while (len);
> +
> +	iter->nr_segs -= p - iter->bvec;
> +	iter->bvec = p;
> +	iter->iov_offset = skip;
> +	iter->count -= progress;
> +	return progress;
> +}
> +
> +/*
> + * Handle ITER_XARRAY.
> + */
> +static __always_inline
> +size_t iterate_xarray(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> +		      iov_step_f step)
> +{
> +	struct folio *folio;
> +	size_t progress = 0;
> +	loff_t start = iter->xarray_start + iter->iov_offset;
> +	pgoff_t index = start / PAGE_SIZE;
> +	XA_STATE(xas, iter->xarray, index);
> +
> +	rcu_read_lock();
> +	xas_for_each(&xas, folio, ULONG_MAX) {
> +		size_t remain, consumed, offset, part, flen;
> +
> +		if (xas_retry(&xas, folio))
> +			continue;
> +		if (WARN_ON(xa_is_value(folio)))
> +			break;
> +		if (WARN_ON(folio_test_hugetlb(folio)))
> +			break;
> +
> +		offset = offset_in_folio(folio, start + progress);
> +		flen = min(folio_size(folio) - offset, len);
> +
> +		while (flen) {
> +			void *base = kmap_local_folio(folio, offset);
> +
> +			part = min_t(size_t, flen,
> +				     PAGE_SIZE - offset_in_page(offset));
> +			remain = step(base, progress, part, priv, priv2);
> +			kunmap_local(base);
> +
> +			consumed = part - remain;
> +			progress += consumed;
> +			len -= consumed;
> +
> +			if (remain || len == 0)
> +				goto out;
> +			flen -= consumed;
> +			offset += consumed;
> +		}
> +	}
> +
> +out:
> +	rcu_read_unlock();
> +	iter->iov_offset += progress;
> +	iter->count -= progress;
> +	return progress;
> +}
> +
> +/*
> + * Handle ITER_DISCARD.
> + */
> +static __always_inline
> +size_t iterate_discard(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> +		      iov_step_f step)
> +{
> +	size_t progress = len;
> +
> +	iter->count -= progress;
> +	return progress;
> +}
> +
> +/**
> + * iterate_and_advance2 - Iterate over an iterator
> + * @iter: The iterator to iterate over.
> + * @len: The amount to iterate over.
> + * @priv: Data for the step functions.
> + * @priv2: More data for the step functions.
> + * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
> + * @step: Function for other iterators; given kernel addresses.
> + *
> + * Iterate over the next part of an iterator, up to the specified length.  The
> + * buffer is presented in segments, which for kernel iteration are broken up by
> + * physical pages and mapped, with the mapped address being presented.
> + *
> + * Two step functions, @step and @ustep, must be provided, one for handling
> + * mapped kernel addresses and the other is given user addresses which have the
> + * potential to fault since no pinning is performed.
> + *
> + * The step functions are passed the address and length of the segment, @priv,
> + * @priv2 and the amount of data so far iterated over (which can, for example,
> + * be added to @priv to point to the right part of a second buffer).  The step
> + * functions should return the amount of the segment they didn't process (ie. 0
> + * indicates complete processing).
> + *
> + * This function returns the amount of data processed (ie. 0 means nothing was
> + * processed and the value of @len means processed to completion).
> + */
> +static __always_inline
> +size_t iterate_and_advance2(struct iov_iter *iter, size_t len, void *priv,
> +			    void *priv2, iov_ustep_f ustep, iov_step_f step)
> +{
> +	if (unlikely(iter->count < len))
> +		len = iter->count;
> +	if (unlikely(!len))
> +		return 0;
> +
> +	if (likely(iter_is_ubuf(iter)))
> +		return iterate_ubuf(iter, len, priv, priv2, ustep);
> +	if (likely(iter_is_iovec(iter)))
> +		return iterate_iovec(iter, len, priv, priv2, ustep);
> +	if (iov_iter_is_bvec(iter))
> +		return iterate_bvec(iter, len, priv, priv2, step);
> +	if (iov_iter_is_kvec(iter))
> +		return iterate_kvec(iter, len, priv, priv2, step);
> +	if (iov_iter_is_xarray(iter))
> +		return iterate_xarray(iter, len, priv, priv2, step);
> +	return iterate_discard(iter, len, priv, priv2, step);
> +}
> +
> +/**
> + * iterate_and_advance - Iterate over an iterator
> + * @iter: The iterator to iterate over.
> + * @len: The amount to iterate over.
> + * @priv: Data for the step functions.
> + * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
> + * @step: Function for other iterators; given kernel addresses.
> + *
> + * As iterate_and_advance2(), but priv2 is always NULL.
> + */
> +static __always_inline
> +size_t iterate_and_advance(struct iov_iter *iter, size_t len, void *priv,
> +			   iov_ustep_f ustep, iov_step_f step)
> +{
> +	return iterate_and_advance2(iter, len, priv, NULL, ustep, step);
> +}
> +
> +#endif /* _LINUX_IOV_ITER_H */
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 227c9f536b94..65374ee91ecd 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -13,189 +13,69 @@
>   #include <net/checksum.h>
>   #include <linux/scatterlist.h>
>   #include <linux/instrumented.h>
> +#include <linux/iov_iter.h>
>   
> -/* covers ubuf and kbuf alike */
> -#define iterate_buf(i, n, base, len, off, __p, STEP) {		\
> -	size_t __maybe_unused off = 0;				\
> -	len = n;						\
> -	base = __p + i->iov_offset;				\
> -	len -= (STEP);						\
> -	i->iov_offset += len;					\
> -	n = len;						\
> -}
> -
> -/* covers iovec and kvec alike */
> -#define iterate_iovec(i, n, base, len, off, __p, STEP) {	\
> -	size_t off = 0;						\
> -	size_t skip = i->iov_offset;				\
> -	do {							\
> -		len = min(n, __p->iov_len - skip);		\
> -		if (likely(len)) {				\
> -			base = __p->iov_base + skip;		\
> -			len -= (STEP);				\
> -			off += len;				\
> -			skip += len;				\
> -			n -= len;				\
> -			if (skip < __p->iov_len)		\
> -				break;				\
> -		}						\
> -		__p++;						\
> -		skip = 0;					\
> -	} while (n);						\
> -	i->iov_offset = skip;					\
> -	n = off;						\
> -}
> -
> -#define iterate_bvec(i, n, base, len, off, p, STEP) {		\
> -	size_t off = 0;						\
> -	unsigned skip = i->iov_offset;				\
> -	while (n) {						\
> -		unsigned offset = p->bv_offset + skip;		\
> -		unsigned left;					\
> -		void *kaddr = kmap_local_page(p->bv_page +	\
> -					offset / PAGE_SIZE);	\
> -		base = kaddr + offset % PAGE_SIZE;		\
> -		len = min(min(n, (size_t)(p->bv_len - skip)),	\
> -		     (size_t)(PAGE_SIZE - offset % PAGE_SIZE));	\
> -		left = (STEP);					\
> -		kunmap_local(kaddr);				\
> -		len -= left;					\
> -		off += len;					\
> -		skip += len;					\
> -		if (skip == p->bv_len) {			\
> -			skip = 0;				\
> -			p++;					\
> -		}						\
> -		n -= len;					\
> -		if (left)					\
> -			break;					\
> -	}							\
> -	i->iov_offset = skip;					\
> -	n = off;						\
> -}
> -
> -#define iterate_xarray(i, n, base, len, __off, STEP) {		\
> -	__label__ __out;					\
> -	size_t __off = 0;					\
> -	struct folio *folio;					\
> -	loff_t start = i->xarray_start + i->iov_offset;		\
> -	pgoff_t index = start / PAGE_SIZE;			\
> -	XA_STATE(xas, i->xarray, index);			\
> -								\
> -	len = PAGE_SIZE - offset_in_page(start);		\
> -	rcu_read_lock();					\
> -	xas_for_each(&xas, folio, ULONG_MAX) {			\
> -		unsigned left;					\
> -		size_t offset;					\
> -		if (xas_retry(&xas, folio))			\
> -			continue;				\
> -		if (WARN_ON(xa_is_value(folio)))		\
> -			break;					\
> -		if (WARN_ON(folio_test_hugetlb(folio)))		\
> -			break;					\
> -		offset = offset_in_folio(folio, start + __off);	\
> -		while (offset < folio_size(folio)) {		\
> -			base = kmap_local_folio(folio, offset);	\
> -			len = min(n, len);			\
> -			left = (STEP);				\
> -			kunmap_local(base);			\
> -			len -= left;				\
> -			__off += len;				\
> -			n -= len;				\
> -			if (left || n == 0)			\
> -				goto __out;			\
> -			offset += len;				\
> -			len = PAGE_SIZE;			\
> -		}						\
> -	}							\
> -__out:								\
> -	rcu_read_unlock();					\
> -	i->iov_offset += __off;					\
> -	n = __off;						\
> -}
> -
> -#define __iterate_and_advance(i, n, base, len, off, I, K) {	\
> -	if (unlikely(i->count < n))				\
> -		n = i->count;					\
> -	if (likely(n)) {					\
> -		if (likely(iter_is_ubuf(i))) {			\
> -			void __user *base;			\
> -			size_t len;				\
> -			iterate_buf(i, n, base, len, off,	\
> -						i->ubuf, (I)) 	\
> -		} else if (likely(iter_is_iovec(i))) {		\
> -			const struct iovec *iov = iter_iov(i);	\
> -			void __user *base;			\
> -			size_t len;				\
> -			iterate_iovec(i, n, base, len, off,	\
> -						iov, (I))	\
> -			i->nr_segs -= iov - iter_iov(i);	\
> -			i->__iov = iov;				\
> -		} else if (iov_iter_is_bvec(i)) {		\
> -			const struct bio_vec *bvec = i->bvec;	\
> -			void *base;				\
> -			size_t len;				\
> -			iterate_bvec(i, n, base, len, off,	\
> -						bvec, (K))	\
> -			i->nr_segs -= bvec - i->bvec;		\
> -			i->bvec = bvec;				\
> -		} else if (iov_iter_is_kvec(i)) {		\
> -			const struct kvec *kvec = i->kvec;	\
> -			void *base;				\
> -			size_t len;				\
> -			iterate_iovec(i, n, base, len, off,	\
> -						kvec, (K))	\
> -			i->nr_segs -= kvec - i->kvec;		\
> -			i->kvec = kvec;				\
> -		} else if (iov_iter_is_xarray(i)) {		\
> -			void *base;				\
> -			size_t len;				\
> -			iterate_xarray(i, n, base, len, off,	\
> -							(K))	\
> -		}						\
> -		i->count -= n;					\
> -	}							\
> -}
> -#define iterate_and_advance(i, n, base, len, off, I, K) \
> -	__iterate_and_advance(i, n, base, len, off, I, ((void)(K),0))
> -
> -static int copyout(void __user *to, const void *from, size_t n)
> +static __always_inline
> +size_t copy_to_user_iter(void __user *iter_to, size_t progress,
> +			 size_t len, void *from, void *priv2)
>   {
>   	if (should_fail_usercopy())
> -		return n;
> -	if (access_ok(to, n)) {
> -		instrument_copy_to_user(to, from, n);
> -		n = raw_copy_to_user(to, from, n);
> +		return len;
> +	if (access_ok(iter_to, len)) {
> +		from += progress;
> +		instrument_copy_to_user(iter_to, from, len);
> +		len = raw_copy_to_user(iter_to, from, len);
>   	}
> -	return n;
> +	return len;
>   }
>   
> -static int copyout_nofault(void __user *to, const void *from, size_t n)
> +static __always_inline
> +size_t copy_to_user_iter_nofault(void __user *iter_to, size_t progress,
> +				 size_t len, void *from, void *priv2)
>   {
> -	long res;
> +	ssize_t res;
>   
>   	if (should_fail_usercopy())
> -		return n;
> -
> -	res = copy_to_user_nofault(to, from, n);
> +		return len;
>   
> -	return res < 0 ? n : res;
> +	from += progress;
> +	res = copy_to_user_nofault(iter_to, from, len);
> +	return res < 0 ? len : res;
>   }
>   
> -static int copyin(void *to, const void __user *from, size_t n)
> +static __always_inline
> +size_t copy_from_user_iter(void __user *iter_from, size_t progress,
> +			   size_t len, void *to, void *priv2)
>   {
> -	size_t res = n;
> +	size_t res = len;
>   
>   	if (should_fail_usercopy())
> -		return n;
> -	if (access_ok(from, n)) {
> -		instrument_copy_from_user_before(to, from, n);
> -		res = raw_copy_from_user(to, from, n);
> -		instrument_copy_from_user_after(to, from, n, res);
> +		return len;
> +	if (access_ok(iter_from, len)) {
> +		to += progress;
> +		instrument_copy_from_user_before(to, iter_from, len);
> +		res = raw_copy_from_user(to, iter_from, len);
> +		instrument_copy_from_user_after(to, iter_from, len, res);
>   	}
>   	return res;
>   }
>   
> +static __always_inline
> +size_t memcpy_to_iter(void *iter_to, size_t progress,
> +		      size_t len, void *from, void *priv2)
> +{
> +	memcpy(iter_to, from + progress, len);
> +	return 0;
> +}
> +
> +static __always_inline
> +size_t memcpy_from_iter(void *iter_from, size_t progress,
> +			size_t len, void *to, void *priv2)
> +{
> +	memcpy(to + progress, iter_from, len);
> +	return 0;
> +}
> +
>   /*
>    * fault_in_iov_iter_readable - fault in iov iterator for reading
>    * @i: iterator
> @@ -312,23 +192,29 @@ size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
>   		return 0;
>   	if (user_backed_iter(i))
>   		might_fault();
> -	iterate_and_advance(i, bytes, base, len, off,
> -		copyout(base, addr + off, len),
> -		memcpy(base, addr + off, len)
> -	)
> -
> -	return bytes;
> +	return iterate_and_advance(i, bytes, (void *)addr,
> +				   copy_to_user_iter, memcpy_to_iter);
>   }
>   EXPORT_SYMBOL(_copy_to_iter);
>   
>   #ifdef CONFIG_ARCH_HAS_COPY_MC
> -static int copyout_mc(void __user *to, const void *from, size_t n)
> -{
> -	if (access_ok(to, n)) {
> -		instrument_copy_to_user(to, from, n);
> -		n = copy_mc_to_user((__force void *) to, from, n);
> +static __always_inline
> +size_t copy_to_user_iter_mc(void __user *iter_to, size_t progress,
> +			    size_t len, void *from, void *priv2)
> +{
> +	if (access_ok(iter_to, len)) {
> +		from += progress;
> +		instrument_copy_to_user(iter_to, from, len);
> +		len = copy_mc_to_user(iter_to, from, len);
>   	}
> -	return n;
> +	return len;
> +}
> +
> +static __always_inline
> +size_t memcpy_to_iter_mc(void *iter_to, size_t progress,
> +			 size_t len, void *from, void *priv2)
> +{
> +	return copy_mc_to_kernel(iter_to, from + progress, len);
>   }
>   
>   /**
> @@ -361,22 +247,20 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
>   		return 0;
>   	if (user_backed_iter(i))
>   		might_fault();
> -	__iterate_and_advance(i, bytes, base, len, off,
> -		copyout_mc(base, addr + off, len),
> -		copy_mc_to_kernel(base, addr + off, len)
> -	)
> -
> -	return bytes;
> +	return iterate_and_advance(i, bytes, (void *)addr,
> +				   copy_to_user_iter_mc, memcpy_to_iter_mc);
>   }
>   EXPORT_SYMBOL_GPL(_copy_mc_to_iter);
>   #endif /* CONFIG_ARCH_HAS_COPY_MC */
>   
> -static void *memcpy_from_iter(struct iov_iter *i, void *to, const void *from,
> -				 size_t size)
> +static size_t memcpy_from_iter_mc(void *iter_from, size_t progress,
> +				  size_t len, void *to, void *priv2)
>   {
> -	if (iov_iter_is_copy_mc(i))
> -		return (void *)copy_mc_to_kernel(to, from, size);
> -	return memcpy(to, from, size);
> +	struct iov_iter *iter = priv2;
> +
> +	if (iov_iter_is_copy_mc(iter))
> +		return copy_mc_to_kernel(to + progress, iter_from, len);
> +	return memcpy_from_iter(iter_from, progress, len, to, priv2);
>   }
>   
>   size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
> @@ -386,30 +270,46 @@ size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
>   
>   	if (user_backed_iter(i))
>   		might_fault();
> -	iterate_and_advance(i, bytes, base, len, off,
> -		copyin(addr + off, base, len),
> -		memcpy_from_iter(i, addr + off, base, len)
> -	)
> -
> -	return bytes;
> +	return iterate_and_advance2(i, bytes, addr, i,
> +				    copy_from_user_iter,
> +				    memcpy_from_iter_mc);
>   }
>   EXPORT_SYMBOL(_copy_from_iter);
>   
> +static __always_inline
> +size_t copy_from_user_iter_nocache(void __user *iter_from, size_t progress,
> +				   size_t len, void *to, void *priv2)
> +{
> +	return __copy_from_user_inatomic_nocache(to + progress, iter_from, len);
> +}
> +
>   size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
>   {
>   	if (WARN_ON_ONCE(!i->data_source))
>   		return 0;
>   
> -	iterate_and_advance(i, bytes, base, len, off,
> -		__copy_from_user_inatomic_nocache(addr + off, base, len),
> -		memcpy(addr + off, base, len)
> -	)
> -
> -	return bytes;
> +	return iterate_and_advance(i, bytes, addr,
> +				   copy_from_user_iter_nocache,
> +				   memcpy_from_iter);
>   }
>   EXPORT_SYMBOL(_copy_from_iter_nocache);
>   
>   #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
> +static __always_inline
> +size_t copy_from_user_iter_flushcache(void __user *iter_from, size_t progress,
> +				      size_t len, void *to, void *priv2)
> +{
> +	return __copy_from_user_flushcache(to + progress, iter_from, len);
> +}
> +
> +static __always_inline
> +size_t memcpy_from_iter_flushcache(void *iter_from, size_t progress,
> +				   size_t len, void *to, void *priv2)
> +{
> +	memcpy_flushcache(to + progress, iter_from, len);
> +	return 0;
> +}
> +
>   /**
>    * _copy_from_iter_flushcache - write destination through cpu cache
>    * @addr: destination kernel address
> @@ -431,12 +331,9 @@ size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
>   	if (WARN_ON_ONCE(!i->data_source))
>   		return 0;
>   
> -	iterate_and_advance(i, bytes, base, len, off,
> -		__copy_from_user_flushcache(addr + off, base, len),
> -		memcpy_flushcache(addr + off, base, len)
> -	)
> -
> -	return bytes;
> +	return iterate_and_advance(i, bytes, addr,
> +				   copy_from_user_iter_flushcache,
> +				   memcpy_from_iter_flushcache);
>   }
>   EXPORT_SYMBOL_GPL(_copy_from_iter_flushcache);
>   #endif
> @@ -508,10 +405,9 @@ size_t copy_page_to_iter_nofault(struct page *page, unsigned offset, size_t byte
>   		void *kaddr = kmap_local_page(page);
>   		size_t n = min(bytes, (size_t)PAGE_SIZE - offset);
>   
> -		iterate_and_advance(i, n, base, len, off,
> -			copyout_nofault(base, kaddr + offset + off, len),
> -			memcpy(base, kaddr + offset + off, len)
> -		)
> +		n = iterate_and_advance(i, bytes, kaddr,
> +					copy_to_user_iter_nofault,
> +					memcpy_to_iter);
>   		kunmap_local(kaddr);
>   		res += n;
>   		bytes -= n;
> @@ -554,14 +450,25 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
>   }
>   EXPORT_SYMBOL(copy_page_from_iter);
>   
> -size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
> +static __always_inline
> +size_t zero_to_user_iter(void __user *iter_to, size_t progress,
> +			 size_t len, void *priv, void *priv2)
>   {
> -	iterate_and_advance(i, bytes, base, len, count,
> -		clear_user(base, len),
> -		memset(base, 0, len)
> -	)
> +	return clear_user(iter_to, len);
> +}
>   
> -	return bytes;
> +static __always_inline
> +size_t zero_to_iter(void *iter_to, size_t progress,
> +		    size_t len, void *priv, void *priv2)
> +{
> +	memset(iter_to, 0, len);
> +	return 0;
> +}
> +
> +size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
> +{
> +	return iterate_and_advance(i, bytes, NULL,
> +				   zero_to_user_iter, zero_to_iter);
>   }
>   EXPORT_SYMBOL(iov_iter_zero);
>   
> @@ -586,10 +493,9 @@ size_t copy_page_from_iter_atomic(struct page *page, size_t offset,
>   		}
>   
>   		p = kmap_atomic(page) + offset;
> -		iterate_and_advance(i, n, base, len, off,
> -			copyin(p + off, base, len),
> -			memcpy_from_iter(i, p + off, base, len)
> -		)
> +		n = iterate_and_advance2(i, n, p, i,
> +					 copy_from_user_iter,
> +					 memcpy_from_iter_mc);
>   		kunmap_atomic(p);
>   		copied += n;
>   		offset += n;
> @@ -1180,32 +1086,64 @@ ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i,
>   }
>   EXPORT_SYMBOL(iov_iter_get_pages_alloc2);
>   
> +static __always_inline
> +size_t copy_from_user_iter_csum(void __user *iter_from, size_t progress,
> +				size_t len, void *to, void *priv2)
> +{
> +	__wsum next, *csum = priv2;
> +
> +	next = csum_and_copy_from_user(iter_from, to + progress, len);
> +	*csum = csum_block_add(*csum, next, progress);
> +	return next ? 0 : len;
> +}
> +
> +static __always_inline
> +size_t memcpy_from_iter_csum(void *iter_from, size_t progress,
> +			     size_t len, void *to, void *priv2)
> +{
> +	__wsum *csum = priv2;
> +
> +	*csum = csum_and_memcpy(to + progress, iter_from, len, *csum, progress);
> +	return 0;
> +}
> +
>   size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
>   			       struct iov_iter *i)
>   {
> -	__wsum sum, next;
> -	sum = *csum;
>   	if (WARN_ON_ONCE(!i->data_source))
>   		return 0;
> -
> -	iterate_and_advance(i, bytes, base, len, off, ({
> -		next = csum_and_copy_from_user(base, addr + off, len);
> -		sum = csum_block_add(sum, next, off);
> -		next ? 0 : len;
> -	}), ({
> -		sum = csum_and_memcpy(addr + off, base, len, sum, off);
> -	})
> -	)
> -	*csum = sum;
> -	return bytes;
> +	return iterate_and_advance2(i, bytes, addr, csum,
> +				    copy_from_user_iter_csum,
> +				    memcpy_from_iter_csum);
>   }
>   EXPORT_SYMBOL(csum_and_copy_from_iter);
>   
> +static __always_inline
> +size_t copy_to_user_iter_csum(void __user *iter_to, size_t progress,
> +			      size_t len, void *from, void *priv2)
> +{
> +	__wsum next, *csum = priv2;
> +
> +	next = csum_and_copy_to_user(from + progress, iter_to, len);
> +	*csum = csum_block_add(*csum, next, progress);
> +	return next ? 0 : len;
> +}
> +
> +static __always_inline
> +size_t memcpy_to_iter_csum(void *iter_to, size_t progress,
> +			   size_t len, void *from, void *priv2)
> +{
> +	__wsum *csum = priv2;
> +
> +	*csum = csum_and_memcpy(iter_to, from + progress, len, *csum, progress);
> +	return 0;
> +}
> +
>   size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate,
>   			     struct iov_iter *i)
>   {
>   	struct csum_state *csstate = _csstate;
> -	__wsum sum, next;
> +	__wsum sum;
>   
>   	if (WARN_ON_ONCE(i->data_source))
>   		return 0;
> @@ -1219,14 +1157,10 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate,
>   	}
>   
>   	sum = csum_shift(csstate->csum, csstate->off);
> -	iterate_and_advance(i, bytes, base, len, off, ({
> -		next = csum_and_copy_to_user(addr + off, base, len);
> -		sum = csum_block_add(sum, next, off);
> -		next ? 0 : len;
> -	}), ({
> -		sum = csum_and_memcpy(base, addr + off, len, sum, off);
> -	})
> -	)
> +	
> +	bytes = iterate_and_advance2(i, bytes, (void *)addr, &sum,
> +				     copy_to_user_iter_csum,
> +				     memcpy_to_iter_csum);
>   	csstate->csum = csum_shift(sum, csstate->off);
>   	csstate->off += bytes;
>   	return bytes;
> 
> 
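
For reference, the step-function contract described in the
iterate_and_advance2() comment in the quoted patch (a step returns the
amount it did NOT process, and the iterator stops at the first short
step) can be sketched in userspace as below. This is not the kernel
implementation: struct seg, step_fn, walk_segs() and all_zero_step() are
hypothetical names, and the "all zeros" scan is borrowed from the use
case mentioned in the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct kvec. */
struct seg {
	const unsigned char *base;
	size_t len;
};

/* Modelled on iov_step_f: returns the number of bytes NOT processed. */
typedef size_t (*step_fn)(const void *base, size_t progress, size_t len,
			  void *priv);

/* Walk the segments; stop at the first short step. Returns bytes processed. */
static size_t walk_segs(const struct seg *segs, size_t nsegs, size_t len,
			void *priv, step_fn step)
{
	size_t progress = 0;

	for (size_t i = 0; i < nsegs && len; i++) {
		size_t part = segs[i].len < len ? segs[i].len : len;
		size_t remain;

		if (!part)
			continue;
		remain = step(segs[i].base, progress, part, priv);
		assert(remain <= part);
		progress += part - remain;
		len -= part - remain;
		if (remain)
			break;
	}
	return progress;
}

/* Example step: stops at the first non-zero byte in the segment. */
static size_t all_zero_step(const void *base, size_t progress, size_t len,
			    void *priv)
{
	const unsigned char *p = base;

	(void)progress;
	(void)priv;
	for (size_t i = 0; i < len; i++)
		if (p[i])
			return len - i;	/* bytes left unprocessed */
	return 0;
}
```

A caller can then compare the return value of walk_segs() with the
requested length: a smaller value means a step stopped early, and a
return of 0 means no progress at all, which is the case that matters for
the loop discussed in this report.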
