[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4e80924d-9c85-f13a-722a-6a5d2b1c225a@huawei.com>
Date: Sun, 18 Feb 2024 11:13:30 +0800
From: Tong Tiangen <tongtiangen@...wei.com>
To: David Howells <dhowells@...hat.com>, Jens Axboe <axboe@...nel.dk>
CC: Al Viro <viro@...iv.linux.org.uk>, Linus Torvalds
<torvalds@...ux-foundation.org>, Christoph Hellwig <hch@....de>, Christian
Brauner <christian@...uner.io>, David Laight <David.Laight@...LAB.COM>,
Matthew Wilcox <willy@...radead.org>, Jeff Layton <jlayton@...nel.org>,
<linux-fsdevel@...r.kernel.org>, <linux-block@...r.kernel.org>,
<linux-mm@...ck.org>, <netdev@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, Kefeng Wang <wangkefeng.wang@...wei.com>
Subject: [bug report] dead loop in generic_perform_write() //Re: [PATCH v7
07/12] iov_iter: Convert iterate*() to inline funcs
Hi David, Jens:
Recently, I tested the x86 coredump function of the user process in the
mainline (6.8-rc1) and found an deadloop issue related to this patch.
Let's discuss it.
1. Test step:
----------------------------
a. Start a user process.
b. Use EINJ to inject a hardware memory error into a page of
the this user process.
c. Send SIGBUS to this user process.
d. After receiving the signal, a coredump file is configured to be
written to tmpfs.
2. Root cause:
----------------------------
The deadloop occurs in generic_perform_write(), the call path:
elf_core_dump()
-> dump_user_range()
-> dump_emit_page()
-> iov_iter_bvec() //iter type set to BVEC
-> iov_iter_set_copy_mc(&iter); //support copy mc
-> __kernel_write_iter()
-> shmem_file_write_iter()
-> generic_perform_write()
ssize_t generic_perform_write(...)
{
[...]
do {
[...]
again:
//[4]
if (unlikely(fault_in_iov_iter_readable(i, bytes) ==
bytes)) {
status = -EFAULT;
break;
}
//[5]
if (fatal_signal_pending(current)) {
status = -EINTR;
break;
}
[...]
//[1]
copied = copy_page_from_iter_atomic(page, offset, bytes,
i);
[...]
//[2]
status = a_ops->write_end(...);
if (unlikely(status != copied)) {
iov_iter_revert(i, copied - max(status, 0L));
if (unlikely(status < 0))
break;
}
cond_resched();
if (unlikely(status == 0)) {
/*
* A short copy made ->write_end() reject the
* thing entirely. Might be memory poisoning
* halfway through, might be a race with munmap,
* might be severe memory pressure.
*/
if (copied)
bytes = copied;
//----[3]
goto again;
}
[...]
} while (iov_iter_count(i));
[...]
}
[1]Before this patch:
copy_page_from_iter_atomic()
-> iterate_and_advance()
-> __iterate_and_advance(..., ((void)(K),0))
->iterate_bvec macro
-> left = ((void)(K),0)
With CONFIG_ARCH_HAS_COPY_MC, the K() is copy_mc_to_kernel() which
return "bytes not copied".
When a memory error occurs during K(), the value of "left" must be 0.
Therefore, the value of "copied" returned by
copy_page_from_iter_atomic() is not 0, and the loop of
generic_perform_write() can be ended normally.
After this patch:
copy_page_from_iter_atomic()
-> iterate_and_advance2()
-> iterate_bvec()
-> remain = step()
With CONFIG_ARCH_HAS_COPY_MC, the step() is copy_mc_to_kernel() which
return "bytes not copied".
When a memory error occurs during step(), the value of "left" equal to
the value of "part" (no one byte is copied successfully). In this case,
iterate_bvec() returns 0, and copy_page_from_iter_atomic() also returns
0. The callback shmem_write_end()[2] also returns 0. Finally,
generic_perform_write() goes to "goto again"[3], and the loop restarts.
4][5] cannot enter and exit the loop, then deadloop occurs.
Thanks.
Tong
在 2023/9/25 20:03, David Howells 写道:
> Convert the iov_iter iteration macros to inline functions to make the code
> easier to follow.
>
> The functions are marked __always_inline as we don't want to end up with
> indirect calls in the code. This, however, leaves dealing with ->copy_mc
> in an awkard situation since the step function (memcpy_from_iter_mc())
> needs to test the flag in the iterator, but isn't passed the iterator.
> This will be dealt with in a follow-up patch.
>
> The variable names in the per-type iterator functions have been harmonised
> as much as possible and made clearer as to the variable purpose.
>
> The iterator functions are also moved to a header file so that other
> operations that need to scan over an iterator can be added. For instance,
> the rbd driver could use this to scan a buffer to see if it is all zeros
> and libceph could use this to generate a crc.
>
> Signed-off-by: David Howells <dhowells@...hat.com>
> cc: Alexander Viro <viro@...iv.linux.org.uk>
> cc: Jens Axboe <axboe@...nel.dk>
> cc: Christoph Hellwig <hch@....de>
> cc: Christian Brauner <christian@...uner.io>
> cc: Matthew Wilcox <willy@...radead.org>
> cc: Linus Torvalds <torvalds@...ux-foundation.org>
> cc: David Laight <David.Laight@...LAB.COM>
> cc: linux-block@...r.kernel.org
> cc: linux-fsdevel@...r.kernel.org
> cc: linux-mm@...ck.org
> Link: https://lore.kernel.org/r/3710261.1691764329@warthog.procyon.org.uk/ # v1
> Link: https://lore.kernel.org/r/855.1692047347@warthog.procyon.org.uk/ # v2
> Link: https://lore.kernel.org/r/20230816120741.534415-1-dhowells@redhat.com/ # v3
> ---
>
> Notes:
> Changes
> =======
> ver #5)
> - Merge in patch to move iteration framework to a header file.
> - Move "iter->count - progress" into individual iteration subfunctions.
>
> include/linux/iov_iter.h | 274 ++++++++++++++++++++++++++
> lib/iov_iter.c | 416 ++++++++++++++++-----------------------
> 2 files changed, 449 insertions(+), 241 deletions(-)
> create mode 100644 include/linux/iov_iter.h
>
> diff --git a/include/linux/iov_iter.h b/include/linux/iov_iter.h
> new file mode 100644
> index 000000000000..270454a6703d
> --- /dev/null
> +++ b/include/linux/iov_iter.h
> @@ -0,0 +1,274 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/* I/O iterator iteration building functions.
> + *
> + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
> + * Written by David Howells (dhowells@...hat.com)
> + */
> +
> +#ifndef _LINUX_IOV_ITER_H
> +#define _LINUX_IOV_ITER_H
> +
> +#include <linux/uio.h>
> +#include <linux/bvec.h>
> +
> +typedef size_t (*iov_step_f)(void *iter_base, size_t progress, size_t len,
> + void *priv, void *priv2);
> +typedef size_t (*iov_ustep_f)(void __user *iter_base, size_t progress, size_t len,
> + void *priv, void *priv2);
> +
> +/*
> + * Handle ITER_UBUF.
> + */
> +static __always_inline
> +size_t iterate_ubuf(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> + iov_ustep_f step)
> +{
> + void __user *base = iter->ubuf;
> + size_t progress = 0, remain;
> +
> + remain = step(base + iter->iov_offset, 0, len, priv, priv2);
> + progress = len - remain;
> + iter->iov_offset += progress;
> + iter->count -= progress;
> + return progress;
> +}
> +
> +/*
> + * Handle ITER_IOVEC.
> + */
> +static __always_inline
> +size_t iterate_iovec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> + iov_ustep_f step)
> +{
> + const struct iovec *p = iter->__iov;
> + size_t progress = 0, skip = iter->iov_offset;
> +
> + do {
> + size_t remain, consumed;
> + size_t part = min(len, p->iov_len - skip);
> +
> + if (likely(part)) {
> + remain = step(p->iov_base + skip, progress, part, priv, priv2);
> + consumed = part - remain;
> + progress += consumed;
> + skip += consumed;
> + len -= consumed;
> + if (skip < p->iov_len)
> + break;
> + }
> + p++;
> + skip = 0;
> + } while (len);
> +
> + iter->nr_segs -= p - iter->__iov;
> + iter->__iov = p;
> + iter->iov_offset = skip;
> + iter->count -= progress;
> + return progress;
> +}
> +
> +/*
> + * Handle ITER_KVEC.
> + */
> +static __always_inline
> +size_t iterate_kvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> + iov_step_f step)
> +{
> + const struct kvec *p = iter->kvec;
> + size_t progress = 0, skip = iter->iov_offset;
> +
> + do {
> + size_t remain, consumed;
> + size_t part = min(len, p->iov_len - skip);
> +
> + if (likely(part)) {
> + remain = step(p->iov_base + skip, progress, part, priv, priv2);
> + consumed = part - remain;
> + progress += consumed;
> + skip += consumed;
> + len -= consumed;
> + if (skip < p->iov_len)
> + break;
> + }
> + p++;
> + skip = 0;
> + } while (len);
> +
> + iter->nr_segs -= p - iter->kvec;
> + iter->kvec = p;
> + iter->iov_offset = skip;
> + iter->count -= progress;
> + return progress;
> +}
> +
> +/*
> + * Handle ITER_BVEC.
> + */
> +static __always_inline
> +size_t iterate_bvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> + iov_step_f step)
> +{
> + const struct bio_vec *p = iter->bvec;
> + size_t progress = 0, skip = iter->iov_offset;
> +
> + do {
> + size_t remain, consumed;
> + size_t offset = p->bv_offset + skip, part;
> + void *kaddr = kmap_local_page(p->bv_page + offset / PAGE_SIZE);
> +
> + part = min3(len,
> + (size_t)(p->bv_len - skip),
> + (size_t)(PAGE_SIZE - offset % PAGE_SIZE));
> + remain = step(kaddr + offset % PAGE_SIZE, progress, part, priv, priv2);
> + kunmap_local(kaddr);
> + consumed = part - remain;
> + len -= consumed;
> + progress += consumed;
> + skip += consumed;
> + if (skip >= p->bv_len) {
> + skip = 0;
> + p++;
> + }
> + if (remain)
> + break;
> + } while (len);
> +
> + iter->nr_segs -= p - iter->bvec;
> + iter->bvec = p;
> + iter->iov_offset = skip;
> + iter->count -= progress;
> + return progress;
> +}
> +
> +/*
> + * Handle ITER_XARRAY.
> + */
> +static __always_inline
> +size_t iterate_xarray(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> + iov_step_f step)
> +{
> + struct folio *folio;
> + size_t progress = 0;
> + loff_t start = iter->xarray_start + iter->iov_offset;
> + pgoff_t index = start / PAGE_SIZE;
> + XA_STATE(xas, iter->xarray, index);
> +
> + rcu_read_lock();
> + xas_for_each(&xas, folio, ULONG_MAX) {
> + size_t remain, consumed, offset, part, flen;
> +
> + if (xas_retry(&xas, folio))
> + continue;
> + if (WARN_ON(xa_is_value(folio)))
> + break;
> + if (WARN_ON(folio_test_hugetlb(folio)))
> + break;
> +
> + offset = offset_in_folio(folio, start + progress);
> + flen = min(folio_size(folio) - offset, len);
> +
> + while (flen) {
> + void *base = kmap_local_folio(folio, offset);
> +
> + part = min_t(size_t, flen,
> + PAGE_SIZE - offset_in_page(offset));
> + remain = step(base, progress, part, priv, priv2);
> + kunmap_local(base);
> +
> + consumed = part - remain;
> + progress += consumed;
> + len -= consumed;
> +
> + if (remain || len == 0)
> + goto out;
> + flen -= consumed;
> + offset += consumed;
> + }
> + }
> +
> +out:
> + rcu_read_unlock();
> + iter->iov_offset += progress;
> + iter->count -= progress;
> + return progress;
> +}
> +
> +/*
> + * Handle ITER_DISCARD.
> + */
> +static __always_inline
> +size_t iterate_discard(struct iov_iter *iter, size_t len, void *priv, void *priv2,
> + iov_step_f step)
> +{
> + size_t progress = len;
> +
> + iter->count -= progress;
> + return progress;
> +}
> +
> +/**
> + * iterate_and_advance2 - Iterate over an iterator
> + * @iter: The iterator to iterate over.
> + * @len: The amount to iterate over.
> + * @priv: Data for the step functions.
> + * @priv2: More data for the step functions.
> + * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
> + * @step: Function for other iterators; given kernel addresses.
> + *
> + * Iterate over the next part of an iterator, up to the specified length. The
> + * buffer is presented in segments, which for kernel iteration are broken up by
> + * physical pages and mapped, with the mapped address being presented.
> + *
> + * Two step functions, @step and @ustep, must be provided, one for handling
> + * mapped kernel addresses and the other is given user addresses which have the
> + * potential to fault since no pinning is performed.
> + *
> + * The step functions are passed the address and length of the segment, @priv,
> + * @priv2 and the amount of data so far iterated over (which can, for example,
> + * be added to @priv to point to the right part of a second buffer). The step
> + * functions should return the amount of the segment they didn't process (ie. 0
> + * indicates complete processsing).
> + *
> + * This function returns the amount of data processed (ie. 0 means nothing was
> + * processed and the value of @len means processes to completion).
> + */
> +static __always_inline
> +size_t iterate_and_advance2(struct iov_iter *iter, size_t len, void *priv,
> + void *priv2, iov_ustep_f ustep, iov_step_f step)
> +{
> + if (unlikely(iter->count < len))
> + len = iter->count;
> + if (unlikely(!len))
> + return 0;
> +
> + if (likely(iter_is_ubuf(iter)))
> + return iterate_ubuf(iter, len, priv, priv2, ustep);
> + if (likely(iter_is_iovec(iter)))
> + return iterate_iovec(iter, len, priv, priv2, ustep);
> + if (iov_iter_is_bvec(iter))
> + return iterate_bvec(iter, len, priv, priv2, step);
> + if (iov_iter_is_kvec(iter))
> + return iterate_kvec(iter, len, priv, priv2, step);
> + if (iov_iter_is_xarray(iter))
> + return iterate_xarray(iter, len, priv, priv2, step);
> + return iterate_discard(iter, len, priv, priv2, step);
> +}
> +
> +/**
> + * iterate_and_advance - Iterate over an iterator
> + * @iter: The iterator to iterate over.
> + * @len: The amount to iterate over.
> + * @priv: Data for the step functions.
> + * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
> + * @step: Function for other iterators; given kernel addresses.
> + *
> + * As iterate_and_advance2(), but priv2 is always NULL.
> + */
> +static __always_inline
> +size_t iterate_and_advance(struct iov_iter *iter, size_t len, void *priv,
> + iov_ustep_f ustep, iov_step_f step)
> +{
> + return iterate_and_advance2(iter, len, priv, NULL, ustep, step);
> +}
> +
> +#endif /* _LINUX_IOV_ITER_H */
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 227c9f536b94..65374ee91ecd 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -13,189 +13,69 @@
> #include <net/checksum.h>
> #include <linux/scatterlist.h>
> #include <linux/instrumented.h>
> +#include <linux/iov_iter.h>
>
> -/* covers ubuf and kbuf alike */
> -#define iterate_buf(i, n, base, len, off, __p, STEP) { \
> - size_t __maybe_unused off = 0; \
> - len = n; \
> - base = __p + i->iov_offset; \
> - len -= (STEP); \
> - i->iov_offset += len; \
> - n = len; \
> -}
> -
> -/* covers iovec and kvec alike */
> -#define iterate_iovec(i, n, base, len, off, __p, STEP) { \
> - size_t off = 0; \
> - size_t skip = i->iov_offset; \
> - do { \
> - len = min(n, __p->iov_len - skip); \
> - if (likely(len)) { \
> - base = __p->iov_base + skip; \
> - len -= (STEP); \
> - off += len; \
> - skip += len; \
> - n -= len; \
> - if (skip < __p->iov_len) \
> - break; \
> - } \
> - __p++; \
> - skip = 0; \
> - } while (n); \
> - i->iov_offset = skip; \
> - n = off; \
> -}
> -
> -#define iterate_bvec(i, n, base, len, off, p, STEP) { \
> - size_t off = 0; \
> - unsigned skip = i->iov_offset; \
> - while (n) { \
> - unsigned offset = p->bv_offset + skip; \
> - unsigned left; \
> - void *kaddr = kmap_local_page(p->bv_page + \
> - offset / PAGE_SIZE); \
> - base = kaddr + offset % PAGE_SIZE; \
> - len = min(min(n, (size_t)(p->bv_len - skip)), \
> - (size_t)(PAGE_SIZE - offset % PAGE_SIZE)); \
> - left = (STEP); \
> - kunmap_local(kaddr); \
> - len -= left; \
> - off += len; \
> - skip += len; \
> - if (skip == p->bv_len) { \
> - skip = 0; \
> - p++; \
> - } \
> - n -= len; \
> - if (left) \
> - break; \
> - } \
> - i->iov_offset = skip; \
> - n = off; \
> -}
> -
> -#define iterate_xarray(i, n, base, len, __off, STEP) { \
> - __label__ __out; \
> - size_t __off = 0; \
> - struct folio *folio; \
> - loff_t start = i->xarray_start + i->iov_offset; \
> - pgoff_t index = start / PAGE_SIZE; \
> - XA_STATE(xas, i->xarray, index); \
> - \
> - len = PAGE_SIZE - offset_in_page(start); \
> - rcu_read_lock(); \
> - xas_for_each(&xas, folio, ULONG_MAX) { \
> - unsigned left; \
> - size_t offset; \
> - if (xas_retry(&xas, folio)) \
> - continue; \
> - if (WARN_ON(xa_is_value(folio))) \
> - break; \
> - if (WARN_ON(folio_test_hugetlb(folio))) \
> - break; \
> - offset = offset_in_folio(folio, start + __off); \
> - while (offset < folio_size(folio)) { \
> - base = kmap_local_folio(folio, offset); \
> - len = min(n, len); \
> - left = (STEP); \
> - kunmap_local(base); \
> - len -= left; \
> - __off += len; \
> - n -= len; \
> - if (left || n == 0) \
> - goto __out; \
> - offset += len; \
> - len = PAGE_SIZE; \
> - } \
> - } \
> -__out: \
> - rcu_read_unlock(); \
> - i->iov_offset += __off; \
> - n = __off; \
> -}
> -
> -#define __iterate_and_advance(i, n, base, len, off, I, K) { \
> - if (unlikely(i->count < n)) \
> - n = i->count; \
> - if (likely(n)) { \
> - if (likely(iter_is_ubuf(i))) { \
> - void __user *base; \
> - size_t len; \
> - iterate_buf(i, n, base, len, off, \
> - i->ubuf, (I)) \
> - } else if (likely(iter_is_iovec(i))) { \
> - const struct iovec *iov = iter_iov(i); \
> - void __user *base; \
> - size_t len; \
> - iterate_iovec(i, n, base, len, off, \
> - iov, (I)) \
> - i->nr_segs -= iov - iter_iov(i); \
> - i->__iov = iov; \
> - } else if (iov_iter_is_bvec(i)) { \
> - const struct bio_vec *bvec = i->bvec; \
> - void *base; \
> - size_t len; \
> - iterate_bvec(i, n, base, len, off, \
> - bvec, (K)) \
> - i->nr_segs -= bvec - i->bvec; \
> - i->bvec = bvec; \
> - } else if (iov_iter_is_kvec(i)) { \
> - const struct kvec *kvec = i->kvec; \
> - void *base; \
> - size_t len; \
> - iterate_iovec(i, n, base, len, off, \
> - kvec, (K)) \
> - i->nr_segs -= kvec - i->kvec; \
> - i->kvec = kvec; \
> - } else if (iov_iter_is_xarray(i)) { \
> - void *base; \
> - size_t len; \
> - iterate_xarray(i, n, base, len, off, \
> - (K)) \
> - } \
> - i->count -= n; \
> - } \
> -}
> -#define iterate_and_advance(i, n, base, len, off, I, K) \
> - __iterate_and_advance(i, n, base, len, off, I, ((void)(K),0))
> -
> -static int copyout(void __user *to, const void *from, size_t n)
> +static __always_inline
> +size_t copy_to_user_iter(void __user *iter_to, size_t progress,
> + size_t len, void *from, void *priv2)
> {
> if (should_fail_usercopy())
> - return n;
> - if (access_ok(to, n)) {
> - instrument_copy_to_user(to, from, n);
> - n = raw_copy_to_user(to, from, n);
> + return len;
> + if (access_ok(iter_to, len)) {
> + from += progress;
> + instrument_copy_to_user(iter_to, from, len);
> + len = raw_copy_to_user(iter_to, from, len);
> }
> - return n;
> + return len;
> }
>
> -static int copyout_nofault(void __user *to, const void *from, size_t n)
> +static __always_inline
> +size_t copy_to_user_iter_nofault(void __user *iter_to, size_t progress,
> + size_t len, void *from, void *priv2)
> {
> - long res;
> + ssize_t res;
>
> if (should_fail_usercopy())
> - return n;
> -
> - res = copy_to_user_nofault(to, from, n);
> + return len;
>
> - return res < 0 ? n : res;
> + from += progress;
> + res = copy_to_user_nofault(iter_to, from, len);
> + return res < 0 ? len : res;
> }
>
> -static int copyin(void *to, const void __user *from, size_t n)
> +static __always_inline
> +size_t copy_from_user_iter(void __user *iter_from, size_t progress,
> + size_t len, void *to, void *priv2)
> {
> - size_t res = n;
> + size_t res = len;
>
> if (should_fail_usercopy())
> - return n;
> - if (access_ok(from, n)) {
> - instrument_copy_from_user_before(to, from, n);
> - res = raw_copy_from_user(to, from, n);
> - instrument_copy_from_user_after(to, from, n, res);
> + return len;
> + if (access_ok(iter_from, len)) {
> + to += progress;
> + instrument_copy_from_user_before(to, iter_from, len);
> + res = raw_copy_from_user(to, iter_from, len);
> + instrument_copy_from_user_after(to, iter_from, len, res);
> }
> return res;
> }
>
> +static __always_inline
> +size_t memcpy_to_iter(void *iter_to, size_t progress,
> + size_t len, void *from, void *priv2)
> +{
> + memcpy(iter_to, from + progress, len);
> + return 0;
> +}
> +
> +static __always_inline
> +size_t memcpy_from_iter(void *iter_from, size_t progress,
> + size_t len, void *to, void *priv2)
> +{
> + memcpy(to + progress, iter_from, len);
> + return 0;
> +}
> +
> /*
> * fault_in_iov_iter_readable - fault in iov iterator for reading
> * @i: iterator
> @@ -312,23 +192,29 @@ size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
> return 0;
> if (user_backed_iter(i))
> might_fault();
> - iterate_and_advance(i, bytes, base, len, off,
> - copyout(base, addr + off, len),
> - memcpy(base, addr + off, len)
> - )
> -
> - return bytes;
> + return iterate_and_advance(i, bytes, (void *)addr,
> + copy_to_user_iter, memcpy_to_iter);
> }
> EXPORT_SYMBOL(_copy_to_iter);
>
> #ifdef CONFIG_ARCH_HAS_COPY_MC
> -static int copyout_mc(void __user *to, const void *from, size_t n)
> -{
> - if (access_ok(to, n)) {
> - instrument_copy_to_user(to, from, n);
> - n = copy_mc_to_user((__force void *) to, from, n);
> +static __always_inline
> +size_t copy_to_user_iter_mc(void __user *iter_to, size_t progress,
> + size_t len, void *from, void *priv2)
> +{
> + if (access_ok(iter_to, len)) {
> + from += progress;
> + instrument_copy_to_user(iter_to, from, len);
> + len = copy_mc_to_user(iter_to, from, len);
> }
> - return n;
> + return len;
> +}
> +
> +static __always_inline
> +size_t memcpy_to_iter_mc(void *iter_to, size_t progress,
> + size_t len, void *from, void *priv2)
> +{
> + return copy_mc_to_kernel(iter_to, from + progress, len);
> }
>
> /**
> @@ -361,22 +247,20 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
> return 0;
> if (user_backed_iter(i))
> might_fault();
> - __iterate_and_advance(i, bytes, base, len, off,
> - copyout_mc(base, addr + off, len),
> - copy_mc_to_kernel(base, addr + off, len)
> - )
> -
> - return bytes;
> + return iterate_and_advance(i, bytes, (void *)addr,
> + copy_to_user_iter_mc, memcpy_to_iter_mc);
> }
> EXPORT_SYMBOL_GPL(_copy_mc_to_iter);
> #endif /* CONFIG_ARCH_HAS_COPY_MC */
>
> -static void *memcpy_from_iter(struct iov_iter *i, void *to, const void *from,
> - size_t size)
> +static size_t memcpy_from_iter_mc(void *iter_from, size_t progress,
> + size_t len, void *to, void *priv2)
> {
> - if (iov_iter_is_copy_mc(i))
> - return (void *)copy_mc_to_kernel(to, from, size);
> - return memcpy(to, from, size);
> + struct iov_iter *iter = priv2;
> +
> + if (iov_iter_is_copy_mc(iter))
> + return copy_mc_to_kernel(to + progress, iter_from, len);
> + return memcpy_from_iter(iter_from, progress, len, to, priv2);
> }
>
> size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
> @@ -386,30 +270,46 @@ size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
>
> if (user_backed_iter(i))
> might_fault();
> - iterate_and_advance(i, bytes, base, len, off,
> - copyin(addr + off, base, len),
> - memcpy_from_iter(i, addr + off, base, len)
> - )
> -
> - return bytes;
> + return iterate_and_advance2(i, bytes, addr, i,
> + copy_from_user_iter,
> + memcpy_from_iter_mc);
> }
> EXPORT_SYMBOL(_copy_from_iter);
>
> +static __always_inline
> +size_t copy_from_user_iter_nocache(void __user *iter_from, size_t progress,
> + size_t len, void *to, void *priv2)
> +{
> + return __copy_from_user_inatomic_nocache(to + progress, iter_from, len);
> +}
> +
> size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
> {
> if (WARN_ON_ONCE(!i->data_source))
> return 0;
>
> - iterate_and_advance(i, bytes, base, len, off,
> - __copy_from_user_inatomic_nocache(addr + off, base, len),
> - memcpy(addr + off, base, len)
> - )
> -
> - return bytes;
> + return iterate_and_advance(i, bytes, addr,
> + copy_from_user_iter_nocache,
> + memcpy_from_iter);
> }
> EXPORT_SYMBOL(_copy_from_iter_nocache);
>
> #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
> +static __always_inline
> +size_t copy_from_user_iter_flushcache(void __user *iter_from, size_t progress,
> + size_t len, void *to, void *priv2)
> +{
> + return __copy_from_user_flushcache(to + progress, iter_from, len);
> +}
> +
> +static __always_inline
> +size_t memcpy_from_iter_flushcache(void *iter_from, size_t progress,
> + size_t len, void *to, void *priv2)
> +{
> + memcpy_flushcache(to + progress, iter_from, len);
> + return 0;
> +}
> +
> /**
> * _copy_from_iter_flushcache - write destination through cpu cache
> * @addr: destination kernel address
> @@ -431,12 +331,9 @@ size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
> if (WARN_ON_ONCE(!i->data_source))
> return 0;
>
> - iterate_and_advance(i, bytes, base, len, off,
> - __copy_from_user_flushcache(addr + off, base, len),
> - memcpy_flushcache(addr + off, base, len)
> - )
> -
> - return bytes;
> + return iterate_and_advance(i, bytes, addr,
> + copy_from_user_iter_flushcache,
> + memcpy_from_iter_flushcache);
> }
> EXPORT_SYMBOL_GPL(_copy_from_iter_flushcache);
> #endif
> @@ -508,10 +405,9 @@ size_t copy_page_to_iter_nofault(struct page *page, unsigned offset, size_t byte
> void *kaddr = kmap_local_page(page);
> size_t n = min(bytes, (size_t)PAGE_SIZE - offset);
>
> - iterate_and_advance(i, n, base, len, off,
> - copyout_nofault(base, kaddr + offset + off, len),
> - memcpy(base, kaddr + offset + off, len)
> - )
> + n = iterate_and_advance(i, bytes, kaddr,
> + copy_to_user_iter_nofault,
> + memcpy_to_iter);
> kunmap_local(kaddr);
> res += n;
> bytes -= n;
> @@ -554,14 +450,25 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
> }
> EXPORT_SYMBOL(copy_page_from_iter);
>
> -size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
> +static __always_inline
> +size_t zero_to_user_iter(void __user *iter_to, size_t progress,
> + size_t len, void *priv, void *priv2)
> {
> - iterate_and_advance(i, bytes, base, len, count,
> - clear_user(base, len),
> - memset(base, 0, len)
> - )
> + return clear_user(iter_to, len);
> +}
>
> - return bytes;
> +static __always_inline
> +size_t zero_to_iter(void *iter_to, size_t progress,
> + size_t len, void *priv, void *priv2)
> +{
> + memset(iter_to, 0, len);
> + return 0;
> +}
> +
> +size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
> +{
> + return iterate_and_advance(i, bytes, NULL,
> + zero_to_user_iter, zero_to_iter);
> }
> EXPORT_SYMBOL(iov_iter_zero);
>
> @@ -586,10 +493,9 @@ size_t copy_page_from_iter_atomic(struct page *page, size_t offset,
> }
>
> p = kmap_atomic(page) + offset;
> - iterate_and_advance(i, n, base, len, off,
> - copyin(p + off, base, len),
> - memcpy_from_iter(i, p + off, base, len)
> - )
> + n = iterate_and_advance2(i, n, p, i,
> + copy_from_user_iter,
> + memcpy_from_iter_mc);
> kunmap_atomic(p);
> copied += n;
> offset += n;
> @@ -1180,32 +1086,64 @@ ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i,
> }
> EXPORT_SYMBOL(iov_iter_get_pages_alloc2);
>
> +static __always_inline
> +size_t copy_from_user_iter_csum(void __user *iter_from, size_t progress,
> + size_t len, void *to, void *priv2)
> +{
> + __wsum next, *csum = priv2;
> +
> + next = csum_and_copy_from_user(iter_from, to + progress, len);
> + *csum = csum_block_add(*csum, next, progress);
> + return next ? 0 : len;
> +}
> +
> +static __always_inline
> +size_t memcpy_from_iter_csum(void *iter_from, size_t progress,
> + size_t len, void *to, void *priv2)
> +{
> + __wsum *csum = priv2;
> +
> + *csum = csum_and_memcpy(to + progress, iter_from, len, *csum, progress);
> + return 0;
> +}
> +
> size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
> struct iov_iter *i)
> {
> - __wsum sum, next;
> - sum = *csum;
> if (WARN_ON_ONCE(!i->data_source))
> return 0;
> -
> - iterate_and_advance(i, bytes, base, len, off, ({
> - next = csum_and_copy_from_user(base, addr + off, len);
> - sum = csum_block_add(sum, next, off);
> - next ? 0 : len;
> - }), ({
> - sum = csum_and_memcpy(addr + off, base, len, sum, off);
> - })
> - )
> - *csum = sum;
> - return bytes;
> + return iterate_and_advance2(i, bytes, addr, csum,
> + copy_from_user_iter_csum,
> + memcpy_from_iter_csum);
> }
> EXPORT_SYMBOL(csum_and_copy_from_iter);
>
> +static __always_inline
> +size_t copy_to_user_iter_csum(void __user *iter_to, size_t progress,
> + size_t len, void *from, void *priv2)
> +{
> + __wsum next, *csum = priv2;
> +
> + next = csum_and_copy_to_user(from + progress, iter_to, len);
> + *csum = csum_block_add(*csum, next, progress);
> + return next ? 0 : len;
> +}
> +
> +static __always_inline
> +size_t memcpy_to_iter_csum(void *iter_to, size_t progress,
> + size_t len, void *from, void *priv2)
> +{
> + __wsum *csum = priv2;
> +
> + *csum = csum_and_memcpy(iter_to, from + progress, len, *csum, progress);
> + return 0;
> +}
> +
> size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate,
> struct iov_iter *i)
> {
> struct csum_state *csstate = _csstate;
> - __wsum sum, next;
> + __wsum sum;
>
> if (WARN_ON_ONCE(i->data_source))
> return 0;
> @@ -1219,14 +1157,10 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate,
> }
>
> sum = csum_shift(csstate->csum, csstate->off);
> - iterate_and_advance(i, bytes, base, len, off, ({
> - next = csum_and_copy_to_user(addr + off, base, len);
> - sum = csum_block_add(sum, next, off);
> - next ? 0 : len;
> - }), ({
> - sum = csum_and_memcpy(base, addr + off, len, sum, off);
> - })
> - )
> +
> + bytes = iterate_and_advance2(i, bytes, (void *)addr, &sum,
> + copy_to_user_iter_csum,
> + memcpy_to_iter_csum);
> csstate->csum = csum_shift(sum, csstate->off);
> csstate->off += bytes;
> return bytes;
>
>
Powered by blists - more mailing lists