Message-ID: <40676c7a-daee-8ef4-340f-d8573556ae10@huawei.com>
Date: Tue, 27 Feb 2024 20:43:34 +0800
From: Tong Tiangen <tongtiangen@...wei.com>
To: David Howells <dhowells@...hat.com>, Jens Axboe <axboe@...nel.dk>
CC: Al Viro <viro@...iv.linux.org.uk>, Linus Torvalds
	<torvalds@...ux-foundation.org>, Christoph Hellwig <hch@....de>, Christian
 Brauner <christian@...uner.io>, David Laight <David.Laight@...LAB.COM>,
	Matthew Wilcox <willy@...radead.org>, Jeff Layton <jlayton@...nel.org>,
	<linux-fsdevel@...r.kernel.org>, <linux-block@...r.kernel.org>,
	<linux-mm@...ck.org>, <netdev@...r.kernel.org>,
	<linux-kernel@...r.kernel.org>, Kefeng Wang <wangkefeng.wang@...wei.com>
Subject: [bug report] dead loop in generic_perform_write() //Re: [PATCH v7
 07/12] iov_iter: Convert iterate*() to inline funcs

Hi David, Jens:

Kindly ping...

Thanks.
Tong.


On 2024/2/18 11:13, Tong Tiangen wrote:
> Hi David, Jens:
> 
> Recently, I tested coredumping of a user process on x86 with mainline
> (6.8-rc1) and found a dead loop (infinite loop) related to this patch.
> 
> Let's discuss it.
> 
> 1. Test step:
> ----------------------------
>    a. Start a user process.
>    b. Use EINJ to inject a hardware memory error into a page of
>       this user process.
>    c. Send SIGBUS to this user process.
>    d. On receiving the signal, the process dumps core; the coredump
>       file is configured to be written to tmpfs.
> 
> 2. Root cause:
> ----------------------------
> The dead loop occurs in generic_perform_write(). The call path is:
> 
> elf_core_dump()
>    -> dump_user_range()
>      -> dump_emit_page()
>        -> iov_iter_bvec()  //iter type set to BVEC
>          -> iov_iter_set_copy_mc(&iter);  //support copy mc
>            -> __kernel_write_iter()
>              -> shmem_file_write_iter()
>                -> generic_perform_write()
> 
> ssize_t generic_perform_write(...)
> {
>      [...]
>      do {
>          [...]
>      again:
>          //[4]
>          if (unlikely(fault_in_iov_iter_readable(i, bytes) ==
>                               bytes)) {
>              status = -EFAULT;
>              break;
>          }
>          //[5]
>          if (fatal_signal_pending(current)) {
>              status = -EINTR;
>              break;
>          }
> 
>              [...]
> 
>          //[1]
>          copied = copy_page_from_iter_atomic(page, offset, bytes,
>                           i);
>          [...]
> 
>          //[2]
>          status = a_ops->write_end(...);
>          if (unlikely(status != copied)) {
>              iov_iter_revert(i, copied - max(status, 0L));
>              if (unlikely(status < 0))
>                  break;
>          }
>          cond_resched();
> 
>          if (unlikely(status == 0)) {
>              /*
>              * A short copy made ->write_end() reject the
>              * thing entirely.  Might be memory poisoning
>              * halfway through, might be a race with munmap,
>              * might be severe memory pressure.
>              */
>              if (copied)
>                  bytes = copied;
>              //----[3]
>              goto again;
>          }
>          [...]
>      } while (iov_iter_count(i));
>      [...]
> }
> 
> [1] Before this patch:
>    copy_page_from_iter_atomic()
>      -> iterate_and_advance()
>         -> __iterate_and_advance(..., ((void)(K),0))
>           -> iterate_bvec macro
>             -> left = ((void)(K),0)
> 
> With CONFIG_ARCH_HAS_COPY_MC, K() is copy_mc_to_kernel(), which returns
> the number of bytes not copied.
> 
> Because the macro discards the return value of K(), "left" is always 0,
> even when a memory error occurs during K(). Therefore the value of
> "copied" returned by copy_page_from_iter_atomic() is not 0, and the loop
> in generic_perform_write() ends normally.
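To illustrate the difference in accounting, here is a minimal userspace
sketch. It is not kernel code: fake_copy_mc(), OLD_STEP(), old_copied() and
new_copied() are hypothetical names made up for this example, and
fake_copy_mc() only imitates the "returns bytes not copied" convention of
copy_mc_to_kernel().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for copy_mc_to_kernel(): returns bytes NOT copied. */
static size_t fake_copy_mc(size_t len, int poisoned)
{
    return poisoned ? len : 0;
}

/* Old macro behaviour: K is evaluated, its result is discarded, "left" is 0. */
#define OLD_STEP(K) ((void)(K), 0)

/* Bytes the old iterate_bvec macro would account as copied for one segment. */
static size_t old_copied(size_t len, int poisoned)
{
    size_t left = OLD_STEP(fake_copy_mc(len, poisoned));

    return len - left;
}

/* Bytes the new inline iterate_bvec() would account as copied for one segment. */
static size_t new_copied(size_t len, int poisoned)
{
    size_t remain = fake_copy_mc(len, poisoned);

    assert(remain <= len);
    return len - remain;
}
```

Under these assumptions, old_copied(4096, 1) reports the full segment as
copied even though the source was poisoned, while new_copied(4096, 1)
reports 0.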
> 
> 
> After this patch:
>    copy_page_from_iter_atomic()
>      -> iterate_and_advance2()
>        -> iterate_bvec()
>          -> remain = step()
> 
> With CONFIG_ARCH_HAS_COPY_MC, step() is copy_mc_to_kernel(), which
> returns the number of bytes not copied.
> 
> When a memory error occurs during step(), the value of "remain" equals
> the value of "part" (not a single byte is copied successfully). In this
> case, iterate_bvec() returns 0, and copy_page_from_iter_atomic() also
> returns 0. The callback shmem_write_end() [2] also returns 0. Finally,
> generic_perform_write() takes the "goto again" at [3] and the loop
> restarts. Neither [4] nor [5] breaks out of the loop, so a dead loop
> occurs.
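The retry behaviour can be modelled in userspace as a rough sketch. This is
only an illustration under simplifying assumptions, not the kernel
implementation: model_copy() and model_write() are hypothetical names, the
checks at [4] and [5] are assumed never to trigger, and write_end() is
assumed to return exactly what was copied. A pass limit is added so the model
terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of copy_page_from_iter_atomic() on a poisoned source page. */
static size_t model_copy(size_t bytes, int discard_step_result)
{
    size_t remain = bytes;          /* step() copied nothing */

    if (discard_step_result)
        remain = 0;                 /* old ((void)(K),0) behaviour */
    return bytes - remain;
}

/*
 * Model of the retry path in generic_perform_write(). Returns the number of
 * passes needed to consume `count` bytes, or -1 if `max_passes` is reached
 * without finishing (the dead loop).
 */
static long model_write(size_t count, size_t chunk, int discard_step_result,
                        long max_passes)
{
    long passes = 0;

    while (count) {
        size_t bytes = count < chunk ? count : chunk;
        size_t copied;

        if (passes++ >= max_passes)
            return -1;
        /* [4]/[5] are assumed not to trigger in this model. */
        copied = model_copy(bytes, discard_step_result);
        /* [2]/[3]: write_end() returns `copied`; 0 means "goto again". */
        assert(copied <= count);
        count -= copied;
    }
    return passes;
}
```

With the old accounting, model_write(8192, 4096, 1, 1000) finishes in 2
passes; with the new accounting, model_write(8192, 4096, 0, 1000) never makes
progress and hits the pass limit.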
> 
> Thanks.
> Tong
> 
> 
> On 2023/9/25 20:03, David Howells wrote:
>> Convert the iov_iter iteration macros to inline functions to make the code
>> easier to follow.
>>
>> The functions are marked __always_inline as we don't want to end up with
>> indirect calls in the code.  This, however, leaves dealing with ->copy_mc
>> in an awkward situation since the step function (memcpy_from_iter_mc())
>> needs to test the flag in the iterator, but isn't passed the iterator.
>> This will be dealt with in a follow-up patch.
>>
>> The variable names in the per-type iterator functions have been harmonised
>> as much as possible and made clearer as to the variable purpose.
>>
>> The iterator functions are also moved to a header file so that other
>> operations that need to scan over an iterator can be added.  For instance,
>> the rbd driver could use this to scan a buffer to see if it is all zeros
>> and libceph could use this to generate a crc.
>>
>> Signed-off-by: David Howells <dhowells@...hat.com>
>> cc: Alexander Viro <viro@...iv.linux.org.uk>
>> cc: Jens Axboe <axboe@...nel.dk>
>> cc: Christoph Hellwig <hch@....de>
>> cc: Christian Brauner <christian@...uner.io>
>> cc: Matthew Wilcox <willy@...radead.org>
>> cc: Linus Torvalds <torvalds@...ux-foundation.org>
>> cc: David Laight <David.Laight@...LAB.COM>
>> cc: linux-block@...r.kernel.org
>> cc: linux-fsdevel@...r.kernel.org
>> cc: linux-mm@...ck.org
>> Link: https://lore.kernel.org/r/3710261.1691764329@warthog.procyon.org.uk/ # v1
>> Link: https://lore.kernel.org/r/855.1692047347@warthog.procyon.org.uk/ # v2
>> Link: https://lore.kernel.org/r/20230816120741.534415-1-dhowells@redhat.com/ # v3
>> ---
>>
>> Notes:
>>      Changes
>>      =======
>>      ver #5)
>>       - Merge in patch to move iteration framework to a header file.
>>       - Move "iter->count - progress" into individual iteration subfunctions.
>>
>>   include/linux/iov_iter.h | 274 ++++++++++++++++++++++++++
>>   lib/iov_iter.c           | 416 ++++++++++++++++-----------------------
>>   2 files changed, 449 insertions(+), 241 deletions(-)
>>   create mode 100644 include/linux/iov_iter.h
>>
>> diff --git a/include/linux/iov_iter.h b/include/linux/iov_iter.h
>> new file mode 100644
>> index 000000000000..270454a6703d
>> --- /dev/null
>> +++ b/include/linux/iov_iter.h
>> @@ -0,0 +1,274 @@
>> +/* SPDX-License-Identifier: GPL-2.0-or-later */
>> +/* I/O iterator iteration building functions.
>> + *
>> + * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
>> + * Written by David Howells (dhowells@...hat.com)
>> + */
>> +
>> +#ifndef _LINUX_IOV_ITER_H
>> +#define _LINUX_IOV_ITER_H
>> +
>> +#include <linux/uio.h>
>> +#include <linux/bvec.h>
>> +
>> +typedef size_t (*iov_step_f)(void *iter_base, size_t progress, size_t len,
>> +                 void *priv, void *priv2);
>> +typedef size_t (*iov_ustep_f)(void __user *iter_base, size_t progress, size_t len,
>> +                  void *priv, void *priv2);
>> +
>> +/*
>> + * Handle ITER_UBUF.
>> + */
>> +static __always_inline
>> +size_t iterate_ubuf(struct iov_iter *iter, size_t len, void *priv, void *priv2,
>> +            iov_ustep_f step)
>> +{
>> +    void __user *base = iter->ubuf;
>> +    size_t progress = 0, remain;
>> +
>> +    remain = step(base + iter->iov_offset, 0, len, priv, priv2);
>> +    progress = len - remain;
>> +    iter->iov_offset += progress;
>> +    iter->count -= progress;
>> +    return progress;
>> +}
>> +
>> +/*
>> + * Handle ITER_IOVEC.
>> + */
>> +static __always_inline
>> +size_t iterate_iovec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
>> +             iov_ustep_f step)
>> +{
>> +    const struct iovec *p = iter->__iov;
>> +    size_t progress = 0, skip = iter->iov_offset;
>> +
>> +    do {
>> +        size_t remain, consumed;
>> +        size_t part = min(len, p->iov_len - skip);
>> +
>> +        if (likely(part)) {
>> +            remain = step(p->iov_base + skip, progress, part, priv, priv2);
>> +            consumed = part - remain;
>> +            progress += consumed;
>> +            skip += consumed;
>> +            len -= consumed;
>> +            if (skip < p->iov_len)
>> +                break;
>> +        }
>> +        p++;
>> +        skip = 0;
>> +    } while (len);
>> +
>> +    iter->nr_segs -= p - iter->__iov;
>> +    iter->__iov = p;
>> +    iter->iov_offset = skip;
>> +    iter->count -= progress;
>> +    return progress;
>> +}
>> +
>> +/*
>> + * Handle ITER_KVEC.
>> + */
>> +static __always_inline
>> +size_t iterate_kvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
>> +            iov_step_f step)
>> +{
>> +    const struct kvec *p = iter->kvec;
>> +    size_t progress = 0, skip = iter->iov_offset;
>> +
>> +    do {
>> +        size_t remain, consumed;
>> +        size_t part = min(len, p->iov_len - skip);
>> +
>> +        if (likely(part)) {
>> +            remain = step(p->iov_base + skip, progress, part, priv, priv2);
>> +            consumed = part - remain;
>> +            progress += consumed;
>> +            skip += consumed;
>> +            len -= consumed;
>> +            if (skip < p->iov_len)
>> +                break;
>> +        }
>> +        p++;
>> +        skip = 0;
>> +    } while (len);
>> +
>> +    iter->nr_segs -= p - iter->kvec;
>> +    iter->kvec = p;
>> +    iter->iov_offset = skip;
>> +    iter->count -= progress;
>> +    return progress;
>> +}
>> +
>> +/*
>> + * Handle ITER_BVEC.
>> + */
>> +static __always_inline
>> +size_t iterate_bvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
>> +            iov_step_f step)
>> +{
>> +    const struct bio_vec *p = iter->bvec;
>> +    size_t progress = 0, skip = iter->iov_offset;
>> +
>> +    do {
>> +        size_t remain, consumed;
>> +        size_t offset = p->bv_offset + skip, part;
>> +        void *kaddr = kmap_local_page(p->bv_page + offset / PAGE_SIZE);
>> +
>> +        part = min3(len,
>> +               (size_t)(p->bv_len - skip),
>> +               (size_t)(PAGE_SIZE - offset % PAGE_SIZE));
>> +        remain = step(kaddr + offset % PAGE_SIZE, progress, part, priv, priv2);
>> +        kunmap_local(kaddr);
>> +        consumed = part - remain;
>> +        len -= consumed;
>> +        progress += consumed;
>> +        skip += consumed;
>> +        if (skip >= p->bv_len) {
>> +            skip = 0;
>> +            p++;
>> +        }
>> +        if (remain)
>> +            break;
>> +    } while (len);
>> +
>> +    iter->nr_segs -= p - iter->bvec;
>> +    iter->bvec = p;
>> +    iter->iov_offset = skip;
>> +    iter->count -= progress;
>> +    return progress;
>> +}
>> +
>> +/*
>> + * Handle ITER_XARRAY.
>> + */
>> +static __always_inline
>> +size_t iterate_xarray(struct iov_iter *iter, size_t len, void *priv, void *priv2,
>> +              iov_step_f step)
>> +{
>> +    struct folio *folio;
>> +    size_t progress = 0;
>> +    loff_t start = iter->xarray_start + iter->iov_offset;
>> +    pgoff_t index = start / PAGE_SIZE;
>> +    XA_STATE(xas, iter->xarray, index);
>> +
>> +    rcu_read_lock();
>> +    xas_for_each(&xas, folio, ULONG_MAX) {
>> +        size_t remain, consumed, offset, part, flen;
>> +
>> +        if (xas_retry(&xas, folio))
>> +            continue;
>> +        if (WARN_ON(xa_is_value(folio)))
>> +            break;
>> +        if (WARN_ON(folio_test_hugetlb(folio)))
>> +            break;
>> +
>> +        offset = offset_in_folio(folio, start + progress);
>> +        flen = min(folio_size(folio) - offset, len);
>> +
>> +        while (flen) {
>> +            void *base = kmap_local_folio(folio, offset);
>> +
>> +            part = min_t(size_t, flen,
>> +                     PAGE_SIZE - offset_in_page(offset));
>> +            remain = step(base, progress, part, priv, priv2);
>> +            kunmap_local(base);
>> +
>> +            consumed = part - remain;
>> +            progress += consumed;
>> +            len -= consumed;
>> +
>> +            if (remain || len == 0)
>> +                goto out;
>> +            flen -= consumed;
>> +            offset += consumed;
>> +        }
>> +    }
>> +
>> +out:
>> +    rcu_read_unlock();
>> +    iter->iov_offset += progress;
>> +    iter->count -= progress;
>> +    return progress;
>> +}
>> +
>> +/*
>> + * Handle ITER_DISCARD.
>> + */
>> +static __always_inline
>> +size_t iterate_discard(struct iov_iter *iter, size_t len, void *priv, void *priv2,
>> +              iov_step_f step)
>> +{
>> +    size_t progress = len;
>> +
>> +    iter->count -= progress;
>> +    return progress;
>> +}
>> +
>> +/**
>> + * iterate_and_advance2 - Iterate over an iterator
>> + * @iter: The iterator to iterate over.
>> + * @len: The amount to iterate over.
>> + * @priv: Data for the step functions.
>> + * @priv2: More data for the step functions.
>> + * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
>> + * @step: Function for other iterators; given kernel addresses.
>> + *
>> + * Iterate over the next part of an iterator, up to the specified length.  The
>> + * buffer is presented in segments, which for kernel iteration are broken up by
>> + * physical pages and mapped, with the mapped address being presented.
>> + *
>> + * Two step functions, @step and @ustep, must be provided, one for handling
>> + * mapped kernel addresses and the other is given user addresses which have the
>> + * potential to fault since no pinning is performed.
>> + *
>> + * The step functions are passed the address and length of the segment, @priv,
>> + * @priv2 and the amount of data so far iterated over (which can, for example,
>> + * be added to @priv to point to the right part of a second buffer).  The step
>> + * functions should return the amount of the segment they didn't process (ie. 0
>> + * indicates complete processing).
>> + *
>> + * This function returns the amount of data processed (ie. 0 means nothing was
>> + * processed and the value of @len means processes to completion).
>> + */
>> +static __always_inline
>> +size_t iterate_and_advance2(struct iov_iter *iter, size_t len, void *priv,
>> +                void *priv2, iov_ustep_f ustep, iov_step_f step)
>> +{
>> +    if (unlikely(iter->count < len))
>> +        len = iter->count;
>> +    if (unlikely(!len))
>> +        return 0;
>> +
>> +    if (likely(iter_is_ubuf(iter)))
>> +        return iterate_ubuf(iter, len, priv, priv2, ustep);
>> +    if (likely(iter_is_iovec(iter)))
>> +        return iterate_iovec(iter, len, priv, priv2, ustep);
>> +    if (iov_iter_is_bvec(iter))
>> +        return iterate_bvec(iter, len, priv, priv2, step);
>> +    if (iov_iter_is_kvec(iter))
>> +        return iterate_kvec(iter, len, priv, priv2, step);
>> +    if (iov_iter_is_xarray(iter))
>> +        return iterate_xarray(iter, len, priv, priv2, step);
>> +    return iterate_discard(iter, len, priv, priv2, step);
>> +}
>> +
>> +/**
>> + * iterate_and_advance - Iterate over an iterator
>> + * @iter: The iterator to iterate over.
>> + * @len: The amount to iterate over.
>> + * @priv: Data for the step functions.
>> + * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
>> + * @step: Function for other iterators; given kernel addresses.
>> + *
>> + * As iterate_and_advance2(), but priv2 is always NULL.
>> + */
>> +static __always_inline
>> +size_t iterate_and_advance(struct iov_iter *iter, size_t len, void *priv,
>> +               iov_ustep_f ustep, iov_step_f step)
>> +{
>> +    return iterate_and_advance2(iter, len, priv, NULL, ustep, step);
>> +}
>> +
>> +#endif /* _LINUX_IOV_ITER_H */
>> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
>> index 227c9f536b94..65374ee91ecd 100644
>> --- a/lib/iov_iter.c
>> +++ b/lib/iov_iter.c
>> @@ -13,189 +13,69 @@
>>   #include <net/checksum.h>
>>   #include <linux/scatterlist.h>
>>   #include <linux/instrumented.h>
>> +#include <linux/iov_iter.h>
>> -/* covers ubuf and kbuf alike */
>> -#define iterate_buf(i, n, base, len, off, __p, STEP) {        \
>> -    size_t __maybe_unused off = 0;                \
>> -    len = n;                        \
>> -    base = __p + i->iov_offset;                \
>> -    len -= (STEP);                        \
>> -    i->iov_offset += len;                    \
>> -    n = len;                        \
>> -}
>> -
>> -/* covers iovec and kvec alike */
>> -#define iterate_iovec(i, n, base, len, off, __p, STEP) {    \
>> -    size_t off = 0;                        \
>> -    size_t skip = i->iov_offset;                \
>> -    do {                            \
>> -        len = min(n, __p->iov_len - skip);        \
>> -        if (likely(len)) {                \
>> -            base = __p->iov_base + skip;        \
>> -            len -= (STEP);                \
>> -            off += len;                \
>> -            skip += len;                \
>> -            n -= len;                \
>> -            if (skip < __p->iov_len)        \
>> -                break;                \
>> -        }                        \
>> -        __p++;                        \
>> -        skip = 0;                    \
>> -    } while (n);                        \
>> -    i->iov_offset = skip;                    \
>> -    n = off;                        \
>> -}
>> -
>> -#define iterate_bvec(i, n, base, len, off, p, STEP) {        \
>> -    size_t off = 0;                        \
>> -    unsigned skip = i->iov_offset;                \
>> -    while (n) {                        \
>> -        unsigned offset = p->bv_offset + skip;        \
>> -        unsigned left;                    \
>> -        void *kaddr = kmap_local_page(p->bv_page +    \
>> -                    offset / PAGE_SIZE);    \
>> -        base = kaddr + offset % PAGE_SIZE;        \
>> -        len = min(min(n, (size_t)(p->bv_len - skip)),    \
>> -             (size_t)(PAGE_SIZE - offset % PAGE_SIZE));    \
>> -        left = (STEP);                    \
>> -        kunmap_local(kaddr);                \
>> -        len -= left;                    \
>> -        off += len;                    \
>> -        skip += len;                    \
>> -        if (skip == p->bv_len) {            \
>> -            skip = 0;                \
>> -            p++;                    \
>> -        }                        \
>> -        n -= len;                    \
>> -        if (left)                    \
>> -            break;                    \
>> -    }                            \
>> -    i->iov_offset = skip;                    \
>> -    n = off;                        \
>> -}
>> -
>> -#define iterate_xarray(i, n, base, len, __off, STEP) {        \
>> -    __label__ __out;                    \
>> -    size_t __off = 0;                    \
>> -    struct folio *folio;                    \
>> -    loff_t start = i->xarray_start + i->iov_offset;        \
>> -    pgoff_t index = start / PAGE_SIZE;            \
>> -    XA_STATE(xas, i->xarray, index);            \
>> -                                \
>> -    len = PAGE_SIZE - offset_in_page(start);        \
>> -    rcu_read_lock();                    \
>> -    xas_for_each(&xas, folio, ULONG_MAX) {            \
>> -        unsigned left;                    \
>> -        size_t offset;                    \
>> -        if (xas_retry(&xas, folio))            \
>> -            continue;                \
>> -        if (WARN_ON(xa_is_value(folio)))        \
>> -            break;                    \
>> -        if (WARN_ON(folio_test_hugetlb(folio)))        \
>> -            break;                    \
>> -        offset = offset_in_folio(folio, start + __off);    \
>> -        while (offset < folio_size(folio)) {        \
>> -            base = kmap_local_folio(folio, offset);    \
>> -            len = min(n, len);            \
>> -            left = (STEP);                \
>> -            kunmap_local(base);            \
>> -            len -= left;                \
>> -            __off += len;                \
>> -            n -= len;                \
>> -            if (left || n == 0)            \
>> -                goto __out;            \
>> -            offset += len;                \
>> -            len = PAGE_SIZE;            \
>> -        }                        \
>> -    }                            \
>> -__out:                                \
>> -    rcu_read_unlock();                    \
>> -    i->iov_offset += __off;                    \
>> -    n = __off;                        \
>> -}
>> -
>> -#define __iterate_and_advance(i, n, base, len, off, I, K) {    \
>> -    if (unlikely(i->count < n))                \
>> -        n = i->count;                    \
>> -    if (likely(n)) {                    \
>> -        if (likely(iter_is_ubuf(i))) {            \
>> -            void __user *base;            \
>> -            size_t len;                \
>> -            iterate_buf(i, n, base, len, off,    \
>> -                        i->ubuf, (I))     \
>> -        } else if (likely(iter_is_iovec(i))) {        \
>> -            const struct iovec *iov = iter_iov(i);    \
>> -            void __user *base;            \
>> -            size_t len;                \
>> -            iterate_iovec(i, n, base, len, off,    \
>> -                        iov, (I))    \
>> -            i->nr_segs -= iov - iter_iov(i);    \
>> -            i->__iov = iov;                \
>> -        } else if (iov_iter_is_bvec(i)) {        \
>> -            const struct bio_vec *bvec = i->bvec;    \
>> -            void *base;                \
>> -            size_t len;                \
>> -            iterate_bvec(i, n, base, len, off,    \
>> -                        bvec, (K))    \
>> -            i->nr_segs -= bvec - i->bvec;        \
>> -            i->bvec = bvec;                \
>> -        } else if (iov_iter_is_kvec(i)) {        \
>> -            const struct kvec *kvec = i->kvec;    \
>> -            void *base;                \
>> -            size_t len;                \
>> -            iterate_iovec(i, n, base, len, off,    \
>> -                        kvec, (K))    \
>> -            i->nr_segs -= kvec - i->kvec;        \
>> -            i->kvec = kvec;                \
>> -        } else if (iov_iter_is_xarray(i)) {        \
>> -            void *base;                \
>> -            size_t len;                \
>> -            iterate_xarray(i, n, base, len, off,    \
>> -                            (K))    \
>> -        }                        \
>> -        i->count -= n;                    \
>> -    }                            \
>> -}
>> -#define iterate_and_advance(i, n, base, len, off, I, K) \
>> -    __iterate_and_advance(i, n, base, len, off, I, ((void)(K),0))
>> -
>> -static int copyout(void __user *to, const void *from, size_t n)
>> +static __always_inline
>> +size_t copy_to_user_iter(void __user *iter_to, size_t progress,
>> +             size_t len, void *from, void *priv2)
>>   {
>>       if (should_fail_usercopy())
>> -        return n;
>> -    if (access_ok(to, n)) {
>> -        instrument_copy_to_user(to, from, n);
>> -        n = raw_copy_to_user(to, from, n);
>> +        return len;
>> +    if (access_ok(iter_to, len)) {
>> +        from += progress;
>> +        instrument_copy_to_user(iter_to, from, len);
>> +        len = raw_copy_to_user(iter_to, from, len);
>>       }
>> -    return n;
>> +    return len;
>>   }
>> -static int copyout_nofault(void __user *to, const void *from, size_t n)
>> +static __always_inline
>> +size_t copy_to_user_iter_nofault(void __user *iter_to, size_t progress,
>> +                 size_t len, void *from, void *priv2)
>>   {
>> -    long res;
>> +    ssize_t res;
>>       if (should_fail_usercopy())
>> -        return n;
>> -
>> -    res = copy_to_user_nofault(to, from, n);
>> +        return len;
>> -    return res < 0 ? n : res;
>> +    from += progress;
>> +    res = copy_to_user_nofault(iter_to, from, len);
>> +    return res < 0 ? len : res;
>>   }
>> -static int copyin(void *to, const void __user *from, size_t n)
>> +static __always_inline
>> +size_t copy_from_user_iter(void __user *iter_from, size_t progress,
>> +               size_t len, void *to, void *priv2)
>>   {
>> -    size_t res = n;
>> +    size_t res = len;
>>       if (should_fail_usercopy())
>> -        return n;
>> -    if (access_ok(from, n)) {
>> -        instrument_copy_from_user_before(to, from, n);
>> -        res = raw_copy_from_user(to, from, n);
>> -        instrument_copy_from_user_after(to, from, n, res);
>> +        return len;
>> +    if (access_ok(iter_from, len)) {
>> +        to += progress;
>> +        instrument_copy_from_user_before(to, iter_from, len);
>> +        res = raw_copy_from_user(to, iter_from, len);
>> +        instrument_copy_from_user_after(to, iter_from, len, res);
>>       }
>>       return res;
>>   }
>> +static __always_inline
>> +size_t memcpy_to_iter(void *iter_to, size_t progress,
>> +              size_t len, void *from, void *priv2)
>> +{
>> +    memcpy(iter_to, from + progress, len);
>> +    return 0;
>> +}
>> +
>> +static __always_inline
>> +size_t memcpy_from_iter(void *iter_from, size_t progress,
>> +            size_t len, void *to, void *priv2)
>> +{
>> +    memcpy(to + progress, iter_from, len);
>> +    return 0;
>> +}
>> +
>>   /*
>>    * fault_in_iov_iter_readable - fault in iov iterator for reading
>>    * @i: iterator
>> @@ -312,23 +192,29 @@ size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
>>           return 0;
>>       if (user_backed_iter(i))
>>           might_fault();
>> -    iterate_and_advance(i, bytes, base, len, off,
>> -        copyout(base, addr + off, len),
>> -        memcpy(base, addr + off, len)
>> -    )
>> -
>> -    return bytes;
>> +    return iterate_and_advance(i, bytes, (void *)addr,
>> +                   copy_to_user_iter, memcpy_to_iter);
>>   }
>>   EXPORT_SYMBOL(_copy_to_iter);
>>   #ifdef CONFIG_ARCH_HAS_COPY_MC
>> -static int copyout_mc(void __user *to, const void *from, size_t n)
>> -{
>> -    if (access_ok(to, n)) {
>> -        instrument_copy_to_user(to, from, n);
>> -        n = copy_mc_to_user((__force void *) to, from, n);
>> +static __always_inline
>> +size_t copy_to_user_iter_mc(void __user *iter_to, size_t progress,
>> +                size_t len, void *from, void *priv2)
>> +{
>> +    if (access_ok(iter_to, len)) {
>> +        from += progress;
>> +        instrument_copy_to_user(iter_to, from, len);
>> +        len = copy_mc_to_user(iter_to, from, len);
>>       }
>> -    return n;
>> +    return len;
>> +}
>> +
>> +static __always_inline
>> +size_t memcpy_to_iter_mc(void *iter_to, size_t progress,
>> +             size_t len, void *from, void *priv2)
>> +{
>> +    return copy_mc_to_kernel(iter_to, from + progress, len);
>>   }
>>   /**
>> @@ -361,22 +247,20 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
>>           return 0;
>>       if (user_backed_iter(i))
>>           might_fault();
>> -    __iterate_and_advance(i, bytes, base, len, off,
>> -        copyout_mc(base, addr + off, len),
>> -        copy_mc_to_kernel(base, addr + off, len)
>> -    )
>> -
>> -    return bytes;
>> +    return iterate_and_advance(i, bytes, (void *)addr,
>> +                   copy_to_user_iter_mc, memcpy_to_iter_mc);
>>   }
>>   EXPORT_SYMBOL_GPL(_copy_mc_to_iter);
>>   #endif /* CONFIG_ARCH_HAS_COPY_MC */
>> -static void *memcpy_from_iter(struct iov_iter *i, void *to, const void *from,
>> -                 size_t size)
>> +static size_t memcpy_from_iter_mc(void *iter_from, size_t progress,
>> +                  size_t len, void *to, void *priv2)
>>   {
>> -    if (iov_iter_is_copy_mc(i))
>> -        return (void *)copy_mc_to_kernel(to, from, size);
>> -    return memcpy(to, from, size);
>> +    struct iov_iter *iter = priv2;
>> +
>> +    if (iov_iter_is_copy_mc(iter))
>> +        return copy_mc_to_kernel(to + progress, iter_from, len);
>> +    return memcpy_from_iter(iter_from, progress, len, to, priv2);
>>   }
>>   size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
>> @@ -386,30 +270,46 @@ size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
>>       if (user_backed_iter(i))
>>           might_fault();
>> -    iterate_and_advance(i, bytes, base, len, off,
>> -        copyin(addr + off, base, len),
>> -        memcpy_from_iter(i, addr + off, base, len)
>> -    )
>> -
>> -    return bytes;
>> +    return iterate_and_advance2(i, bytes, addr, i,
>> +                    copy_from_user_iter,
>> +                    memcpy_from_iter_mc);
>>   }
>>   EXPORT_SYMBOL(_copy_from_iter);
>> +static __always_inline
>> +size_t copy_from_user_iter_nocache(void __user *iter_from, size_t progress,
>> +                   size_t len, void *to, void *priv2)
>> +{
>> +    return __copy_from_user_inatomic_nocache(to + progress, iter_from, len);
>> +}
>> +
>>   size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
>>   {
>>       if (WARN_ON_ONCE(!i->data_source))
>>           return 0;
>> -    iterate_and_advance(i, bytes, base, len, off,
>> -        __copy_from_user_inatomic_nocache(addr + off, base, len),
>> -        memcpy(addr + off, base, len)
>> -    )
>> -
>> -    return bytes;
>> +    return iterate_and_advance(i, bytes, addr,
>> +                   copy_from_user_iter_nocache,
>> +                   memcpy_from_iter);
>>   }
>>   EXPORT_SYMBOL(_copy_from_iter_nocache);
>>   #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
>> +static __always_inline
>> +size_t copy_from_user_iter_flushcache(void __user *iter_from, size_t progress,
>> +                      size_t len, void *to, void *priv2)
>> +{
>> +    return __copy_from_user_flushcache(to + progress, iter_from, len);
>> +}
>> +
>> +static __always_inline
>> +size_t memcpy_from_iter_flushcache(void *iter_from, size_t progress,
>> +                   size_t len, void *to, void *priv2)
>> +{
>> +    memcpy_flushcache(to + progress, iter_from, len);
>> +    return 0;
>> +}
>> +
>>   /**
>>    * _copy_from_iter_flushcache - write destination through cpu cache
>>    * @addr: destination kernel address
>> @@ -431,12 +331,9 @@ size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
>>       if (WARN_ON_ONCE(!i->data_source))
>>           return 0;
>> -    iterate_and_advance(i, bytes, base, len, off,
>> -        __copy_from_user_flushcache(addr + off, base, len),
>> -        memcpy_flushcache(addr + off, base, len)
>> -    )
>> -
>> -    return bytes;
>> +    return iterate_and_advance(i, bytes, addr,
>> +                   copy_from_user_iter_flushcache,
>> +                   memcpy_from_iter_flushcache);
>>   }
>>   EXPORT_SYMBOL_GPL(_copy_from_iter_flushcache);
>>   #endif
>> @@ -508,10 +405,9 @@ size_t copy_page_to_iter_nofault(struct page *page, unsigned offset, size_t byte
>>           void *kaddr = kmap_local_page(page);
>>           size_t n = min(bytes, (size_t)PAGE_SIZE - offset);
>> -        iterate_and_advance(i, n, base, len, off,
>> -            copyout_nofault(base, kaddr + offset + off, len),
>> -            memcpy(base, kaddr + offset + off, len)
>> -        )
>> +        n = iterate_and_advance(i, bytes, kaddr,
>> +                    copy_to_user_iter_nofault,
>> +                    memcpy_to_iter);
>>           kunmap_local(kaddr);
>>           res += n;
>>           bytes -= n;
>> @@ -554,14 +450,25 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
>>   }
>>   EXPORT_SYMBOL(copy_page_from_iter);
>> -size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
>> +static __always_inline
>> +size_t zero_to_user_iter(void __user *iter_to, size_t progress,
>> +             size_t len, void *priv, void *priv2)
>>   {
>> -    iterate_and_advance(i, bytes, base, len, count,
>> -        clear_user(base, len),
>> -        memset(base, 0, len)
>> -    )
>> +    return clear_user(iter_to, len);
>> +}
>> -    return bytes;
>> +static __always_inline
>> +size_t zero_to_iter(void *iter_to, size_t progress,
>> +            size_t len, void *priv, void *priv2)
>> +{
>> +    memset(iter_to, 0, len);
>> +    return 0;
>> +}
>> +
>> +size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
>> +{
>> +    return iterate_and_advance(i, bytes, NULL,
>> +                   zero_to_user_iter, zero_to_iter);
>>   }
>>   EXPORT_SYMBOL(iov_iter_zero);
>> @@ -586,10 +493,9 @@ size_t copy_page_from_iter_atomic(struct page *page, size_t offset,
>>           }
>>           p = kmap_atomic(page) + offset;
>> -        iterate_and_advance(i, n, base, len, off,
>> -            copyin(p + off, base, len),
>> -            memcpy_from_iter(i, p + off, base, len)
>> -        )
>> +        n = iterate_and_advance2(i, n, p, i,
>> +                     copy_from_user_iter,
>> +                     memcpy_from_iter_mc);
>>           kunmap_atomic(p);
>>           copied += n;
>>           offset += n;
>> @@ -1180,32 +1086,64 @@ ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i,
>>   }
>>   EXPORT_SYMBOL(iov_iter_get_pages_alloc2);
>> +static __always_inline
>> +size_t copy_from_user_iter_csum(void __user *iter_from, size_t progress,
>> +                size_t len, void *to, void *priv2)
>> +{
>> +    __wsum next, *csum = priv2;
>> +
>> +    next = csum_and_copy_from_user(iter_from, to + progress, len);
>> +    *csum = csum_block_add(*csum, next, progress);
>> +    return next ? 0 : len;
>> +}
>> +
>> +static __always_inline
>> +size_t memcpy_from_iter_csum(void *iter_from, size_t progress,
>> +                 size_t len, void *to, void *priv2)
>> +{
>> +    __wsum *csum = priv2;
>> +
>> +    *csum = csum_and_memcpy(to + progress, iter_from, len, *csum, progress);
>> +    return 0;
>> +}
>> +
>>   size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
>>                      struct iov_iter *i)
>>   {
>> -    __wsum sum, next;
>> -    sum = *csum;
>>       if (WARN_ON_ONCE(!i->data_source))
>>           return 0;
>> -
>> -    iterate_and_advance(i, bytes, base, len, off, ({
>> -        next = csum_and_copy_from_user(base, addr + off, len);
>> -        sum = csum_block_add(sum, next, off);
>> -        next ? 0 : len;
>> -    }), ({
>> -        sum = csum_and_memcpy(addr + off, base, len, sum, off);
>> -    })
>> -    )
>> -    *csum = sum;
>> -    return bytes;
>> +    return iterate_and_advance2(i, bytes, addr, csum,
>> +                    copy_from_user_iter_csum,
>> +                    memcpy_from_iter_csum);
>>   }
>>   EXPORT_SYMBOL(csum_and_copy_from_iter);
>> +static __always_inline
>> +size_t copy_to_user_iter_csum(void __user *iter_to, size_t progress,
>> +                  size_t len, void *from, void *priv2)
>> +{
>> +    __wsum next, *csum = priv2;
>> +
>> +    next = csum_and_copy_to_user(from + progress, iter_to, len);
>> +    *csum = csum_block_add(*csum, next, progress);
>> +    return next ? 0 : len;
>> +}
>> +
>> +static __always_inline
>> +size_t memcpy_to_iter_csum(void *iter_to, size_t progress,
>> +               size_t len, void *from, void *priv2)
>> +{
>> +    __wsum *csum = priv2;
>> +
>> +    *csum = csum_and_memcpy(iter_to, from + progress, len, *csum, progress);
>> +    return 0;
>> +}
>> +
>>   size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate,
>>                    struct iov_iter *i)
>>   {
>>       struct csum_state *csstate = _csstate;
>> -    __wsum sum, next;
>> +    __wsum sum;
>>       if (WARN_ON_ONCE(i->data_source))
>>           return 0;
>> @@ -1219,14 +1157,10 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate,
>>       }
>>       sum = csum_shift(csstate->csum, csstate->off);
>> -    iterate_and_advance(i, bytes, base, len, off, ({
>> -        next = csum_and_copy_to_user(addr + off, base, len);
>> -        sum = csum_block_add(sum, next, off);
>> -        next ? 0 : len;
>> -    }), ({
>> -        sum = csum_and_memcpy(base, addr + off, len, sum, off);
>> -    })
>> -    )
>> +
>> +    bytes = iterate_and_advance2(i, bytes, (void *)addr, &sum,
>> +                     copy_to_user_iter_csum,
>> +                     memcpy_to_iter_csum);
>>       csstate->csum = csum_shift(sum, csstate->off);
>>       csstate->off += bytes;
>>       return bytes;
>>
>>
