Message-ID: <CADUfDZo4Kbkodz3w-BRsSOEwTGeEQeb-yppmMNY5-ipG33B2qg@mail.gmail.com>
Date: Tue, 16 Dec 2025 21:33:39 -0800
From: Caleb Sander Mateos <csander@...estorage.com>
To: huang-jl <huang-jl@...pseek.com>
Cc: io-uring@...r.kernel.org, axboe@...nel.dk, ming.lei@...hat.com, 
	linux-block@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 01/01] io_uring: fix nr_segs calculation in io_import_kbuf

On Tue, Dec 16, 2025 at 8:02 PM huang-jl <huang-jl@...pseek.com> wrote:
>
> io_import_kbuf() calculates nr_segs incorrectly when iov_offset is
> non-zero after iov_iter_advance(). It doesn't account for the partial
> consumption of the first bvec.
>
> The problem occurs when the following conditions are met:
> 1. The UBLK_F_AUTO_BUF_REG feature of ublk is in use.
> 2. The kernel registers the request buffer into the io_uring on the
>    ublk server's behalf.
> 3. Later, the ublk server submits an IO request that uses the
>    registered buffer in the io_uring to read from or write to a
>    FUSE-based filesystem with O_DIRECT.
>
> From a userspace perspective, the ublk server thread is blocked in the
> kernel, and a "soft lockup" is reported in the kernel log (dmesg).
>
> When ublk registers a buffer with mixed-size bvecs like [4K]*6 + [12K]
> and a request partially consumes a bvec, the next request's nr_segs
> calculation uses bvec->bv_len instead of (bv_len - iov_offset).
>
> This causes fuse_get_user_pages() to loop forever because nr_segs
> covers fewer segments than the request actually needs.
>
> Specifically, the infinite loop happens at:
> fuse_get_user_pages()
>   -> iov_iter_extract_pages()
>     -> iov_iter_extract_bvec_pages()
> Since nr_segs is miscalculated, iov_iter_extract_bvec_pages() returns
> as soon as it finds that i->nr_segs is zero, and
> iov_iter_extract_pages() then returns zero. However,
> fuse_get_user_pages() still has not obtained enough data/pages, so it
> loops forever.
>
> Example:
>   - Bvecs: [4K, 4K, 4K, 4K, 4K, 4K, 12K, ...]
>   - Request 1: 32K at offset 0, uses 6*4K + 8K of the 12K bvec
>   - Request 2: 32K at offset 32K
>     - iov_offset = 8K (8K already consumed from 12K bvec)
>     - Bug: calculates using 12K, not (12K - 8K) = 4K
>     - Result: nr_segs too small, infinite loop in fuse_get_user_pages.
>
> Fix by accounting for iov_offset when calculating the first segment's
> available length.
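
The arithmetic in the example above can be checked with a small userspace model. This is only a sketch, not kernel code: model_nr_segs(), KB and example_segs are hypothetical names, the array starts at the 12K bvec the iterator points at after iov_iter_advance(), and the bvecs elided as "..." in the example are assumed to be 4K each purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define KB 1024u

/*
 * Assumed layout, starting at the partially consumed 12K bvec.
 * The trailing 4K entries are an assumption; the original example
 * elides them.
 */
static const size_t example_segs[] = {
    12 * KB, 4 * KB, 4 * KB, 4 * KB, 4 * KB, 4 * KB, 4 * KB, 4 * KB,
};

/*
 * Count the segments needed to cover `len` bytes when the iterator sits
 * `iov_offset` bytes into seg_len[0]. With account_offset == 0 the first
 * segment is treated as fully available, which models the reported bug.
 * The caller must pass enough segments to cover the request.
 */
static size_t model_nr_segs(const size_t *seg_len, size_t n,
                            size_t iov_offset, size_t len,
                            int account_offset)
{
    size_t i = 0;
    size_t avail;

    assert(n > 0 && iov_offset < seg_len[0]);
    avail = account_offset ? seg_len[0] - iov_offset : seg_len[0];

    while (len > avail) {
        len -= avail;
        assert(i + 1 < n);      /* ran out of segments in the model */
        i++;
        avail = seg_len[i];
    }
    return i + 1;
}
```

Under these assumptions, calling model_nr_segs(example_segs, 8, 8 * KB, 32 * KB, 0) gives 6 segments, while passing 1 as the last argument gives 8. Six segments cover only 4K + 5 * 4K = 24K of the 32K request, which matches the "nr_segs too small" result described above.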

Please add a Fixes tag.

>
> Signed-off-by: huang-jl <huang-jl@...pseek.com>
> ---
>  io_uring/rsrc.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
> index a63474b33..4eca0c18c 100644
> --- a/io_uring/rsrc.c
> +++ b/io_uring/rsrc.c
> @@ -1058,6 +1058,14 @@ static int io_import_kbuf(int ddir, struct iov_iter *iter,
>
>         if (count < imu->len) {
>                 const struct bio_vec *bvec = iter->bvec;
> +               size_t first_seg_len = bvec->bv_len - iter->iov_offset;
> +
> +               if (len <= first_seg_len) {
> +                       iter->nr_segs = 1;
> +                       return 0;
> +               }
> +               len -= first_seg_len;
> +               bvec++;

Would a simpler fix be just to add a len += iter->iov_offset before the loop?
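
For comparison, here is a userspace sketch of both approaches. It is not the kernel code: nr_segs_patch() and nr_segs_add_offset() are hypothetical helpers that only mirror the shape of the two variants over a plain array of segment lengths, and the caller is assumed to pass enough segments to cover the request.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Shape of the posted patch: handle the partially consumed first
 * segment separately, then walk the remaining segments.
 */
static size_t nr_segs_patch(const size_t *seg_len, size_t iov_offset,
                            size_t len)
{
    size_t i = 0;
    size_t first_seg_len;

    assert(iov_offset < seg_len[0]);
    first_seg_len = seg_len[0] - iov_offset;

    if (len <= first_seg_len)
        return 1;
    len -= first_seg_len;
    i++;

    while (len > seg_len[i]) {
        len -= seg_len[i];
        i++;
    }
    return i + 1;
}

/*
 * Shape of the suggested alternative: add the consumed bytes back to
 * len, then walk from the start of the first segment unchanged.
 */
static size_t nr_segs_add_offset(const size_t *seg_len, size_t iov_offset,
                                 size_t len)
{
    size_t i = 0;

    assert(iov_offset < seg_len[0]);
    len += iov_offset;

    while (len > seg_len[i]) {
        len -= seg_len[i];
        i++;
    }
    return i + 1;
}
```

In this model the two agree because len + iov_offset > bv_len holds exactly when len > bv_len - iov_offset, and after the first step both are left with the same remaining length.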

Best,
Caleb

>
>                 while (len > bvec->bv_len) {
>                         len -= bvec->bv_len;
> --
> 2.43.0
>
>
