Message-ID: <db84f6af-ed8e-4fda-8491-f4b2ba90842b@kernel.dk>
Date: Wed, 16 Apr 2025 16:42:17 -0600
From: Jens Axboe <axboe@...nel.dk>
To: Pavel Begunkov <asml.silence@...il.com>,
 Nitesh Shetty <nitheshshetty@...il.com>
Cc: Nitesh Shetty <nj.shetty@...sung.com>, gost.dev@...sung.com,
 io-uring@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] io_uring/rsrc: send exact nr_segs for fixed buffer

On 4/16/25 4:23 PM, Jens Axboe wrote:
>>>> Should we just make it saner first? Something like these 3
>>>> completely untested commits
>>>>
>>>> https://github.com/isilence/linux/commits/rsrc-import-cleanup/
>>>>
>>>> And then it'll become
>>>>
>>>> nr_segs = ALIGN(offset + len, 1UL << folio_shift) >> folio_shift;
>>>
>>> Let's please do that, it's certainly an improvement. Care to send this
>>> out? I can toss them into testing. And we'd still need that last patch
>>> to
>>
>> I need to test it first, perhaps tomorrow
> 
> Sounds good, I'll run it through testing here too. Would be nice to
> get this in for -rc3, it's pretty minimal and honestly makes the code
> much easier to read and reason about.
> 
>>> ensure the segment count is correct. Honestly somewhat surprised that
>>
>> Right, I can pick up Nitesh's patch for that.
> 
> Sounds good.
> 
>>> the only odd fallout of that is (needlessly) hitting the bio split path.
>>
>> It's perfectly correct from the iter standpoint; AFAIK the length
>> and the number of segments don't have to match. Though I am surprised
>> it causes perf issues in the split path.
> 
> Theoretically it is, but it always makes me a bit nervous, as there are
> some _really_ odd iov_iter use cases out there. And passing down a
> known-wrong segment count is pretty wonky.
> 
>> Btw, where exactly does it stumble in there? I'd assume we don't
> 
> Because segments != 1, and then that hits the slower path.
> 
>> need to do the segment correction for kbuf as the bio splitting
>> can do it (and probably does) in exactly the same way?
> 
> It doesn't strictly need to, but we should handle that case too. That'd
> basically just be the loop addition I already did, something à la the
> below on top for both of them:

Made a silly typo in the last patch (updated below), but with that
fixed, I tested your 3 patches with that one on top, and they pass both
the liburing tests and the ublk kselftests (which do exercise kbuf
imports). I also tested the segment counting with a separate test case,
and it looks good as well.
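
As a quick sanity check on the segment count math in the last hunk
below, here's a tiny userspace illustration with made-up values (this
is just the arithmetic, not kernel code):

/*
 * offset 100 and len 8192 with 4K folios cover bytes [100, 8292),
 * which spans folios 0, 1, and 2, so nr_segs should come out to 3.
 */
#include <stdio.h>

#define ALIGN(x, a)	(((x) + (a) - 1) & ~((a) - 1))

int main(void)
{
	unsigned long folio_shift = 12;		/* 4K folios */
	unsigned long offset = 100, len = 8192;
	unsigned long nr_segs;

	nr_segs = ALIGN(offset + len, 1UL << folio_shift) >> folio_shift;
	printf("nr_segs = %lu\n", nr_segs);	/* prints 3 */
	return 0;
}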


diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index d8fa7158e598..7abc96b9260d 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -1032,6 +1032,26 @@ static int validate_fixed_range(u64 buf_addr, size_t len,
 	return 0;
 }
 
+static int io_import_kbuf(int ddir, struct iov_iter *iter,
+			  struct io_mapped_ubuf *imu, size_t len, size_t offset)
+{
+	size_t count = len + offset;
+
+	iov_iter_bvec(iter, ddir, imu->bvec, imu->nr_bvecs, count);
+	iov_iter_advance(iter, offset);
+
+	if (count < imu->len) {
+		const struct bio_vec *bvec = iter->bvec;
+
+		while (len > bvec->bv_len) {
+			len -= bvec->bv_len;
+			bvec++;
+		}
+		iter->nr_segs = 1 + bvec - iter->bvec;
+	}
+	return 0;
+}
+
 static int io_import_fixed(int ddir, struct iov_iter *iter,
 			   struct io_mapped_ubuf *imu,
 			   u64 buf_addr, size_t len)
@@ -1054,13 +1074,8 @@ static int io_import_fixed(int ddir, struct iov_iter *iter,
 	 * and advance us to the beginning.
 	 */
 	offset = buf_addr - imu->ubuf;
-	bvec = imu->bvec;
-
-	if (imu->is_kbuf) {
-		iov_iter_bvec(iter, ddir, bvec, imu->nr_bvecs, offset + len);
-		iov_iter_advance(iter, offset);
-		return 0;
-	}
+	if (imu->is_kbuf)
+		return io_import_kbuf(ddir, iter, imu, len, offset);
 
 	/*
 	 * Don't use iov_iter_advance() here, as it's really slow for
@@ -1083,7 +1098,7 @@ static int io_import_fixed(int ddir, struct iov_iter *iter,
 	 * have the size property of user registered ones, so we have
 	 * to use the slow iter advance.
 	 */
-
+	bvec = imu->bvec;
 	if (offset >= bvec->bv_len) {
 		unsigned long seg_skip;
 
@@ -1094,7 +1109,7 @@ static int io_import_fixed(int ddir, struct iov_iter *iter,
 		offset &= (1UL << imu->folio_shift) - 1;
 	}
 
-	nr_segs = imu->nr_bvecs - (bvec - imu->bvec);
+	nr_segs = ALIGN(offset + len, 1UL << imu->folio_shift) >> imu->folio_shift;
 	iov_iter_bvec(iter, ddir, bvec, nr_segs, len);
 	iter->iov_offset = offset;
 	return 0;
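
And for anyone who wants to poke at this themselves, below is a rough
sketch of the kind of liburing test that exercises a fixed buffer
import at an unaligned offset and length. To be clear, this is not the
separate test case mentioned above; the file name, sizes, and offsets
are made up for illustration.

/* build with: gcc -O2 test-fixed-seg.c -o test-fixed-seg -luring */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/uio.h>
#include <liburing.h>

int main(void)
{
	struct io_uring ring;
	struct io_uring_sqe *sqe;
	struct io_uring_cqe *cqe;
	struct iovec iov;
	void *buf;
	int fd, ret;

	/* 64K registered buffer, page aligned */
	if (posix_memalign(&buf, 4096, 65536))
		return 1;
	iov.iov_base = buf;
	iov.iov_len = 65536;

	fd = open("testfile", O_RDONLY);
	if (fd < 0)
		return 1;
	if (io_uring_queue_init(8, &ring, 0))
		return 1;
	if (io_uring_register_buffers(&ring, &iov, 1))
		return 1;

	/*
	 * Read into the middle of the registered buffer with a length
	 * that stops short of its end: with the fix, the import should
	 * pass down only the segments the range actually covers instead
	 * of nr_bvecs, so the read no longer hits the bio split path.
	 */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_read_fixed(sqe, fd, (char *)buf + 100, 8192, 0, 0);
	io_uring_submit(&ring);

	ret = io_uring_wait_cqe(&ring, &cqe);
	if (ret < 0 || cqe->res < 0) {
		fprintf(stderr, "read failed: %d\n", ret < 0 ? ret : cqe->res);
		return 1;
	}
	io_uring_cqe_seen(&ring, cqe);
	io_uring_queue_exit(&ring);
	close(fd);
	free(buf);
	return 0;
}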

-- 
Jens Axboe
