Date:   Wed, 25 Oct 2023 17:22:52 +0800
From:   <ed.tsai@...iatek.com>
To:     Jens Axboe <axboe@...nel.dk>,
        Matthias Brugger <matthias.bgg@...il.com>,
        AngeloGioacchino Del Regno 
        <angelogioacchino.delregno@...labora.com>
CC:     <wsd_upstream@...iatek.com>, <stanley.chu@...iatek.com>,
        <peter.wang@...iatek.com>, <alice.chao@...iatek.com>,
        <powen.kao@...iatek.com>, <naomi.chu@...iatek.com>,
        <will.shiu@...iatek.com>, <chun-hung.wu@...iatek.com>,
        <casper.li@...iatek.com>, Ed Tsai <ed.tsai@...iatek.com>,
        <linux-block@...r.kernel.org>, <linux-kernel@...r.kernel.org>,
        <linux-arm-kernel@...ts.infradead.org>,
        <linux-mediatek@...ts.infradead.org>
Subject: [PATCH 1/1] block: Check the queue limit before bio submitting

From: Ed Tsai <ed.tsai@...iatek.com>

Since commit 07173c3ec276 ("block: enable multipage bvecs"), a
bio_vec can hold more than one page, so a bio can exceed 1MB in
size and end up misaligned with the queue limit.

In a sequential read/write scenario, the file system maximizes the
bio's capacity before submitting. However, misalignment with the
queue limit can result in the bio being split into smaller I/O
operations.

For instance, assume the maximum I/O size is set to 512KB and the
memory is highly fragmented, so that each bio contains only one
2-page bio_vec (i.e., bi_size = 1028KB). Such a bio is split into
two 512KB portions and one 4KB portion. As a result, the expected
stream of large contiguous I/O operations is interspersed with many
small ones.
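
The split arithmetic in this example can be checked with a small
user-space sketch. It is not kernel code: split_by_max_io() is a
hypothetical helper that only models a size-based split and ignores
segment counts, alignment and the other limits the block layer applies.

```c
#include <assert.h>
#include <stddef.h>

#define KB 1024u

/*
 * Hypothetical helper, not a kernel function: cut a bio of bi_size bytes
 * into pieces of at most max_io bytes, as a purely size-based split would.
 * Stores up to cap piece sizes in out[] and returns the piece count.
 */
static size_t split_by_max_io(size_t bi_size, size_t max_io,
			      size_t *out, size_t cap)
{
	size_t n = 0;

	assert(max_io > 0);
	while (bi_size > 0) {
		/* Each piece is capped at the maximum I/O size. */
		size_t piece = bi_size < max_io ? bi_size : max_io;

		if (n < cap)
			out[n] = piece;
		n++;
		bi_size -= piece;
	}
	return n;
}
```

Calling split_by_max_io(1028 * KB, 512 * KB, out, 4) yields three
pieces of 512KB, 512KB and 4KB; the trailing 4KB piece is the small
I/O that interrupts the sequential stream.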

To address this issue, this patch checks the bio size against the
queue's max_sectors limit while pages are being added, before the bio
is submitted. This allows the upper layers to proactively detect and
handle alignment issues.
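
The effect of such a check can be modelled in user space. The sketch
below is an illustration under assumptions, not the kernel
implementation: struct model_bio and model_add_segments() are
hypothetical names, and only the size comparison against
max_sectors << SECTOR_SHIFT mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors, as in the kernel */

/* Simplified stand-in for the relevant bio and queue_limits fields. */
struct model_bio {
	size_t bi_size;			/* bytes accumulated so far */
	unsigned int max_sectors;	/* queue limit, in sectors */
};

/*
 * Add segments of up to seg_len bytes until 'left' bytes are consumed or
 * the next segment would push the bio past the queue limit.
 * Returns the number of bytes that were not added.
 */
static size_t model_add_segments(struct model_bio *bio, size_t left,
				 size_t seg_len)
{
	/* Widen before shifting so a large limit cannot overflow 32 bits. */
	size_t limit = (size_t)bio->max_sectors << SECTOR_SHIFT;

	assert(seg_len > 0);
	while (left > 0) {
		size_t len = left < seg_len ? left : seg_len;

		if (bio->bi_size + len > limit)
			break;	/* stop early; the rest goes into a new bio */
		bio->bi_size += len;
		left -= len;
	}
	return left;
}
```

With max_sectors = 1024 (512KB) and 1028KB of data added in 4KB
segments, the model stops at exactly 512KB and reports 516KB left
over, so the bio is already aligned with the limit and needs no
later split. The widening cast is a choice made in this sketch.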

I ran the Antutu V10 Storage Test on a UFS 4.0 device. The Sequential
test improved significantly, by about 16% for reads and 26% for writes:

Sequential Read (average of 5 rounds):
Original: 3033.7 MB/sec
Patched: 3520.9 MB/sec

Sequential Write (average of 5 rounds):
Original: 2225.4 MB/sec
Patched: 2800.3 MB/sec

Signed-off-by: Ed Tsai <ed.tsai@...iatek.com>
---
 block/bio.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index 816d412c06e9..a4a1f775b9ea 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1227,6 +1227,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	iov_iter_extraction_t extraction_flags = 0;
 	unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt;
 	unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt;
+	struct queue_limits *lim = &bdev_get_queue(bio->bi_bdev)->limits;
 	struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt;
 	struct page **pages = (struct page **)bv;
 	ssize_t size, left;
@@ -1275,6 +1276,11 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		struct page *page = pages[i];
 
 		len = min_t(size_t, PAGE_SIZE - offset, left);
+		if (bio->bi_iter.bi_size + len >
+		    lim->max_sectors << SECTOR_SHIFT) {
+			ret = left;
+			break;
+		}
 		if (bio_op(bio) == REQ_OP_ZONE_APPEND) {
 			ret = bio_iov_add_zone_append_page(bio, page, len,
 					offset);
-- 
2.18.0
