lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 13 Apr 2017 22:45:10 +0800
From:   Ming Lei <ming.lei@...hat.com>
To:     Johannes Thumshirn <jthumshirn@...e.de>
Cc:     Jens Axboe <axboe@...com>, Omar Sandoval <osandov@...ndov.com>,
        Bart Van Assche <Bart.VanAssche@...disk.com>,
        Hannes Reinecke <hare@...e.de>, Christoph Hellwig <hch@....de>,
        Linux Block Layer Mailinglist <linux-block@...r.kernel.org>,
        Linux Kernel Mailinglist <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] block: bios with an offset are always gappy

On Thu, Apr 13, 2017 at 01:53:28PM +0200, Johannes Thumshirn wrote:
> On Thu, Apr 13, 2017 at 06:02:21PM +0800, Ming Lei wrote:
> > On Thu, Apr 13, 2017 at 10:06:29AM +0200, Johannes Thumshirn wrote:
> > > Doing a mkfs.btrfs on a (qemu emulated) PCIe NVMe causes a kernel panic
> > > in nvme_setup_prps() because the dma_len will drop below zero but the
> > > length not.
> > 
> > Looks I can't reproduce the issue in QEMU(32G nvme, either partitioned
> > or not, just use 'mkfs.btrfs /dev/nvme0n1p1'), could you share the exact
> > mkfs command line and size of your emulated NVMe?
> 
> the exact cmdline is mkfs.btrfs -f /dev/nvme0n1p1 (-f because there was a
> existing btrfs on the image). The image is 17179869184 (a.k.a 16G) bytes.
> 
> [...]
> 
> > Could you try the following patch to see if it fixes your issue?
> 
> It's back to the old, erratic behaviour, see log below.

Johannes, could you test the following patch?

Thanks
Ming

---

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b75d6fe5a1b9..cc68dfaef528 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1672,12 +1672,27 @@ static inline bool bios_segs_mergeable(struct request_queue *q,
 	return true;
 }
 
-static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
-			 struct bio *next)
+static inline bool bio_will_gap(struct request_queue *q,
+				struct request *prev_rq,
+				struct bio *prev,
+				struct bio *next)
 {
 	if (bio_has_data(prev) && queue_virt_boundary(q)) {
 		struct bio_vec pb, nb;
 
+		/*
+		 * don't merge if the 1st bio starts with non-zero
+		 * offset, otherwise it is quite difficult to respect
+		 * sg gap limit. We work hard to merge huge of small
+		 * bios in case of mkfs.
+		 */
+		if (prev_rq)
+			bio_get_first_bvec(prev_rq->bio, &pb);
+		else
+			bio_get_first_bvec(prev, &pb);
+		if (pb.bv_offset)
+			return true;
+
 		bio_get_last_bvec(prev, &pb);
 		bio_get_first_bvec(next, &nb);
 
@@ -1690,12 +1705,12 @@ static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 
 static inline bool req_gap_back_merge(struct request *req, struct bio *bio)
 {
-	return bio_will_gap(req->q, req->biotail, bio);
+	return bio_will_gap(req->q, req, req->biotail, bio);
 }
 
 static inline bool req_gap_front_merge(struct request *req, struct bio *bio)
 {
-	return bio_will_gap(req->q, bio, req->bio);
+	return bio_will_gap(req->q, NULL, bio, req->bio);
 }
 
 static inline void blk_dump_rq(const struct request *req, const char *msg)

Powered by blists - more mailing lists