Message-ID: <aTj-8-_tHHY7q5C0@kbusch-mbp>
Date: Wed, 10 Dec 2025 14:02:43 +0900
From: Keith Busch <kbusch@...nel.org>
To: Sebastian Ott <sebott@...hat.com>
Cc: linux-nvme@...ts.infradead.org, iommu@...ts.linux.dev,
	linux-block@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-xfs@...r.kernel.org, Jens Axboe <axboe@...com>,
	Christoph Hellwig <hch@....de>, Will Deacon <will@...nel.org>,
	Robin Murphy <robin.murphy@....com>,
	Carlos Maiolino <cem@...nel.org>
Subject: Re: WARNING: drivers/iommu/io-pgtable-arm.c:639

On Tue, Dec 09, 2025 at 12:43:31PM +0100, Sebastian Ott wrote:
> got the following warning after a kernel update on Thursday, leading to a
> panic and fs corruption. I didn't capture the first warning but I'm pretty
> sure it was the same. It's reproducible but I didn't bisect since it
> borked my fs. The only hint I can give is that v6.18 worked. Is this a
> known issue? Anything I should try?

Could you check whether your nvme device supports SGLs? 6.19 has some new
features that allow merging IO in ways that couldn't happen before. You
can check from the command line:

  # nvme id-ctrl /dev/nvme0 | grep sgl

Replace "nvme0" with whatever your instance was named if it's not using
the 0 suffix.
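If it helps with interpreting the output: per the NVMe spec, bits 1:0 of
the Identify Controller SGLS field encode the basic SGL support mode. A
minimal decoder, just as a sketch (this is not part of nvme-cli):

```c
#include <stdint.h>

/*
 * Decode bits 1:0 of the NVMe Identify Controller SGLS field:
 *   00b - SGLs not supported
 *   01b - SGLs supported, no alignment/granularity requirement
 *   10b - SGLs supported, dword alignment/granularity required
 */
static const char *sgl_mode(uint32_t sgls)
{
	switch (sgls & 0x3) {
	case 0:
		return "not supported";
	case 1:
		return "supported";
	case 2:
		return "supported (dword-aligned)";
	default:
		return "reserved";
	}
}
```

So a raw sgls value with the low bits clear means the controller can't do
SGLs at all and the driver has to fall back to PRPs.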

What I think happened is this: at one point you had an IO that could be
coalesced in IOVA space, and after that request completed, its context was
later reused. The new request merged bios that could not be coalesced, but
we never reinitialize the iova state, so the mapping path keeps using the
stale context from the earlier request. If that is what's happening, here's
a quick fix:

---
diff --git a/block/blk-mq-dma.c b/block/blk-mq-dma.c
index e9108ccaf4b06..7bff480d666e2 100644
--- a/block/blk-mq-dma.c
+++ b/block/blk-mq-dma.c
@@ -199,6 +199,7 @@ static bool blk_dma_map_iter_start(struct request *req, struct device *dma_dev,
 	if (blk_can_dma_map_iova(req, dma_dev) &&
 	    dma_iova_try_alloc(dma_dev, state, vec.paddr, total_len))
 		return blk_rq_dma_map_iova(req, dma_dev, state, iter, &vec);
+	state->__size = 0;
 	return blk_dma_map_direct(req, dma_dev, iter, &vec);
 }

--
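For what it's worth, the reason zeroing __size is sufficient (assuming
dma_use_iova() still just tests __size, as in current mainline
dma-mapping.h) can be sketched in userspace like this; the struct here is
a simplified stand-in, not the real kernel type:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct dma_iova_state: only the
 * field that matters for path selection is modeled here. */
struct dma_iova_state {
	size_t __size;
};

/* Mirrors the kernel's dma_use_iova(): a non-zero __size means a prior
 * dma_iova_try_alloc() succeeded and the IOVA path should be taken. */
static bool dma_use_iova(const struct dma_iova_state *state)
{
	return state->__size != 0;
}

/* Without the fix, a reused request that falls back to direct mapping
 * still carries the old __size, so dma_use_iova() wrongly reports the
 * IOVA path. Zeroing __size before blk_dma_map_direct() restores the
 * invariant. */
static void reset_iova_state(struct dma_iova_state *state)
{
	state->__size = 0;
}
```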
