Message-ID: <20241114113803.3571128-1-panikiel@google.com>
Date: Thu, 14 Nov 2024 11:38:03 +0000
From: "Paweł Anikiel" <panikiel@...gle.com>
To: bob.beckett@...labora.com
Cc: axboe@...nel.dk, hch@....de, kbusch@...nel.org, kernel@...labora.com, 
	linux-kernel@...r.kernel.org, linux-nvme@...ts.infradead.org, 
	sagi@...mberg.me
Subject: Re: [PATCH] nvme-pci: 512 byte aligned dma pool segment quirk

Hi all,

I've been tracking down an issue that seems to be related (identical?) to
this one, and I would like to propose a different fix.

I have a device with the aforementioned NVMe-eMMC bridge, and I was
experiencing nvme read timeouts after updating the kernel from 5.15 to
6.6. Doing a kernel bisect, I arrived at the same dma pool commit as
Robert in the original thread.

After trying out some changes in the nvme-pci driver, I came up with the
same fix as in this thread: change the alignment of the small pool to
512. However, I wanted to get a deeper understanding of what's going on.

After a lot of analysis, I found out why the nvme timeouts were happening:
The bridge incorrectly implements PRP list chaining.

When doing a read of exactly 264 sectors and allocating a PRP list at
offset 0xf00, the last PRP entry in that list lies right before a page
boundary. The bridge incorrectly (?) assumes that it's a pointer to a
chained PRP list, tries to do a DMA to address 0x0, gets a bus error,
and crashes.
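
To make the numbers concrete, here is the arithmetic for that case as a
small userspace sketch (assuming a 4 KiB page size and a page-aligned
data buffer; the constants are illustrative, not taken from the driver):

#include <stdio.h>

int main(void)
{
	const unsigned int sector  = 512;
	const unsigned int page    = 4096;
	const unsigned int sectors = 264;	/* the problematic transfer size */
	const unsigned int offset  = 0xf00;	/* PRP list offset within its page */

	unsigned int bytes   = sectors * sector;	/* 135168 = 33 * 4096 */
	unsigned int pages   = bytes / page;		/* 33 pages of data */
	unsigned int entries = pages - 1;	/* PRP1 covers the first page; 32 list entries remain */
	unsigned int last    = offset + (entries - 1) * 8;

	/* last == 0xff8: the final 8-byte slot before the page boundary,
	 * exactly where the bridge expects a chain pointer. */
	printf("bytes=%u pages=%u entries=%u last_entry_offset=0x%x\n",
	       bytes, pages, entries, last);
	return 0;
}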

When doing a write of 264 sectors with a PRP list offset of 0xf00,
the bridge treats data as a pointer and writes incorrect data to
the drive. This might be why Robert is experiencing fs corruption.

So if my findings are right, the correct quirk would be "don't let PRP
lists end on a page boundary".
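
As a rough illustration of that condition, here is a hypothetical helper
(not code from the driver), assuming 4 KiB pages and 8-byte PRP entries:

#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: given a PRP list's offset within its pool page and
 * the number of entries it will hold, return true if the last entry lands
 * in the final 8-byte slot of the page - the slot this bridge always
 * interprets as a chain pointer.
 */
static bool prp_list_ends_on_page_boundary(unsigned int offset,
					   unsigned int nr_entries)
{
	unsigned int end = offset + nr_entries * sizeof(uint64_t);

	return (end & (4096 - 1)) == 0;	/* assumes a 4 KiB page */
}

For offset 0xf00 with 32 entries this returns true; with the current
256-byte alignment of the small pool, 0xf00 is the only offset that can
trip it.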

Changing the small dma pool alignment to 512 happens to fix the issue
because it never allocates a PRP list with offset 0xf00. Theoretically,
the issue could still happen with the page pool, but this bridge has
a max transfer size of 64 pages, which is not enough to fill an entire
page-sized PRP list.
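
For reference, here is roughly what the alignment change amounts to in
the small-pool setup (a sketch from memory; the pool name and field
follow my reading of nvme-pci, so treat the exact call as illustrative
rather than the final patch):

	/*
	 * Small PRP-list pool: 256-byte chunks, but aligned to 512 instead
	 * of 256. Within a 4 KiB page the chunks then start at 0x000,
	 * 0x200, ..., 0xe00, so a list can never start at 0xf00 and
	 * therefore never ends flush with the page boundary.
	 */
	dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
					      256, 512, 0);
	if (!dev->prp_small_pool)
		return -ENOMEM;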

Robert, could you check whether the fs corruption happens only after
transfers of 257-264 sectors with a PRP list offset of 0xf00? That would
confirm my theory.
