Message-ID: <Z1i0d5Ht8zUHhSu-@kbusch-mbp.dhcp.thefacebook.com>
Date: Tue, 10 Dec 2024 14:36:55 -0700
From: Keith Busch <kbusch@...nel.org>
To: Pawel Anikiel <panikiel@...gle.com>
Cc: Robert Beckett <bob.beckett@...labora.com>, axboe <axboe@...nel.dk>,
hch <hch@....de>, kernel <kernel@...labora.com>,
linux-kernel <linux-kernel@...r.kernel.org>,
linux-nvme <linux-nvme@...ts.infradead.org>,
sagi <sagi@...mberg.me>
Subject: Re: [PATCH] nvme-pci: 512 byte aligned dma pool segment quirk
On Mon, Dec 09, 2024 at 04:33:01PM +0100, Paweł Anikiel wrote:
> On Mon, Dec 9, 2024 at 1:33 PM Robert Beckett <bob.beckett@...labora.com> wrote:
> > [...]
> > I have no further updates on this. I have received no further info from the vendor.
> > I think we can go ahead and use the alignment patch as is. The only outstanding question was whether it is an
> > implicit last-entry-per-page chain vs a simple alignment requirement. Either way, using the dmapool
> > alignment fixes both of these potential causes, so we should just take it as is.
> > If we ever get any better info and can do a more specific patch in future, we can rework it then.
>
> I think the 512 byte alignment fix is good. I tried coming up with
> something more specific, but everything I could think of was either
> too complicated or artificial.
>
> Regarding the question of whether this is an alignment requirement or
> the last PRP entry issue, I strongly believe it's the latter. I have a
> piece of code that clearly demonstrates the hardware bug when run on a
> device with the nvme bridge. I would really appreciate it if this was
> verified and my explanation was included in the patch.
I've pushed this to nvme-6.13 with an updated commit message here:
https://git.infradead.org/?p=nvme.git;a=commitdiff;h=ccd84b8d6f4a60626dacb933b5d56dadca75c0ca
I can force an update if you have any edit suggestions.
Commit message copied below:
Author: Robert Beckett <bob.beckett@...labora.com>
nvme-pci: 512 byte aligned dma pool segment quirk
We initially introduced a quick fix limiting the queue depth to 1 as
experimentation showed that it fixed data corruption on 64GB Steam Decks.
Further experimentation revealed corruption only happens when the last
PRP data element aligns to the end of the page boundary. The device
appears to treat this as a PRP chain to a new list instead of the data
element that it actually is. This implementation is in violation of the
spec. Encountering this errata with the Linux driver requires the host
to request a 128k transfer and coincidentally get the last small pool
dma buffer within a page.
The QD1 quirk effectively works around this because the last data PRP
was always at a 248 byte offset from the page start, so it never
appeared at the end of the page. Further, the MDTS is also small enough
that the "large" prp pool can hold enough PRP elements to never get to
the end, so that pool is not a problem either.
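
To spell out the arithmetic, here is a small standalone sketch of how I
read the failing case. It assumes a 4k controller page, 8 byte PRP
entries, and the driver's 256 byte small pool segments; the constant
names are illustrative rather than the driver's own, and the
non-page-aligned buffer is my reading of how a 128k transfer ends up
needing all 32 list slots:

#include <assert.h>
#include <stdio.h>

#define CTRL_PAGE_SIZE	4096u	/* controller memory page size (MPS) */
#define PRP_ENTRY_SIZE	8u	/* each PRP entry is a 64-bit pointer */
#define SMALL_SEG_SIZE	256u	/* nvme-pci small "prp list 256" segment */

int main(void)
{
	/*
	 * A 128k transfer whose buffer starts at a non-zero page offset
	 * spans 33 controller pages: PRP1 covers the first partial page,
	 * leaving 32 list entries -- exactly one full small pool segment.
	 */
	unsigned int list_bytes = 32 * PRP_ENTRY_SIZE;
	assert(list_bytes == SMALL_SEG_SIZE);

	/*
	 * With 256 byte pool alignment the last segment in a pool page
	 * starts at offset 3840, so the 32nd (data) entry lands in the
	 * final 8 bytes of the page -- the slot the device misreads as
	 * a chain pointer.
	 */
	unsigned int worst_seg_off = CTRL_PAGE_SIZE - SMALL_SEG_SIZE;
	unsigned int last_entry_off = worst_seg_off + list_bytes - PRP_ENTRY_SIZE;
	printf("buggy case: last entry at bytes %u..%u of the pool page\n",
	       last_entry_off, last_entry_off + PRP_ENTRY_SIZE - 1);

	/*
	 * Under the QD1 quirk the segment in practice sat at offset 0,
	 * so the last entry occupied bytes 248..255, well clear of the
	 * boundary; 512 byte alignment gives the same guarantee for
	 * every segment (the last possible start is offset 3584).
	 */
	return 0;
}
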
Introduce a new quirk to ensure the small pool is always aligned such
that the last PRP element can't appear at the end of the page. This comes
at the expense of wasting 256 bytes per small pool page allocated.
Link: https://lore.kernel.org/linux-nvme/20241113043151.GA20077@lst.de/T/#u
Fixes: 83bdfcbdbe5d ("nvme-pci: qdepth 1 quirk")
Cc: Paweł Anikiel <panikiel@...gle.com>
Signed-off-by: Robert Beckett <bob.beckett@...labora.com>
Signed-off-by: Keith Busch <kbusch@...nel.org>
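
For anyone reading along without the diff in front of them, the change
itself is small: when the quirk is set, create the small PRP pool with
512 byte alignment so a 256 byte segment can never occupy the tail of a
pool page. A rough sketch of the shape of the change, assuming a quirk
flag along the lines of NVME_QUIRK_DMAPOOL_ALIGN_512 (see the commit
above for the exact diff):

/* Sketch of nvme_setup_prp_pools() in drivers/nvme/host/pci.c */
static int nvme_setup_prp_pools(struct nvme_dev *dev)
{
	size_t small_align = 256;

	dev->prp_page_pool = dma_pool_create("prp list page", dev->dev,
					     NVME_CTRL_PAGE_SIZE,
					     NVME_CTRL_PAGE_SIZE, 0);
	if (!dev->prp_page_pool)
		return -ENOMEM;

	/*
	 * With 512 byte alignment a 256 byte segment can never end flush
	 * with the pool page, so the last PRP entry never sits in the
	 * final 8 bytes that the device misreads as a chain pointer.
	 */
	if (dev->ctrl.quirks & NVME_QUIRK_DMAPOOL_ALIGN_512)
		small_align = 512;

	/* Optimisation for I/Os between 4k and 128k */
	dev->prp_small_pool = dma_pool_create("prp list 256", dev->dev,
					      256, small_align, 0);
	if (!dev->prp_small_pool) {
		dma_pool_destroy(dev->prp_page_pool);
		return -ENOMEM;
	}
	return 0;
}

Since dmapool rounds the block stride up to the alignment, each 256 byte
allocation now consumes a 512 byte slot of the pool page, which is where
the wasted space mentioned in the commit message comes from.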