Message-ID: <ZaZ2PIpEId-rl6jv@wantstofly.org>
Date: Tue, 16 Jan 2024 14:27:40 +0200
From: Lennert Buytenhek <kernel@...tstofly.org>
To: Damien Le Moal <dlemoal@...nel.org>, Niklas Cassel <cassel@...nel.org>,
linux-ide@...r.kernel.org
Cc: linux-kernel@...r.kernel.org
Subject: ASMedia ASM1062 (AHCI) hang after "ahci 0000:28:00.0: Using 64-bit
DMA addresses"
Hi,
On kernel 6.6.x, with an ASMedia ASM1062 (AHCI) controller (PCI ID
1b21:0612, subsystem ID 1043:858d) on an ASUSTeK Pro WS WRX80E-SAGE SE
WIFI mainboard, I hit what appears to be a total controller hang that
made the two attached SATA devices unavailable. The hang was
immediately preceded by the following kernel messages:
[Thu Jan 4 23:12:54 2024] ahci 0000:28:00.0: Using 64-bit DMA addresses
[Thu Jan 4 23:12:54 2024] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00000 flags=0x0000]
[Thu Jan 4 23:12:54 2024] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00300 flags=0x0000]
[Thu Jan 4 23:12:54 2024] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00380 flags=0x0000]
[Thu Jan 4 23:12:54 2024] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00400 flags=0x0000]
[Thu Jan 4 23:12:54 2024] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00680 flags=0x0000]
[Thu Jan 4 23:12:54 2024] ahci 0000:28:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0035 address=0x7fffff00700 flags=0x0000]
It seems as if the controller has problems with 64-bit DMA addresses,
and the comment above the source of the message in
drivers/iommu/dma-iommu.c seems to point in the same direction:
	/*
	 * Try to use all the 32-bit PCI addresses first. The original SAC vs.
	 * DAC reasoning loses relevance with PCIe, but enough hardware and
	 * firmware bugs are still lurking out there that it's safest not to
	 * venture into the 64-bit space until necessary.
	 *
	 * If your device goes wrong after seeing the notice then likely either
	 * its driver is not setting DMA masks accurately, the hardware has
	 * some inherent bug in handling >32-bit addresses, or not all the
	 * expected address bits are wired up between the device and the IOMMU.
	 */
	if (dma_limit > DMA_BIT_MASK(32) && dev->iommu->pci_32bit_workaround) {
		iova = alloc_iova_fast(iovad, iova_len,
				       DMA_BIT_MASK(32) >> shift, false);
		if (iova)
			goto done;
		dev->iommu->pci_32bit_workaround = false;
		dev_notice(dev, "Using %d-bit DMA addresses\n", bits_per(dma_limit));
	}
Are there any tests you can think of that I could run to narrow this
issue down further? Left to itself, the issue reproduces only rarely.
Thank you in advance.
Kind regards,
Lennert