Message-ID: <4da29d46-9a80-4ec4-b6b8-6c9457eed481@arm.com>
Date: Fri, 16 Jan 2026 17:24:46 +0000
From: Robin Murphy <robin.murphy@....com>
To: Nicolas Cavallari <Nicolas.Cavallari@...en-communications.fr>,
iommu@...ts.linux.dev, linux-pci@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, "Rob Herring (Arm)"
<robh@...nel.org>, Lorenzo Pieralisi <lpieralisi@...nel.org>,
Jason Gunthorpe <jgg@...dia.com>, Joerg Roedel <jroedel@...e.de>,
regressions@...ts.linux.dev
Subject: Re: [REGRESSION] Re: imx8 PCI regression since "iommu: Get DT/ACPI
parsing into the proper probe path"
On 2026-01-16 4:52 pm, Nicolas Cavallari wrote:
> +cc regressions ML
>
> On 13/01/2026 at 10:17, Nicolas Cavallari wrote:
>> +cc patch author & reviewers
>>
>> On 1/9/26 17:22, Nicolas Cavallari wrote:
>>> When upgrading from 6.12 to a 6.18 kernel, I noticed that a PCI
>>> Ethernet adapter (Microchip LAN7430) would hang under load and not
>>> recover. When that happens, some of its registers indicate it is
>>> failing to do DMA reads, so cannot reclaim entries on its ring buffer.
>>>
>>> I bisected the problem into this commit:
>>>
>>> commit bcb81ac6ae3c2ef95b44e7b54c3c9522364a245c
>>> Author: Robin Murphy <robin.murphy@....com>
>>> Date: Fri Feb 28 15:46:33 2025 +0000
>>>
>>> iommu: Get DT/ACPI parsing into the proper probe path
>>>
>>> The problem still exists on 6.19-rc1, on pci/next (29a77b4897f1) and on
>>> iommu/master (360e85353769) trees. Reverting the commit fixes the
>>> issue.
>
> The problem persists on 6.19-rc5
>
>>> The system is a Gateworks GW7200, which is an i.MX 8 Mini connected to a
>>> Pericom PI7C9X2G404 4-port switch connected to the LAN7430 chip.
>>>
>>> -[0000:00]---00.0-[01-ff]----00.0-[02-05]--+-01.0-[03]----00.0
>>>                                            +-02.0-[04]--
>>>                                            \-03.0-[05]----00.0
>>>
>>> The problem only occurs when there is at least one other PCI device in use
>>> on the switch. It does not happen if the LAN7430 is the only PCI device, or
>>> if the other devices are not actively used. For example, I can reproduce it
>>> with an ath9k wireless network adapter when it is up and running, but not
>>> when it is down or its driver is not loaded.
>>>
>>> I suspect that other PCI devices have similar issues, but the LAN7430 is
>>> the easiest one to diagnose, as it hangs within seconds with an
>>> iperf3 --bidir -u -b 200M and its register map is public.
>>>
>>> I couldn't find a way to dump the PCI address translation mapping from
>>> userspace.
>>> I would be happy to provide more information or test patches.
>
> I debugged it further; it seems to be mostly a PCI issue, since the
> system does not actually have an IOMMU.

Indeed, I was figuring this had to be another case of a switch with
wonky ACS - do Mani's patches adjusting ACS enablement make any difference?

https://lore.kernel.org/all/20260102-pci_acs-v3-1-72280b94d288@oss.qualcomm.com/

Although in this case I guess the issue is arguably more that we're
requesting ACS at all, before we know that there's actually an IOMMU
present to warrant it. Clearly the best option would be to figure out if
the switch behaviour itself can be fixed somehow, but perhaps something
like this might help paper over the issue for now (but I'd have to test
it to make sure it doesn't break IOMMUs again...)

----->8-----
diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index 6b989a62def2..837cc0b5ace4 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -141,10 +141,12 @@ int of_iommu_configure(struct device *dev, struct device_node *master_np,
 			.np = master_np,
 		};
 
-		pci_request_acs();
 		err = pci_for_each_dma_alias(to_pci_dev(dev),
 					     of_pci_iommu_init, &info);
-		of_pci_check_device_ats(dev, master_np);
+		if (!err) {
+			pci_request_acs();
+			of_pci_check_device_ats(dev, master_np);
+		}
 	} else {
 		err = of_iommu_configure_device(master_np, dev, id);
 	}
-----8<-----
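
To make the intent of the reordering easier to see outside the kernel tree, here is a small userspace model of the two orderings. It is only a sketch: every model_* name is hypothetical and stands in for the real call, and it assumes the alias walk returns a non-zero error (modelled here as -ENODEV) when firmware describes no IOMMU for the device, which is the condition the diff keys off.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's "ACS was requested" state. */
static bool acs_requested;

/* Hypothetical device model; not a kernel structure. */
struct model_dev {
	bool has_iommu_map;	/* does firmware describe an IOMMU for it? */
};

/* Models pci_for_each_dma_alias(..., of_pci_iommu_init, ...). */
static int model_alias_walk(const struct model_dev *dev)
{
	/* Assumption: no IOMMU description yields a non-zero error. */
	return dev->has_iommu_map ? 0 : -ENODEV;
}

/* Models pci_request_acs(): only records that ACS was asked for. */
static void model_request_acs(void)
{
	acs_requested = true;
}

/* Ordering before the change: ACS is requested unconditionally. */
static int model_configure_old(const struct model_dev *dev)
{
	model_request_acs();
	return model_alias_walk(dev);
}

/* Ordering after the change: ACS is requested only if the walk succeeded. */
static int model_configure_new(const struct model_dev *dev)
{
	int err = model_alias_walk(dev);

	if (!err)
		model_request_acs();
	return err;
}
```

With a device that has no IOMMU description, model_configure_old() leaves acs_requested set while model_configure_new() does not; with an IOMMU described, both end up requesting ACS.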
Thanks,
Robin.
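
For anyone wanting to check what ACS state the switch ports actually ended up in, a rough sketch of decoding it from a raw config-space dump follows. It assumes the dump is the full 4 KiB extended config space in little-endian byte order (for example, the sysfs "config" file of the downstream port read as root); the helper names are made up for illustration. The constants follow the PCIe extended capability layout: capability ID in bits 15:0, next pointer in bits 31:20, ACS capability ID 0x000d, and the ACS Control register at offset 0x06 within the capability.

```c
#include <stddef.h>
#include <stdint.h>

#define CFG_EXT_SIZE		4096	/* full PCIe config space */
#define EXT_CAP_START		0x100	/* first extended capability */
#define EXT_CAP_ID_ACS		0x000d	/* Access Control Services */
#define ACS_CTRL_OFFSET		0x06	/* ACS Control register */

/* Little-endian 16-bit read from the dump (hypothetical helper). */
static uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
	return (uint16_t)(cfg[off] | (cfg[off + 1] << 8));
}

/* Little-endian 32-bit read from the dump (hypothetical helper). */
static uint32_t cfg_read32(const uint8_t *cfg, size_t off)
{
	return (uint32_t)cfg[off] |
	       ((uint32_t)cfg[off + 1] << 8) |
	       ((uint32_t)cfg[off + 2] << 16) |
	       ((uint32_t)cfg[off + 3] << 24);
}

/* Return the offset of the ACS extended capability, or 0 if absent. */
static size_t find_acs_cap(const uint8_t *cfg, size_t len)
{
	size_t off = EXT_CAP_START;
	/* Bound the walk so a looping list cannot hang us. */
	int budget = (CFG_EXT_SIZE - EXT_CAP_START) / 8;

	while (off >= EXT_CAP_START && off + 8 <= len && budget-- > 0) {
		uint32_t hdr = cfg_read32(cfg, off);

		if (hdr == 0 || hdr == 0xffffffffu)
			break;			/* empty list or read failure */
		if ((hdr & 0xffff) == EXT_CAP_ID_ACS)
			return off;
		off = (hdr >> 20) & 0xffc;	/* next capability pointer */
	}
	return 0;
}

/* Return the ACS Control register value, or -1 if there is no ACS cap. */
static int read_acs_ctrl(const uint8_t *cfg, size_t len)
{
	size_t cap = find_acs_cap(cfg, len);

	return cap ? cfg_read16(cfg, cap + ACS_CTRL_OFFSET) : -1;
}
```

In the returned value, bits 0 to 4 are Source Validation, Translation Blocking, P2P Request Redirect, P2P Completion Redirect and Upstream Forwarding, so a non-zero result on the switch downstream ports of a system with no IOMMU would be consistent with the theory above. A dump shorter than 4 KiB (such as the 64 or 256 bytes an unprivileged read may return) simply yields -1.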