Message-ID: <d3f008bf-a9cb-43a8-807b-2ba1d6d9ff3c@arm.com>
Date: Fri, 16 Jan 2026 17:35:34 +0000
From: Robin Murphy <robin.murphy@....com>
To: Jason Gunthorpe <jgg@...dia.com>,
Nicolas Cavallari <Nicolas.Cavallari@...en-communications.fr>
Cc: iommu@...ts.linux.dev, linux-pci@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
Bjorn Helgaas <bhelgaas@...gle.com>, "Rob Herring (Arm)" <robh@...nel.org>,
Lorenzo Pieralisi <lpieralisi@...nel.org>, Joerg Roedel <jroedel@...e.de>,
regressions@...ts.linux.dev
Subject: Re: [REGRESSION] Re: imx8 PCI regression since "iommu: Get DT/ACPI
parsing into the proper probe path"

On 2026-01-16 5:10 pm, Jason Gunthorpe wrote:
> On Fri, Jan 16, 2026 at 05:52:36PM +0100, Nicolas Cavallari wrote:
>> I debugged it further, it seems to be mostly a PCI issue since the system
>> does not actually have an IOMMU.
>>
>> When examining changes in the PCI configuration (lspci -vvvv), the main
>> difference is that, with the patch, Access Control Services are enabled on
>> the PCI switch.
>>
>> Capabilities: [220 v1] Access Control Services
>> ACSCap: SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+
>> UpstreamFwd+ EgressCtrl+ DirectTrans+
>> - ACSCtl: SrcValid- TransBlk- ReqRedir- CmpltRedir-
>> UpstreamFwd- EgressCtrl- DirectTrans-
>> + ACSCtl: SrcValid+ TransBlk- ReqRedir+ CmpltRedir+
>> UpstreamFwd+ EgressCtrl- DirectTrans-
>>
>> If I manually patch the config space in sysfs and re-disable ACS on the port
>> connected to the LAN7430, I cannot reproduce the problem. In fact,
>> disabling only ReqRedir is enough to work around the issue.
>
> My guess would be your system has some kind of address alias going on?
>
> Assuming you are not facing an errata, ACS generally changes the
> routing of TLPs so if you have a DMA address that could go to two
> different places then messing with ACS will give you different
> behaviors.
>
> Specifically, when you turn on all those ACS settings you cannot do P2P
> traffic anymore. If your system expects this for some reason then you
> must use the kernel command line option to disable ACS.
>
> If you are just doing normal netdev stuff then it is doubtful that you
> are doing P2P at all, so I might guess a bug in the microchip ethernet
> driver doing a wild DMA? Stricter ACS settings cause it to AER and the
> device cannot recover?
>
> It will be hard to get to the bottom of the defect without a PCI trace.
>
> I don't know why your bisection landed on bcb8 - the intention was
> that pci_enable_acs() is always called, and I didn't notice an obvious
> reason why that wouldn't happen prior to bcb8. It is called directly
> from pci_device_add(). Maybe investigating that angle would be
> informative.

The difference is that bcb8 moves the pci_request_acs() call on OF
systems back early enough to actually have an effect - that's spent the
last 6 years being pretty much a no-op since 6bf6c24720d3 ("iommu/of:
Request ACS from the PCI core when configuring IOMMU linkage")...

Thanks,
Robin.
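
The ordering point above can be illustrated with a deliberately simplified toy model. This is a sketch, not kernel code: the ToyPciCore class and its fields are hypothetical, and only the method names mirror the kernel functions mentioned in the thread. It assumes that the ACS request is a global flag which is consulted once, when each device is added.

```python
class ToyPciCore:
    """Hypothetical stand-in for the PCI core; not the real kernel logic."""

    def __init__(self):
        self.acs_requested = False  # assumed global "ACS wanted" flag
        self.acs_enabled = {}       # device name -> was ACS enabled at add time

    def pci_request_acs(self):
        # Only records the request; already-added devices are not revisited.
        self.acs_requested = True

    def pci_device_add(self, name):
        # The ACS decision is taken once, at the time the device is added.
        self.acs_enabled[name] = self.acs_requested


# Request arrives after enumeration: effectively a no-op for that device.
late = ToyPciCore()
late.pci_device_add("switch-port")
late.pci_request_acs()

# Request arrives before enumeration: ACS gets enabled on the device.
early = ToyPciCore()
early.pci_request_acs()
early.pci_device_add("switch-port")

print(late.acs_enabled, early.acs_enabled)
```

Under these assumptions, moving the request earlier is enough to change the ACSCtl bits seen on the switch without any change to the enabling code itself.
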
>> I also read up on AER and I'm surprised that I don't see anything in dmesg
>> when the problem occurs, even though UERcvd+ starts appearing on the root
>> context and AdvNonFatalErr+ appears on the switch.
>
> Though UE and AdvNonFatalErr sure are weird indications for an
> addressing error.. Is there some kind of special embedded system thing
> going on? Vendor messages over PCI perhaps?
>
> Jason
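
For readers following the lspci diff quoted in this thread, here is a minimal sketch that converts the ACSCtl flags into the 16-bit ACS Control register value and back. The bit positions follow the ACS Control register layout in the PCIe specification (the register sits at offset 0x06 of the ACS extended capability); the helper names parse_acsctl and format_acsctl are made up for this illustration.

```python
# ACS Control register bits, keyed by the names lspci prints.
ACS_CTRL_BITS = {
    "SrcValid": 0x01,     # Source Validation
    "TransBlk": 0x02,     # Translation Blocking
    "ReqRedir": 0x04,     # P2P Request Redirect
    "CmpltRedir": 0x08,   # P2P Completion Redirect
    "UpstreamFwd": 0x10,  # Upstream Forwarding
    "EgressCtrl": 0x20,   # P2P Egress Control
    "DirectTrans": 0x40,  # Direct Translated P2P
}


def parse_acsctl(text):
    """Turn 'SrcValid+ TransBlk- ...' into a register value."""
    value = 0
    for token in text.split():
        name, sign = token[:-1], token[-1]
        if sign not in "+-":
            raise ValueError(f"unexpected token: {token!r}")
        if sign == "+":
            value |= ACS_CTRL_BITS[name]
    return value


def format_acsctl(value):
    """Turn a register value back into lspci-style flags."""
    return " ".join(
        f"{name}{'+' if value & bit else '-'}"
        for name, bit in ACS_CTRL_BITS.items()
    )


# The two ACSCtl states from the lspci diff in this thread.
before = parse_acsctl(
    "SrcValid- TransBlk- ReqRedir- CmpltRedir- "
    "UpstreamFwd- EgressCtrl- DirectTrans-"
)
after = parse_acsctl(
    "SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ "
    "UpstreamFwd+ EgressCtrl- DirectTrans-"
)

# The reported workaround: clear only ReqRedir, keep the other bits.
workaround = after & ~ACS_CTRL_BITS["ReqRedir"]

print(f"before=0x{before:04x} after=0x{after:04x} workaround=0x{workaround:04x}")
print(format_acsctl(workaround))
```

The workaround value is what would end up in the control register of the affected downstream port once only ReqRedir is cleared; how it is written (sysfs config file, setpci, or a quirk) is left open here.
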