Message-ID: <1e691960-bc52-4e5c-89d7-ecd3f7dc9f56@arm.com>
Date: Tue, 30 Apr 2024 13:33:20 +0100
From: Robin Murphy <robin.murphy@....com>
To: Konrad Dybcio <konrad.dybcio@...aro.org>,
Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
"open list:DRM DRIVER FOR MSM ADRENO GPU" <linux-arm-msm@...r.kernel.org>,
Bjorn Andersson <andersson@...nel.org>
Cc: Joerg Roedel <joro@...tes.org>, Christoph Hellwig <hch@....de>,
Vineet Gupta <vgupta@...nel.org>, Russell King <linux@...linux.org.uk>,
Catalin Marinas <catalin.marinas@....com>, Will Deacon <will@...nel.org>,
Huacai Chen <chenhuacai@...nel.org>, WANG Xuerui <kernel@...0n.name>,
Thomas Bogendoerfer <tsbogend@...ha.franken.de>,
Paul Walmsley <paul.walmsley@...ive.com>, Palmer Dabbelt
<palmer@...belt.com>, Albert Ou <aou@...s.berkeley.edu>,
Lorenzo Pieralisi <lpieralisi@...nel.org>, Hanjun Guo
<guohanjun@...wei.com>, Sudeep Holla <sudeep.holla@....com>,
"K. Y. Srinivasan" <kys@...rosoft.com>,
Haiyang Zhang <haiyangz@...rosoft.com>, Wei Liu <wei.liu@...nel.org>,
Dexuan Cui <decui@...rosoft.com>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
David Woodhouse <dwmw2@...radead.org>, Lu Baolu <baolu.lu@...ux.intel.com>,
Niklas Schnelle <schnelle@...ux.ibm.com>,
Matthew Rosato <mjrosato@...ux.ibm.com>,
Gerald Schaefer <gerald.schaefer@...ux.ibm.com>,
Jean-Philippe Brucker <jean-philippe@...aro.org>,
Rob Herring <robh+dt@...nel.org>, Frank Rowand <frowand.list@...il.com>,
Marek Szyprowski <m.szyprowski@...sung.com>, Jason Gunthorpe <jgg@...pe.ca>,
linux-kernel@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-acpi@...r.kernel.org, iommu@...ts.linux.dev,
devicetree@...r.kernel.org, Jason Gunthorpe <jgg@...dia.com>
Subject: Re: [PATCH v4 6/7] iommu/dma: Centralise iommu_setup_dma_ops()
On 30/04/2024 1:23 pm, Konrad Dybcio wrote:
> On 29.04.2024 11:26 PM, Dmitry Baryshkov wrote:
>> On Mon, 29 Apr 2024 at 19:31, Dmitry Baryshkov
>> <dmitry.baryshkov@...aro.org> wrote:
>>>
>>> On Fri, Apr 19, 2024 at 05:54:45PM +0100, Robin Murphy wrote:
>>>> It's somewhat hard to see, but arm64's arch_setup_dma_ops() should only
>>>> ever call iommu_setup_dma_ops() after a successful iommu_probe_device(),
>>>> which means there should be no harm in achieving the same order of
>>>> operations by running it off the back of iommu_probe_device() itself.
>>>> This then puts it in line with the x86 and s390 .probe_finalize bodges,
>>>> letting us pull it all into the main flow properly. As a bonus this lets
>>>> us fold in and de-scope the PCI workaround setup as well.
>>>>
>>>> At this point we can also then pull the call up inside the group mutex,
>>>> and avoid having to think about whether iommu_group_store_type() could
>>>> theoretically race and free the domain if iommu_setup_dma_ops() ran just
>>>> *before* iommu_device_use_default_domain() claims it... Furthermore we
>>>> replace one .probe_finalize call completely, since the only remaining
>>>> implementations are now one which only needs to run once for the initial
>>>> boot-time probe, and two which themselves render that path unreachable.
>>>>
>>>> This leaves us a big step closer to realistically being able to unpick
>>>> the variety of different things that iommu_setup_dma_ops() has been
>>>> muddling together, and further streamline iommu-dma into core API flows
>>>> in future.
>>>>
>>>> Reviewed-by: Lu Baolu <baolu.lu@...ux.intel.com> # For Intel IOMMU
>>>> Reviewed-by: Jason Gunthorpe <jgg@...dia.com>
>>>> Tested-by: Hanjun Guo <guohanjun@...wei.com>
>>>> Signed-off-by: Robin Murphy <robin.murphy@....com>
>>>> ---
>>>> v2: Shuffle around to make sure the iommu_group_do_probe_finalize() case
>>>> is covered as well, with bonus side-effects as above.
>>>> v3: *Really* do that, remembering the other two probe_finalize sites too.
>>>> ---
>>>> arch/arm64/mm/dma-mapping.c | 2 --
>>>> drivers/iommu/amd/iommu.c | 8 --------
>>>> drivers/iommu/dma-iommu.c | 18 ++++++------------
>>>> drivers/iommu/dma-iommu.h | 14 ++++++--------
>>>> drivers/iommu/intel/iommu.c | 7 -------
>>>> drivers/iommu/iommu.c | 20 +++++++-------------
>>>> drivers/iommu/s390-iommu.c | 6 ------
>>>> drivers/iommu/virtio-iommu.c | 10 ----------
>>>> include/linux/iommu.h | 7 -------
>>>> 9 files changed, 19 insertions(+), 73 deletions(-)
>>>
>>> This patch breaks UFS on Qualcomm SC8180X Primus platform:
>>>
>>>
>>> [ 3.846856] arm-smmu 15000000.iommu: Unhandled context fault: fsr=0x402, iova=0x1032db3e0, fsynr=0x130000, cbfrsynra=0x300, cb=4
>>> [ 3.846880] ufshcd-qcom 1d84000.ufshc: ufshcd_check_errors: saved_err 0x20000 saved_uic_err 0x0
>>> [ 3.846929] host_regs: 00000000: 1587031f 00000000 00000300 00000000
>>> [ 3.846935] host_regs: 00000010: 01000000 00010217 00000000 00000000
>>> [ 3.846941] host_regs: 00000020: 00000000 00070ef5 00000000 00000000
>>> [ 3.846946] host_regs: 00000030: 0000000f 00000001 00000000 00000000
>>> [ 3.846951] host_regs: 00000040: 00000000 00000000 00000000 00000000
>>> [ 3.846956] host_regs: 00000050: 032db000 00000001 00000000 00000000
>>> [ 3.846962] host_regs: 00000060: 00000000 80000000 00000000 00000000
>>> [ 3.846967] host_regs: 00000070: 032dd000 00000001 00000000 00000000
>>> [ 3.846972] host_regs: 00000080: 00000000 00000000 00000000 00000000
>>> [ 3.846977] host_regs: 00000090: 00000016 00000000 00000000 0000000c
>>> [ 3.847074] ufshcd-qcom 1d84000.ufshc: ufshcd_err_handler started; HBA state eh_fatal; powered 1; shutting down 0; saved_err = 131072; saved_uic_err = 0; force_reset = 0
>>> [ 4.406550] ufshcd-qcom 1d84000.ufshc: ufshcd_verify_dev_init: NOP OUT failed -11
>>> [ 4.417953] ufshcd-qcom 1d84000.ufshc: ufshcd_async_scan failed: -11
>>
>> Just to confirm: reverting f091e93306e0 ("dma-mapping: Simplify
>> arch_setup_dma_ops()") and b67483b3c44e ("iommu/dma: Centralise
>> iommu_setup_dma_ops()") fixes the issue for me. Please ping me if you'd
>> like me to test a fix.
>
> This also triggers a different issue (that also comes down to "ufs bad") on
> another QC platform (SM8550):
>
> [ 4.282098] scsi host0: ufshcd
> [ 4.315970] ufshcd-qcom 1d84000.ufs: ufshcd_check_errors: saved_err 0x20000 saved_uic_err 0x0
> [ 4.330155] host_regs: 00000000: 3587031f 00000000 00000400 00000000
> [ 4.343955] host_regs: 00000010: 01000000 00010217 00000000 00000000
> [ 4.356027] host_regs: 00000020: 00000000 00070ef5 00000000 00000000
> [ 4.370136] host_regs: 00000030: 0000000f 00000003 00000000 00000000
> [ 4.376662] host_regs: 00000040: 00000000 00000000 00000000 00000000
> [ 4.383192] host_regs: 00000050: 85109000 00000008 00000000 00000000
> [ 4.389719] host_regs: 00000060: 00000000 80000000 00000000 00000000
> [ 4.396245] host_regs: 00000070: 8510a000 00000008 00000000 00000000
> [ 4.402773] host_regs: 00000080: 00000000 00000000 00000000 00000000
> [ 4.409298] host_regs: 00000090: 00000016 00000000 00000000 0000000c
> [ 4.415900] arm-smmu 15000000.iommu: Unhandled context fault: fsr=0x402, iova=0x8851093e0, fsynr=0x3b0001, cbfrsynra=0x60, cb=2
> [ 4.416135] ufshcd-qcom 1d84000.ufs: ufshcd_err_handler started; HBA state eh_fatal; powered 1; shutting down 0; saved_err = 131072; saved_uic_err = 0; force_reset = 0
> [ 4.951750] ufshcd-qcom 1d84000.ufs: ufshcd_verify_dev_init: NOP OUT failed -11
> [ 4.960644] ufshcd-qcom 1d84000.ufs: ufshcd_async_scan failed: -11
>
> Reverting the commits Dmitry mentioned also fixes this.

Yeah, it'll be the same thing - doesn't really matter exactly *how* the
UFS goes wrong due to the SMMU blocking it, the issue is that the SMMU
is erroneously blocking it in the first place due to a DMA ops mixup.

Fix is now here:

https://lore.kernel.org/linux-iommu/d4cc20cbb0c45175e98dd76bf187e2ad6421296d.1714472573.git.robin.murphy@arm.com/

Thanks,
Robin.
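
Editorial sketch, not part of the original message: the two logs quoted above can be cross-checked with a few lines of Python. It assumes (the thread does not say this) that the words at host_regs offsets 0x50 and 0x54 are the low and high halves of a 64-bit base address that the UFS controller was programmed with. The helper names are hypothetical. Under that assumption, the faulting iova reported by arm-smmu lands 0x3e0 bytes past that base on both platforms, which is at least consistent with the controller using an address the SMMU had no mapping for.

```python
def combine_words(low: int, high: int) -> int:
    """Join two 32-bit register words into one 64-bit value."""
    return (high << 32) | low


# Values copied from the quoted logs:
# (platform, word at host_regs 0x50, word at host_regs 0x54, faulting iova)
SAMPLES = [
    ("SC8180X", 0x032DB000, 0x00000001, 0x1032DB3E0),
    ("SM8550", 0x85109000, 0x00000008, 0x8851093E0),
]

PAGE_SIZE = 0x1000  # assumption: 4 KiB pages


def fault_offsets() -> dict:
    """Return {platform: (base, offset, within_page)} for each sample."""
    results = {}
    for name, low, high, iova in SAMPLES:
        base = combine_words(low, high)  # assumed 64-bit base address
        offset = iova - base
        results[name] = (base, offset, 0 <= offset < PAGE_SIZE)
    return results


if __name__ == "__main__":
    for name, (base, offset, within) in fault_offsets().items():
        print(f"{name}: base={base:#x} offset={offset:#x} within_page={within}")
```

This only shows that the numbers in the two reports line up in the same way; it says nothing about which DMA ops were in use, which is what the linked fix addresses.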