Message-ID: <cdc333e4-25bb-4171-9f6e-01f1de947db3@samsung.com>
Date: Mon, 17 Mar 2025 08:37:04 +0100
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: Robin Murphy <robin.murphy@....com>, Lorenzo Pieralisi
<lpieralisi@...nel.org>, Hanjun Guo <guohanjun@...wei.com>, Sudeep Holla
<sudeep.holla@....com>, "Rafael J. Wysocki" <rafael@...nel.org>, Len Brown
<lenb@...nel.org>, Russell King <linux@...linux.org.uk>, Greg Kroah-Hartman
<gregkh@...uxfoundation.org>, Danilo Krummrich <dakr@...nel.org>, Stuart
Yoder <stuyoder@...il.com>, Laurentiu Tudor <laurentiu.tudor@....com>, Nipun
Gupta <nipun.gupta@....com>, Nikhil Agarwal <nikhil.agarwal@....com>, Joerg
Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>, Rob Herring
<robh@...nel.org>, Saravana Kannan <saravanak@...gle.com>, Bjorn Helgaas
<bhelgaas@...gle.com>
Cc: linux-acpi@...r.kernel.org, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
devicetree@...r.kernel.org, linux-pci@...r.kernel.org, Charan Teja Kalla
<quic_charante@...cinc.com>
Subject: Re: [PATCH v2 4/4] iommu: Get DT/ACPI parsing into the proper probe
path
On 13.03.2025 15:12, Robin Murphy wrote:
> On 2025-03-13 1:06 pm, Robin Murphy wrote:
>> On 2025-03-13 12:23 pm, Marek Szyprowski wrote:
>>> On 13.03.2025 12:01, Robin Murphy wrote:
>>>> On 2025-03-13 9:56 am, Marek Szyprowski wrote:
>>>> [...]
>>>>> This patch landed in yesterday's linux-next as commit bcb81ac6ae3c
>>>>> ("iommu: Get DT/ACPI parsing into the proper probe path"). In my tests I
>>>>> found it breaks booting of ARM64 RK3568-based Odroid-M1 board
>>>>> (arch/arm64/boot/dts/rockchip/rk3568-odroid-m1.dts). Here is the
>>>>> relevant kernel log:
>>>>
>>>> ...and the bug-flushing-out begins!
>>>>
>>>>> Unable to handle kernel NULL pointer dereference at virtual address 00000000000003e8
>>>>> Mem abort info:
>>>>> ESR = 0x0000000096000004
>>>>> EC = 0x25: DABT (current EL), IL = 32 bits
>>>>> SET = 0, FnV = 0
>>>>> EA = 0, S1PTW = 0
>>>>> FSC = 0x04: level 0 translation fault
>>>>> Data abort info:
>>>>> ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
>>>>> CM = 0, WnR = 0, TnD = 0, TagAccess = 0
>>>>> GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
>>>>> [00000000000003e8] user address but active_mm is swapper
>>>>> Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
>>>>> Modules linked in:
>>>>> CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc3+ #15533
>>>>> Hardware name: Hardkernel ODROID-M1 (DT)
>>>>> pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>> pc : devm_kmalloc+0x2c/0x114
>>>>> lr : rk_iommu_of_xlate+0x30/0x90
>>>>> ...
>>>>> Call trace:
>>>>> devm_kmalloc+0x2c/0x114 (P)
>>>>> rk_iommu_of_xlate+0x30/0x90
>>>>
>>>> Yeah, looks like this is doing something a bit questionable which can't
>>>> work properly. TBH the whole dma_dev thing could probably be cleaned up
>>>> now that we have proper instances, but for now does this work?
>>>
>>> Yes, this patch fixes the problem I've observed.
>>>
>>> Reported-by: Marek Szyprowski <m.szyprowski@...sung.com>
>>> Tested-by: Marek Szyprowski <m.szyprowski@...sung.com>
>>>
>>> BTW, this dma_dev idea has been borrowed from my exynos_iommu driver and
>>> I doubt it can be cleaned up.
>>
>> On the contrary I suspect they both can - it all dates back to when
>> we had the single global platform bus iommu_ops and the SoC drivers
>> were forced to bodge their own notion of multiple instances, but with
>> the modern core code, ops are always called via a valid IOMMU
>> instance or domain, so in principle it should always be possible to
>> get at an appropriate IOMMU device now. IIRC it was mostly about
>> allocating and DMA-mapping the pagetables in domain_alloc, where the
>> private notion of instances didn't have enough information, but
>> domain_alloc_paging solves that.
>
> Bah, in fact I think I am going to have to do that now, since although
> it doesn't crash, rk_domain_alloc_paging() will also be failing for
> the same reason. Time to find a PSU for the RK3399 board, I guess...
>
> (Or maybe just move the dma_dev assignment earlier to match Exynos?)

Well, I just found that the Exynos IOMMU is also broken on some of my test
boards. It looks like the runtime PM links are somehow not correctly
established. I will try to analyze this later in the afternoon.

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland