Message-ID: <9e5d9952-0295-40b2-5f4b-a1412cc933ce@samsung.com>
Date: Fri, 24 Mar 2023 18:07:26 +0100
From: Marek Szyprowski <m.szyprowski@...sung.com>
To: Krzysztof Kozlowski <krzysztof.kozlowski@...aro.org>,
Rob Herring <robh+dt@...nel.org>,
Krzysztof Kozlowski <krzysztof.kozlowski+dt@...aro.org>,
Alim Akhtar <alim.akhtar@...sung.com>,
Kukjin Kim <kgene@...nel.org>, devicetree@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org,
linux-samsung-soc@...r.kernel.org, linux-kernel@...r.kernel.org,
Chanwoo Choi <cw00.choi@...sung.com>
Cc: replicant@...osl.org, phone-devel@...r.kernel.org,
~postmarketos/upstreaming@...ts.sr.ht,
Martin Jücker <martin.juecker@...il.com>,
Henrik Grimler <henrik@...mler.se>
Subject: Re: [PATCH 5/9] ARM: dts: exynos: move exynos-bus nodes out of soc
in Exynos4412
On 06.02.2023 17:12, Krzysztof Kozlowski wrote:
> On 03/02/2023 23:50, Marek Szyprowski wrote:
>> On 03.02.2023 22:12, Krzysztof Kozlowski wrote:
>>> On 03/02/2023 21:34, Krzysztof Kozlowski wrote:
>>>> On 03/02/2023 12:51, Marek Szyprowski wrote:
>>>>> On 03.02.2023 12:46, Krzysztof Kozlowski wrote:
>>>>>> On 03/02/2023 12:45, Marek Szyprowski wrote:
>>>>>>> On 29.01.2023 11:42, Krzysztof Kozlowski wrote:
>>>>>>>> On 25/01/2023 10:45, Krzysztof Kozlowski wrote:
>>>>>>>>> The soc node is supposed to have only device nodes with MMIO addresses,
>>>>>>>>> as reported by dtc W=1:
>>>>>>>>>
>>>>>>>>> exynos4412.dtsi:407.20-413.5:
>>>>>>>>> Warning (simple_bus_reg): /soc/bus-acp: missing or empty reg/ranges property
>>>>>>>>>
>>>>>>>>> and dtbs_check:
>>>>>>>>>
>>>>>>>>> exynos4412-i9300.dtb: soc: bus-acp:
>>>>>>>>> {'compatible': ['samsung,exynos-bus'], 'clocks': [[7, 456]], 'clock-names': ['bus'], 'operating-points-v2': [[132]], 'status': ['okay'], 'devfreq': [[117]]} should not be valid under {'type': 'object'}
>>>>>>>>>
>>>>>>>>> Move the bus nodes and their OPP tables out of SoC to fix this.
>>>>>>>>> Re-order them alphabetically while moving and put some of the OPP tables
>>>>>>>>> in device nodes (if they are not shared).
>>>>>>>>>
>>>>>>>> Applied.
>>>>>>> I don't have good news. It looks like this change is responsible for
>>>>>>> breaking boards that were rock-stable so far, like Odroid U3. I didn't
>>>>>>> manage to analyze what exactly causes the issue, but it looks like the
>>>>>>> exynos-bus devfreq driver somehow depends on the order of the nodes:
>>>>>>>
>>>>>>> (before)
>>>>>>>
>>>>>>> # dmesg | grep exynos-bus
>>>>>>> [ 6.415266] exynos-bus: new bus device registered: soc:bus-dmc (100000 KHz ~ 400000 KHz)
>>>>>>> [ 6.422717] exynos-bus: new bus device registered: soc:bus-acp (100000 KHz ~ 267000 KHz)
>>>>>>> [ 6.454323] exynos-bus: new bus device registered: soc:bus-c2c (100000 KHz ~ 400000 KHz)
>>>>>>> [ 6.489944] exynos-bus: new bus device registered: soc:bus-leftbus (100000 KHz ~ 200000 KHz)
>>>>>>> [ 6.493990] exynos-bus: new bus device registered: soc:bus-rightbus (100000 KHz ~ 200000 KHz)
>>>>>>> [ 6.494612] exynos-bus: new bus device registered: soc:bus-display (160000 KHz ~ 200000 KHz)
>>>>>>> [ 6.494932] exynos-bus: new bus device registered: soc:bus-fsys (100000 KHz ~ 134000 KHz)
>>>>>>> [ 6.495246] exynos-bus: new bus device registered: soc:bus-peri ( 50000 KHz ~ 100000 KHz)
>>>>>>> [ 6.495577] exynos-bus: new bus device registered: soc:bus-mfc (100000 KHz ~ 200000 KHz)
>>>>>>>
>>>>>>> (after)
>>>>>>>
>>>>>>> # dmesg | grep exynos-bus
>>>>>>>
>>>>>>> [ 6.082032] exynos-bus: new bus device registered: bus-dmc (100000 KHz ~ 400000 KHz)
>>>>>>> [ 6.122726] exynos-bus: new bus device registered: bus-leftbus (100000 KHz ~ 200000 KHz)
>>>>>>> [ 6.146705] exynos-bus: new bus device registered: bus-mfc (100000 KHz ~ 200000 KHz)
>>>>>>> [ 6.181632] exynos-bus: new bus device registered: bus-peri ( 50000 KHz ~ 100000 KHz)
>>>>>>> [ 6.204770] exynos-bus: new bus device registered: bus-rightbus (100000 KHz ~ 200000 KHz)
>>>>>>> [ 6.211087] exynos-bus: new bus device registered: bus-acp (100000 KHz ~ 267000 KHz)
>>>>>>> [ 6.216936] exynos-bus: new bus device registered: bus-c2c (100000 KHz ~ 400000 KHz)
>>>>>>> [ 6.225748] exynos-bus: new bus device registered: bus-display (160000 KHz ~ 200000 KHz)
>>>>>>> [ 6.242978] exynos-bus: new bus device registered: bus-fsys (100000 KHz ~ 134000 KHz)
>>>>>>>
>>>>>>> This is definitely a driver bug, but so far it worked fine, so this is a
>>>>>>> regression that needs to be addressed somehow...
>>>>>> Thanks for checking, but what exactly is the bug? The devices registered,
>>>>>> just with different names.
>>>>> The bug is that the board fails to boot from time to time, freezing
>>>>> after registering PPMU counters...
>>>> My U3 with and without this patch, reports several warnings:
>>>> iommu_group_do_set_platform_dma()
>>>> exynos_iommu_domain_free()
>>>> clk_core_enable()
>>>>
>>>> and finally:
>>>> rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
>>>>
>>>> and keeps stalling.
>>>>
>>>> At least on next-20230203. Apart from all these (which make the board
>>>> unbootable anyway), things look fine around PMU and exynos-bus.
>>> I also booted my next/dt branch (with this patch) a few times with no
>>> problems. How reproducible is the issue you experience?
>> IOMMU needs a fixup, that has been merged today:
>>
>> https://lore.kernel.org/all/20230123093102.12392-1-m.szyprowski@samsung.com/
>>
>> I was initially convinced that this freeze is somehow related to this
>> IOMMU fixup, but it turned out that devfreq is the source of the problems.
>>
>> The freeze happens here in about 1 of 10 boots, usually with a kernel
>> compiled from multi_v7_defconfig, while loading the PPMU modules. It
>> happens on your next/dt branch too.
> I was able to reproduce it easily with multi_v7. Then I commented out
> dmc bus which fixed the issue. Then I commented out acp and c2c buses
> (children/passive) which also fixed the issue. Then I uncommented
> everything and went back to next/dt - exactly the same as it was failing
> - and since then I cannot reproduce it. I triple checked, but now my
> multi_v7 on U3 on next/dt boots perfectly fine. Every time.
This issue still happens from time to time. A quick workaround is to
add:
MODULE_SOFTDEP("pre: exynos_ppmu");
to the exynos-bus driver. Is this an acceptable solution?
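
For context, MODULE_SOFTDEP() only records a "softdep=..." string in the
module's info section. Userspace loaders such as modprobe read that string
and try to load the listed modules first. It has no effect on built-in
drivers and does not by itself guarantee probe ordering. The snippet below
is a minimal userspace sketch of that idea, not the kernel implementation;
DEMO_MODULE_SOFTDEP, demo_modinfo and demo_is_pre_softdep are hypothetical
names used only for illustration.

```c
#include <string.h>

/*
 * Userspace stand-in for the kernel's MODULE_SOFTDEP(): the real macro
 * stores a "softdep=..." string in the module's .modinfo section.
 * DEMO_MODULE_SOFTDEP and demo_modinfo are hypothetical names.
 */
#define DEMO_MODULE_SOFTDEP(deps) \
	static const char demo_modinfo[] = "softdep=" deps

DEMO_MODULE_SOFTDEP("pre: exynos_ppmu");

/*
 * Return 1 if `name` is listed in the "pre:" part of a softdep entry,
 * i.e. a loader honoring the entry would try to load it first.
 * Return 0 otherwise.
 */
int demo_is_pre_softdep(const char *entry, const char *name)
{
	const char *p = strstr(entry, "pre:");
	size_t len = strlen(name);

	if (strncmp(entry, "softdep=", 8) != 0 || !p)
		return 0;

	p += 4; /* skip "pre:" */
	while (*p) {
		size_t tok;

		while (*p == ' ')
			p++;
		tok = strcspn(p, " ");
		if (tok == 0)
			break;
		/* "post:" starts the list of modules to load afterwards */
		if (tok == 5 && strncmp(p, "post:", 5) == 0)
			break;
		if (tok == len && strncmp(p, name, len) == 0)
			return 1;
		p += tok;
	}
	return 0;
}
```

With the entry above, demo_is_pre_softdep(demo_modinfo, "exynos_ppmu")
reports that the PPMU module would be requested before exynos-bus.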
Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland