Message-Id: <DFVBKRQ35CC0.95P329BK5KZA@kernel.org>
Date: Thu, 22 Jan 2026 19:12:25 +0100
From: "Danilo Krummrich" <dakr@...nel.org>
To: "Gui-Dong Han" <hanguidong02@...il.com>
Cc: "Jon Hunter" <jonathanh@...dia.com>, "Marek Szyprowski"
<m.szyprowski@...sung.com>, "Mark Brown" <broonie@...nel.org>,
<gregkh@...uxfoundation.org>, <rafael@...nel.org>,
<linux-kernel@...r.kernel.org>, <baijiaju1990@...il.com>, "Qiu-ji Chen"
<chenqiuji666@...il.com>, <Aishwarya.TCV@....com>,
"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v5] driver core: enforce device_lock for
driver_match_device()
On Thu Jan 22, 2026 at 6:55 PM CET, Gui-Dong Han wrote:
> On Fri, Jan 23, 2026 at 1:28 AM Jon Hunter <jonathanh@...dia.com> wrote:
>>
>> Hi Danilo,
>>
>> On 21/01/2026 21:42, Danilo Krummrich wrote:
>> > On Wed Jan 21, 2026 at 9:00 PM CET, Jon Hunter wrote:
>> >> It is odd because it only appears to impact the Tegra194 Jetson Xavier
>> >> NX board (tegra194-p3509-0000+p3668-0000.dts).
>> >>
>> >> It appears to boot enough so the test can SSH into the device, but the
>> >> kernel log does not show us getting to the console prompt. It also
>> >> appears that a lot of drivers are not bound as expected. I would need to
>> >> check if those are all modules or not.
>> >
>> > The other reports were fixed by [1], but the issue in arm-smmu-qcom shouldn't be
>> > related in this case.
>> >
>> > I quickly checked all drivers with "tegra194" in their compatible string, but
>> > didn't see anything odd.
>> >
>> > Can you please try to enable CONFIG_LOCKDEP, CONFIG_PROVE_LOCKING,
>> > CONFIG_DEBUG_MUTEXES and see if you get a lockdep splat using the following
>> > diff?
>> >
>> > (You will see a lockdep warning in faux_bus_init(), it's harmless and can be
>> > ignored.)
>>
>> Thanks. I do see the lockdep warning in faux_bus_init() but that's the only
>> one. I have verified that all these CONFIGs are correctly enabled in the
>> build. The device boots fine with the below diff, but I am guessing that
>> that is expected?

Yes, that's expected; we are not actually taking the lock, but assert to lockdep
that we did. The fact that we use a dynamic lock class key for each device mutex
to avoid false positives should also be fine.

>> Any other thoughts?

With this diff, if I intentionally create a deadlock condition on my machine, I
do see a lockdep splat as expected.

Anyway, another option would be to attach a hardware debugger (I assume you
have TRACE32 or something available?) and then get a backtrace from the CPU
affected by the deadlock.

> Can you please try applying the following commit?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git/commit/?h=driver-core-linus&id=ed1ac3c977dd6b119405fa36dd41f7151bd5b4de
>
> Robin Murphy confirmed that the qcom specific issue might actually
> impact other hardware platforms (provided ARM_SMMU_QCOM/ARCH_QCOM is
> enabled), as the implementation init code is still executed:
>
> https://lore.kernel.org/driver-core/d2ddbb72-30a8-44da-b761-876b2d37567e@arm.com/
>
> So, this patch might fix the issue on Tegra as well.
I thought of that as well, but looking at the code in arm_smmu_impl_init(), it
seems that can't happen?

	if (of_device_is_compatible(np, "nvidia,tegra234-smmu") ||
	    of_device_is_compatible(np, "nvidia,tegra194-smmu") ||
	    of_device_is_compatible(np, "nvidia,tegra186-smmu"))
		return nvidia_smmu_impl_init(smmu);

	if (IS_ENABLED(CONFIG_ARM_SMMU_QCOM))
		smmu = qcom_smmu_impl_init(smmu);

But maybe there is some odd case where the first if condition does not evaluate
to true on tegra194, so maybe worth a try.
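
To make the control-flow argument concrete, here is a minimal userspace sketch of
the dispatch order quoted above. It is not the kernel code: struct fake_node,
fake_is_compatible(), fake_impl_init() and enum fake_impl are hypothetical
stand-ins, and the model assumes a node with a single compatible string (the real
of_device_is_compatible() checks every entry of the compatible list). It only
illustrates that the early return for the NVIDIA compatibles means the qcom path
is not reached for such a node.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a device tree node: one compatible string only. */
struct fake_node {
	const char *compatible;
};

/* Which branch of the (modelled) init function was taken. */
enum fake_impl {
	FAKE_IMPL_GENERIC,	/* neither branch matched */
	FAKE_IMPL_NVIDIA,	/* early return, like nvidia_smmu_impl_init() */
	FAKE_IMPL_QCOM_PATH,	/* qcom init code was reached */
};

/* Simplified model of of_device_is_compatible(): exact string match. */
bool fake_is_compatible(const struct fake_node *np, const char *compat)
{
	return np->compatible && strcmp(np->compatible, compat) == 0;
}

/* Mirrors the order of the two checks quoted from arm_smmu_impl_init(). */
enum fake_impl fake_impl_init(const struct fake_node *np, bool qcom_enabled)
{
	if (fake_is_compatible(np, "nvidia,tegra234-smmu") ||
	    fake_is_compatible(np, "nvidia,tegra194-smmu") ||
	    fake_is_compatible(np, "nvidia,tegra186-smmu"))
		return FAKE_IMPL_NVIDIA;

	/* Stand-in for IS_ENABLED(CONFIG_ARM_SMMU_QCOM). */
	if (qcom_enabled)
		return FAKE_IMPL_QCOM_PATH;

	return FAKE_IMPL_GENERIC;
}

/* Usage example: a tegra194 node never reaches the qcom path. */
void fake_selftest(void)
{
	struct fake_node t194 = { "nvidia,tegra194-smmu" };
	struct fake_node other = { "arm,mmu-500" };

	assert(fake_impl_init(&t194, true) == FAKE_IMPL_NVIDIA);
	assert(fake_impl_init(&other, true) == FAKE_IMPL_QCOM_PATH);
}
```

In this model the qcom path is only reachable when none of the three NVIDIA
strings match, which is the "odd case" mentioned above.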