Message-ID: <b8b215e6-7447-4fbb-a408-20e518c8da4c@linux.intel.com>
Date: Fri, 21 Feb 2025 16:27:22 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: "Tian, Kevin" <kevin.tian@...el.com>, Joerg Roedel <joro@...tes.org>,
Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>
Cc: baolu.lu@...ux.intel.com, Ido Schimmel <idosch@...sch.org>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
"stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH 1/1] iommu/vt-d: Fix suspicious RCU usage
On 2025/2/21 15:22, Tian, Kevin wrote:
>> From: Baolu Lu<baolu.lu@...ux.intel.com>
>> Sent: Thursday, February 20, 2025 7:38 PM
>>
>> On 2025/2/20 15:21, Tian, Kevin wrote:
>>>> From: Lu Baolu<baolu.lu@...ux.intel.com>
>>>> Sent: Tuesday, February 18, 2025 10:24 AM
>>>>
>>>> Commit <d74169ceb0d2> ("iommu/vt-d: Allocate DMAR fault interrupts
>>>> locally") moved the call to enable_drhd_fault_handling() to a code
>>>> path that does not hold any lock while traversing the drhd list. Fix
>>>> it by ensuring that dmar_global_lock is held when traversing the
>>>> drhd list.
>>>>
>>>> Without this fix, the following warning is triggered:
>>>> =============================
>>>> WARNING: suspicious RCU usage
>>>> 6.14.0-rc3 #55 Not tainted
>>>> -----------------------------
>>>> drivers/iommu/intel/dmar.c:2046 RCU-list traversed in non-reader section!!
>>>> other info that might help us debug this:
>>>> rcu_scheduler_active = 1, debug_locks = 1
>>>> 2 locks held by cpuhp/1/23:
>>>> #0: ffffffff84a67c50 (cpu_hotplug_lock){++++}-{0:0}, at:
>>>> cpuhp_thread_fun+0x87/0x2c0
>>>> #1: ffffffff84a6a380 (cpuhp_state-up){+.+.}-{0:0}, at:
>>>> cpuhp_thread_fun+0x87/0x2c0
>>>> stack backtrace:
>>>> CPU: 1 UID: 0 PID: 23 Comm: cpuhp/1 Not tainted 6.14.0-rc3 #55
>>>> Call Trace:
>>>> <TASK>
>>>> dump_stack_lvl+0xb7/0xd0
>>>> lockdep_rcu_suspicious+0x159/0x1f0
>>>> ? __pfx_enable_drhd_fault_handling+0x10/0x10
>>>> enable_drhd_fault_handling+0x151/0x180
>>>> cpuhp_invoke_callback+0x1df/0x990
>>>> cpuhp_thread_fun+0x1ea/0x2c0
>>>> smpboot_thread_fn+0x1f5/0x2e0
>>>> ? __pfx_smpboot_thread_fn+0x10/0x10
>>>> kthread+0x12a/0x2d0
>>>> ? __pfx_kthread+0x10/0x10
>>>> ret_from_fork+0x4a/0x60
>>>> ? __pfx_kthread+0x10/0x10
>>>> ret_from_fork_asm+0x1a/0x30
>>>> </TASK>
>>>>
>>>> Simply holding the lock in enable_drhd_fault_handling() will trigger a
>>>> lock order splat. Avoid holding the dmar_global_lock when calling
>>>> iommu_device_register(), which starts the device probe process.
>>> Can you elaborate on the splat issue? It's not intuitive to me from a quick
>>> read of the code, and iommu_device_register() does not appear in the above
>>> call stack.
>> The lockdep splat looks like below:
> Thanks, it's clear now. You could probably expand "to avoid unnecessary
> lock order splat" a little to call out the deadlock between dmar_global_lock
> and cpu_hotplug_lock (acquired in the iommu_device_register() path).
Yes, sure.
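
For reference, the ordering problem discussed above can be sketched in
plain userspace C. This is only a toy model of an order-inversion check,
not how lockdep is implemented and not code from the patch: the enum
names and record_order() are hypothetical, made up for illustration. It
assumes the two paths are (a) the CPU hotplug callback, which already
holds cpu_hotplug_lock and would then take dmar_global_lock, and (b) the
registration path, which would hold dmar_global_lock while
iommu_device_register() ends up taking cpu_hotplug_lock.

```c
#include <stdbool.h>

/* Hypothetical IDs standing in for the two kernel locks (not kernel API). */
enum lock_id { CPU_HOTPLUG_LOCK, DMAR_GLOBAL_LOCK, NR_LOCKS };

/* seen[a][b] is true once "b acquired while a is held" has been observed. */
static bool seen[NR_LOCKS][NR_LOCKS];

/*
 * Record that `acquired` was taken while `held` was already held.
 * Returns true if the opposite order was recorded earlier, which means
 * the two code paths can deadlock against each other (ABBA).
 */
bool record_order(enum lock_id held, enum lock_id acquired)
{
	seen[held][acquired] = true;
	return seen[acquired][held];
}
```

Calling record_order(CPU_HOTPLUG_LOCK, DMAR_GLOBAL_LOCK) first models the
hotplug callback and reports no inversion. A later
record_order(DMAR_GLOBAL_LOCK, CPU_HOTPLUG_LOCK) models the registration
path and reports the inversion, which is why the lock has to be dropped
before calling iommu_device_register().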