linux-kernel - Re: [PATCH 1/1] iommu/vt-d: Fix suspicious RCU usage

Message-ID: <b8b215e6-7447-4fbb-a408-20e518c8da4c@linux.intel.com>
Date: Fri, 21 Feb 2025 16:27:22 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: "Tian, Kevin" <kevin.tian@...el.com>, Joerg Roedel <joro@...tes.org>,
 Will Deacon <will@...nel.org>, Robin Murphy <robin.murphy@....com>
Cc: baolu.lu@...ux.intel.com, Ido Schimmel <idosch@...sch.org>,
 "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 "stable@...r.kernel.org" <stable@...r.kernel.org>
Subject: Re: [PATCH 1/1] iommu/vt-d: Fix suspicious RCU usage

On 2025/2/21 15:22, Tian, Kevin wrote:
>> From: Baolu Lu<baolu.lu@...ux.intel.com>
>> Sent: Thursday, February 20, 2025 7:38 PM
>>
>> On 2025/2/20 15:21, Tian, Kevin wrote:
>>>> From: Lu Baolu<baolu.lu@...ux.intel.com>
>>>> Sent: Tuesday, February 18, 2025 10:24 AM
>>>>
>>>> Commit <d74169ceb0d2> ("iommu/vt-d: Allocate DMAR fault interrupts
>>>> locally") moved the call to enable_drhd_fault_handling() to a code
>>>> path that does not hold any lock while traversing the drhd list. Fix
>>>> it by ensuring that dmar_global_lock is held when traversing the
>>>> drhd list.
>>>>
>>>> Without this fix, the following warning is triggered:
>>>>    =============================
>>>>    WARNING: suspicious RCU usage
>>>>    6.14.0-rc3 #55 Not tainted
>>>>    -----------------------------
>>>>    drivers/iommu/intel/dmar.c:2046 RCU-list traversed in non-reader section!!
>>>>    other info that might help us debug this:
>>>>    rcu_scheduler_active = 1, debug_locks = 1
>>>>    2 locks held by cpuhp/1/23:
>>>>    #0: ffffffff84a67c50 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x87/0x2c0
>>>>    #1: ffffffff84a6a380 (cpuhp_state-up){+.+.}-{0:0}, at: cpuhp_thread_fun+0x87/0x2c0
>>>>    stack backtrace:
>>>>    CPU: 1 UID: 0 PID: 23 Comm: cpuhp/1 Not tainted 6.14.0-rc3 #55
>>>>    Call Trace:
>>>>     <TASK>
>>>>     dump_stack_lvl+0xb7/0xd0
>>>>     lockdep_rcu_suspicious+0x159/0x1f0
>>>>     ? __pfx_enable_drhd_fault_handling+0x10/0x10
>>>>     enable_drhd_fault_handling+0x151/0x180
>>>>     cpuhp_invoke_callback+0x1df/0x990
>>>>     cpuhp_thread_fun+0x1ea/0x2c0
>>>>     smpboot_thread_fn+0x1f5/0x2e0
>>>>     ? __pfx_smpboot_thread_fn+0x10/0x10
>>>>     kthread+0x12a/0x2d0
>>>>     ? __pfx_kthread+0x10/0x10
>>>>     ret_from_fork+0x4a/0x60
>>>>     ? __pfx_kthread+0x10/0x10
>>>>     ret_from_fork_asm+0x1a/0x30
>>>>     </TASK>
>>>>
>>>> Simply holding the lock in enable_drhd_fault_handling() will trigger a
>>>> lock order splat. Avoid holding the dmar_global_lock when calling
>>>> iommu_device_register(), which starts the device probe process.
>>> Can you elaborate on the splat issue? It's not intuitive to me from a quick
>>> read of the code, and iommu_device_register() does not appear in the above
>>> call stack.
>> The lockdep splat looks like below:
> Thanks, it's clear now. You could probably expand "to avoid unnecessary
> lock order splat" a little to point out the deadlock between dmar_global_lock
> and cpu_hotplug_lock (acquired in the iommu_device_register() path).

Yes, sure.
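
For readers following the thread, the ordering problem discussed above can be
sketched in plain user-space C. This is only a toy model, not the patch and
not how lockdep is implemented: the lock IDs, toy_lock(), hotplug_callback(),
init_path() and count_inversions() are all hypothetical names chosen for
illustration. HOTPLUG_LOCK stands in for cpu_hotplug_lock and
GLOBAL_LIST_LOCK for dmar_global_lock. The tracker records which lock was
taken while another was held and counts an inversion when it later sees the
opposite order.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for cpu_hotplug_lock and dmar_global_lock. */
enum lock_id { HOTPLUG_LOCK, GLOBAL_LIST_LOCK, NR_LOCKS };

static bool held[NR_LOCKS];
/* order[a][b] is true once lock b has been taken while lock a was held. */
static bool order[NR_LOCKS][NR_LOCKS];
static int inversions;

static void toy_reset(void)
{
	memset(held, 0, sizeof(held));
	memset(order, 0, sizeof(order));
	inversions = 0;
}

static void toy_lock(enum lock_id l)
{
	for (int h = 0; h < NR_LOCKS; h++) {
		if (!held[h] || h == (int)l)
			continue;
		if (order[l][h])	/* the opposite order was seen earlier */
			inversions++;
		order[h][l] = true;
	}
	held[l] = true;
}

static void toy_unlock(enum lock_id l)
{
	assert(held[l]);
	held[l] = false;
}

/* Models the hotplug callback: hotplug lock first, then the list lock. */
static void hotplug_callback(void)
{
	toy_lock(HOTPLUG_LOCK);
	toy_lock(GLOBAL_LIST_LOCK);
	/* ... traverse the protected list here ... */
	toy_unlock(GLOBAL_LIST_LOCK);
	toy_unlock(HOTPLUG_LOCK);
}

/* Models device registration, which ends up taking the hotplug lock. */
static void register_device(void)
{
	toy_lock(HOTPLUG_LOCK);
	toy_unlock(HOTPLUG_LOCK);
}

/* Models the init path, with or without dropping the list lock first. */
static void init_path(bool drop_list_lock_first)
{
	toy_lock(GLOBAL_LIST_LOCK);
	/* ... set up the list entries here ... */
	if (drop_list_lock_first) {
		toy_unlock(GLOBAL_LIST_LOCK);
		register_device();
	} else {
		register_device();
		toy_unlock(GLOBAL_LIST_LOCK);
	}
}

/* Runs both paths once and returns the number of inversions observed. */
static int count_inversions(bool drop_list_lock_first)
{
	toy_reset();
	hotplug_callback();
	init_path(drop_list_lock_first);
	return inversions;
}
```

In this model, count_inversions(false) reports one inversion (the list lock
is still held when registration takes the hotplug lock), while
count_inversions(true) reports none, which is the shape of the change
described in the commit message: do not hold the list lock across
registration.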