Message-ID: <BN9PR11MB5276EEC28691FD6C77EC493A8CC42@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Thu, 20 Feb 2025 07:21:29 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Lu Baolu <baolu.lu@...ux.intel.com>, Joerg Roedel <joro@...tes.org>, "Will
Deacon" <will@...nel.org>, Robin Murphy <robin.murphy@....com>
CC: Ido Schimmel <idosch@...sch.org>, "iommu@...ts.linux.dev"
<iommu@...ts.linux.dev>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "stable@...r.kernel.org"
<stable@...r.kernel.org>
Subject: RE: [PATCH 1/1] iommu/vt-d: Fix suspicious RCU usage
> From: Lu Baolu <baolu.lu@...ux.intel.com>
> Sent: Tuesday, February 18, 2025 10:24 AM
>
> Commit <d74169ceb0d2> ("iommu/vt-d: Allocate DMAR fault interrupts
> locally") moved the call to enable_drhd_fault_handling() to a code
> path that does not hold any lock while traversing the drhd list. Fix
> it by ensuring the dmar_global_lock lock is held when traversing the
> drhd list.
>
> Without this fix, the following warning is triggered:
> =============================
> WARNING: suspicious RCU usage
> 6.14.0-rc3 #55 Not tainted
> -----------------------------
> drivers/iommu/intel/dmar.c:2046 RCU-list traversed in non-reader section!!
> other info that might help us debug this:
> rcu_scheduler_active = 1, debug_locks = 1
> 2 locks held by cpuhp/1/23:
> #0: ffffffff84a67c50 (cpu_hotplug_lock){++++}-{0:0}, at:
> cpuhp_thread_fun+0x87/0x2c0
> #1: ffffffff84a6a380 (cpuhp_state-up){+.+.}-{0:0}, at:
> cpuhp_thread_fun+0x87/0x2c0
> stack backtrace:
> CPU: 1 UID: 0 PID: 23 Comm: cpuhp/1 Not tainted 6.14.0-rc3 #55
> Call Trace:
> <TASK>
> dump_stack_lvl+0xb7/0xd0
> lockdep_rcu_suspicious+0x159/0x1f0
> ? __pfx_enable_drhd_fault_handling+0x10/0x10
> enable_drhd_fault_handling+0x151/0x180
> cpuhp_invoke_callback+0x1df/0x990
> cpuhp_thread_fun+0x1ea/0x2c0
> smpboot_thread_fn+0x1f5/0x2e0
> ? __pfx_smpboot_thread_fn+0x10/0x10
> kthread+0x12a/0x2d0
> ? __pfx_kthread+0x10/0x10
> ret_from_fork+0x4a/0x60
> ? __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1a/0x30
> </TASK>
>
> Simply holding the lock in enable_drhd_fault_handling() will trigger a
> lock order splat. Avoid holding the dmar_global_lock when calling
> iommu_device_register(), which starts the device probe process.
Can you elaborate on the splat? It is not obvious to me from a quick
read of the code, and iommu_device_register() does not appear in the
call trace above.
>
> Fixes: d74169ceb0d2 ("iommu/vt-d: Allocate DMAR fault interrupts locally")
> Reported-by: Ido Schimmel <idosch@...sch.org>
> Closes: https://lore.kernel.org/linux-iommu/Zx9OwdLIc_VoQ0-
> a@...edder.mtl.com/
> Cc: stable@...r.kernel.org
> Signed-off-by: Lu Baolu <baolu.lu@...ux.intel.com>
> ---
> drivers/iommu/intel/dmar.c | 1 +
> drivers/iommu/intel/iommu.c | 7 +++++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index 9f424acf474e..e540092d664d 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -2043,6 +2043,7 @@ int enable_drhd_fault_handling(unsigned int cpu)
> /*
> * Enable fault control interrupt.
> */
> + guard(rwsem_read)(&dmar_global_lock);
> for_each_iommu(iommu, drhd) {
> u32 fault_status;
> int ret;
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index cc46098f875b..9a1e61b429ca 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -3146,7 +3146,14 @@ int __init intel_iommu_init(void)
> iommu_device_sysfs_add(&iommu->iommu, NULL,
> intel_iommu_groups,
> "%s", iommu->name);
> + /*
> + * The iommu device probe is protected by the iommu_probe_device_lock.
> + * Release the dmar_global_lock before entering the device probe path
> + * to avoid unnecessary lock order splat.
> + */
> + up_read(&dmar_global_lock);
> iommu_device_register(&iommu->iommu,
> &intel_iommu_ops, NULL);
> + down_read(&dmar_global_lock);
>
> iommu_pmu_register(iommu);
> }
> --
> 2.43.0