lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <67630ed5-7ace-4e7f-9d26-3d259c381488@linux.intel.com>
Date: Wed, 19 Mar 2025 09:56:06 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Joerg Roedel <joro@...tes.org>, Will Deacon <will@...nel.org>,
 Robin Murphy <robin.murphy@....com>, Kevin Tian <kevin.tian@...el.com>,
 chaitanya.kumar.borah@...el.com
Cc: iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
 stable@...r.kernel.org
Subject: Re: [PATCH 1/1] iommu/vt-d: Fix possible circular locking dependency

On 3/17/25 11:57, Lu Baolu wrote:
> We have recently seen report of lockdep circular lock dependency warnings
> on platforms like skykale and kabylake:
> 
>   ======================================================
>   WARNING: possible circular locking dependency detected
>   6.14.0-rc6-CI_DRM_16276-gca2c04fe76e8+ #1 Not tainted
>   ------------------------------------------------------
>   swapper/0/1 is trying to acquire lock:
>   ffffffff8360ee48 (iommu_probe_device_lock){+.+.}-{3:3},
>     at: iommu_probe_device+0x1d/0x70
> 
>   but task is already holding lock:
>   ffff888102c7efa8 (&device->physical_node_lock){+.+.}-{3:3},
>     at: intel_iommu_init+0xe75/0x11f0
> 
>   which lock already depends on the new lock.
> 
>   the existing dependency chain (in reverse order) is:
> 
>   -> #6 (&device->physical_node_lock){+.+.}-{3:3}:
>          __mutex_lock+0xb4/0xe40
>          mutex_lock_nested+0x1b/0x30
>          intel_iommu_init+0xe75/0x11f0
>          pci_iommu_init+0x13/0x70
>          do_one_initcall+0x62/0x3f0
>          kernel_init_freeable+0x3da/0x6a0
>          kernel_init+0x1b/0x200
>          ret_from_fork+0x44/0x70
>          ret_from_fork_asm+0x1a/0x30
> 
>   -> #5 (dmar_global_lock){++++}-{3:3}:
>          down_read+0x43/0x1d0
>          enable_drhd_fault_handling+0x21/0x110
>          cpuhp_invoke_callback+0x4c6/0x870
>          cpuhp_issue_call+0xbf/0x1f0
>          __cpuhp_setup_state_cpuslocked+0x111/0x320
>          __cpuhp_setup_state+0xb0/0x220
>          irq_remap_enable_fault_handling+0x3f/0xa0
>          apic_intr_mode_init+0x5c/0x110
>          x86_late_time_init+0x24/0x40
>          start_kernel+0x895/0xbd0
>          x86_64_start_reservations+0x18/0x30
>          x86_64_start_kernel+0xbf/0x110
>          common_startup_64+0x13e/0x141
> 
>   -> #4 (cpuhp_state_mutex){+.+.}-{3:3}:
>          __mutex_lock+0xb4/0xe40
>          mutex_lock_nested+0x1b/0x30
>          __cpuhp_setup_state_cpuslocked+0x67/0x320
>          __cpuhp_setup_state+0xb0/0x220
>          page_alloc_init_cpuhp+0x2d/0x60
>          mm_core_init+0x18/0x2c0
>          start_kernel+0x576/0xbd0
>          x86_64_start_reservations+0x18/0x30
>          x86_64_start_kernel+0xbf/0x110
>          common_startup_64+0x13e/0x141
> 
>   -> #3 (cpu_hotplug_lock){++++}-{0:0}:
>          __cpuhp_state_add_instance+0x4f/0x220
>          iova_domain_init_rcaches+0x214/0x280
>          iommu_setup_dma_ops+0x1a4/0x710
>          iommu_device_register+0x17d/0x260
>          intel_iommu_init+0xda4/0x11f0
>          pci_iommu_init+0x13/0x70
>          do_one_initcall+0x62/0x3f0
>          kernel_init_freeable+0x3da/0x6a0
>          kernel_init+0x1b/0x200
>          ret_from_fork+0x44/0x70
>          ret_from_fork_asm+0x1a/0x30
> 
>   -> #2 (&domain->iova_cookie->mutex){+.+.}-{3:3}:
>          __mutex_lock+0xb4/0xe40
>          mutex_lock_nested+0x1b/0x30
>          iommu_setup_dma_ops+0x16b/0x710
>          iommu_device_register+0x17d/0x260
>          intel_iommu_init+0xda4/0x11f0
>          pci_iommu_init+0x13/0x70
>          do_one_initcall+0x62/0x3f0
>          kernel_init_freeable+0x3da/0x6a0
>          kernel_init+0x1b/0x200
>          ret_from_fork+0x44/0x70
>          ret_from_fork_asm+0x1a/0x30
> 
>   -> #1 (&group->mutex){+.+.}-{3:3}:
>          __mutex_lock+0xb4/0xe40
>          mutex_lock_nested+0x1b/0x30
>          __iommu_probe_device+0x24c/0x4e0
>          probe_iommu_group+0x2b/0x50
>          bus_for_each_dev+0x7d/0xe0
>          iommu_device_register+0xe1/0x260
>          intel_iommu_init+0xda4/0x11f0
>          pci_iommu_init+0x13/0x70
>          do_one_initcall+0x62/0x3f0
>          kernel_init_freeable+0x3da/0x6a0
>          kernel_init+0x1b/0x200
>          ret_from_fork+0x44/0x70
>          ret_from_fork_asm+0x1a/0x30
> 
>   -> #0 (iommu_probe_device_lock){+.+.}-{3:3}:
>          __lock_acquire+0x1637/0x2810
>          lock_acquire+0xc9/0x300
>          __mutex_lock+0xb4/0xe40
>          mutex_lock_nested+0x1b/0x30
>          iommu_probe_device+0x1d/0x70
>          intel_iommu_init+0xe90/0x11f0
>          pci_iommu_init+0x13/0x70
>          do_one_initcall+0x62/0x3f0
>          kernel_init_freeable+0x3da/0x6a0
>          kernel_init+0x1b/0x200
>          ret_from_fork+0x44/0x70
>          ret_from_fork_asm+0x1a/0x30
> 
>   other info that might help us debug this:
> 
>   Chain exists of:
>     iommu_probe_device_lock --> dmar_global_lock -->
>       &device->physical_node_lock
> 
>    Possible unsafe locking scenario:
> 
>          CPU0                    CPU1
>          ----                    ----
>     lock(&device->physical_node_lock);
>                                  lock(dmar_global_lock);
>                                  lock(&device->physical_node_lock);
>     lock(iommu_probe_device_lock);
> 
>    *** DEADLOCK ***
> 
> This driver uses a global lock to protect the list of enumerated DMA
> remapping units. It is necessary due to the driver's support for dynamic
> addition and removal of remapping units at runtime.
> 
> Two distinct code paths require iteration over this remapping unit list:
> 
> - Device registration and probing: the driver iterates the list to
>    register each remapping unit with the upper layer IOMMU framework
>    and subsequently probe the devices managed by that unit.
> - Global configuration: Upper layer components may also iterate the list
>    to apply configuration changes.
> 
> The lock acquisition order between these two code paths was reversed. This
> caused lockdep warnings, indicating a risk of deadlock. Fix this warning
> by releasing the global lock before invoking upper layer interfaces for
> device registration.
> 
> Fixes: b150654f74bf ("iommu/vt-d: Fix suspicious RCU usage")
> Closes:https://lore.kernel.org/linux-iommu/ 
> SJ1PR11MB612953431F94F18C954C4A9CB9D32@...PR11MB6129.namprd11.prod.outlook.com/
> Cc:stable@...r.kernel.org
> Signed-off-by: Lu Baolu<baolu.lu@...ux.intel.com>

Queued this patch for iommu tree.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ