Message-ID: <92d4e37f-9bc0-101d-dd78-ae9957348c0d@linux.intel.com>
Date: Sat, 25 Mar 2023 20:59:50 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: kan.liang@...ux.intel.com, joro@...tes.org, will@...nel.org,
dwmw2@...radead.org, robin.murphy@....com, iommu@...ts.linux.dev,
linux-kernel@...r.kernel.org
Cc: baolu.lu@...ux.intel.com
Subject: Re: [PATCH] iommu/vt-d: Fix an IOMMU perfmon warning when CPU hotplug
On 2023/3/24 23:11, kan.liang@...ux.intel.com wrote:
> From: Kan Liang <kan.liang@...ux.intel.com>
>
> A warning can be triggered when offlining CPU 0:
> $ echo 0 > /sys/devices/system/cpu/cpu0/online
>
> [11958.737635] ------------[ cut here ]------------
> [11958.742882] Voluntary context switch within RCU read-side critical section!
> [11958.742891] WARNING: CPU: 0 PID: 19 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x4f4/0x580
> [11958.860095] RIP: 0010:rcu_note_context_switch+0x4f4/0x580
> [11958.960360] Call Trace:
> [11958.963161] <TASK>
> [11958.965565] ? perf_event_update_userpage+0x104/0x150
> [11958.971293] __schedule+0x8d/0x960
> [11958.975165] ? perf_event_set_state.part.82+0x11/0x50
> [11958.980891] schedule+0x44/0xb0
> [11958.984464] schedule_timeout+0x226/0x310
> [11958.989017] ? __perf_event_disable+0x64/0x1a0
> [11958.994054] ? _raw_spin_unlock+0x14/0x30
> [11958.998605] wait_for_completion+0x94/0x130
> [11959.003352] __wait_rcu_gp+0x108/0x130
> [11959.007616] synchronize_rcu+0x67/0x70
> [11959.011876] ? invoke_rcu_core+0xb0/0xb0
> [11959.016333] ? __bpf_trace_rcu_stall_warning+0x10/0x10
> [11959.022147] perf_pmu_migrate_context+0x121/0x370
> [11959.027478] iommu_pmu_cpu_offline+0x6a/0xa0
> [11959.032325] ? iommu_pmu_del+0x1e0/0x1e0
> [11959.036782] cpuhp_invoke_callback+0x129/0x510
> [11959.041825] cpuhp_thread_fun+0x94/0x150
> [11959.046283] smpboot_thread_fn+0x183/0x220
> [11959.050933] ? sort_range+0x20/0x20
> [11959.054902] kthread+0xe6/0x110
> [11959.058479] ? kthread_complete_and_exit+0x20/0x20
> [11959.063911] ret_from_fork+0x1f/0x30
> [11959.067982] </TASK>
> [11959.070489] ---[ end trace 0000000000000000 ]---
>
> perf_pmu_migrate_context() invokes synchronize_rcu() when migrating a
> PMU to a new CPU, and synchronize_rcu() may sleep. However, the current
> for_each_iommu() walk runs inside an RCU read-side critical section,
> where sleeping is forbidden.
>
> Replace the RCU read lock with the dmar_global_lock when walking the
> drhd list.
This change introduces the lockdep warning below. The
iommu_pmu_register() path holds the dmar_global_lock and then acquires
the cpuhp_state lock, while the iommu_pmu_cpu_offline() path holds the
cpuhp_state lock and then tries to acquire the dmar_global_lock. Could
you please look into it?
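
Schematically, the two paths take the locks in opposite orders (the
state name and callbacks below are placeholders, simplified from the
splat):

/* Boot: iommu_pmu_register() runs with dmar_global_lock held (taken in
 * intel_iommu_init()) and registers a hotplug state, which internally
 * takes cpu_hotplug_lock: dmar_global_lock --> cpu_hotplug_lock.
 */
static void boot_path(void)
{
	down_write(&dmar_global_lock);
	cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "iommu/pmu:online",
			  NULL, iommu_pmu_cpu_offline);
	up_write(&dmar_global_lock);
}

/* Hotplug: the cpuhp thread already holds cpu_hotplug_lock and
 * cpuhp_state-down when it invokes the teardown callback, which now
 * takes dmar_global_lock -- the reverse order, hence the
 * circular-dependency report.
 */
static int offline_path(unsigned int cpu)
{
	down_write(&dmar_global_lock);
	/* migrate the PMU context, then: */
	up_write(&dmar_global_lock);
	return 0;
}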
[12861.883945] ======================================================
[12861.884750] WARNING: possible circular locking dependency detected
[12861.885556] 6.3.0-rc3-00033-g5ff8930d087a #491 Not tainted
[12861.886272] ------------------------------------------------------
[12861.887077] cpuhp/0/19 is trying to acquire lock:
[12861.887703] ffffffff91c1d0d0 (dmar_global_lock){++++}-{3:3}, at: iommu_pmu_cpu_offline+0x62/0xd0
[12861.888839]
but task is already holding lock:
[12861.889605] ffffffff8e26ed40 (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x4b/0x210
[12861.890701]
which lock already depends on the new lock.
[12861.891747]
the existing dependency chain (in reverse order) is:
[12861.892713]
-> #2 (cpuhp_state-down){+.+.}-{0:0}:
[12861.893525] lock_acquire+0xc2/0x2a0
[12861.894074] cpuhp_kick_ap_work+0x5b/0x1d0
[12861.894692] _cpu_down+0xc3/0x300
[12861.895210] cpu_down+0x2e/0x50
[12861.895708] device_offline+0x8c/0xc0
[12861.896266] online_store+0x4e/0xa0
[12861.896804] kernfs_fop_write_iter+0x12a/0x1d0
[12861.897462] vfs_write+0x313/0x4c0
[12861.897989] ksys_write+0x60/0xe0
[12861.898507] do_syscall_64+0x43/0x90
[12861.899055] entry_SYSCALL_64_after_hwframe+0x72/0xdc
[12861.899783]
-> #1 (cpu_hotplug_lock){++++}-{0:0}:
[12861.900595] lock_acquire+0xc2/0x2a0
[12861.901143] __cpuhp_setup_state+0x5b/0x1f0
[12861.901772] iommu_pmu_register+0x1a2/0x200
[12861.902400] intel_iommu_init+0x5de/0x880
[12861.903009] pci_iommu_init+0x12/0x40
[12861.903567] do_one_initcall+0x65/0x290
[12861.904155] kernel_init_freeable+0x3cf/0x560
[12861.904804] kernel_init+0x1a/0x140
[12861.905342] ret_from_fork+0x29/0x50
[12861.905890]
-> #0 (dmar_global_lock){++++}-{3:3}:
[12861.906702] check_prevs_add+0x160/0xee0
[12861.907300] __lock_acquire+0x143b/0x18c0
[12861.907909] lock_acquire+0xc2/0x2a0
[12861.908457] down_write+0x3f/0xc0
[12861.908976] iommu_pmu_cpu_offline+0x62/0xd0
[12861.909614] cpuhp_invoke_callback+0x175/0x670
[12861.910273] cpuhp_thread_fun+0x19d/0x210
[12861.910882] smpboot_thread_fn+0x11d/0x240
[12861.911501] kthread+0xeb/0x120
[12861.911998] ret_from_fork+0x29/0x50
[12861.912546]
other info that might help us debug this:
[12861.913572] Chain exists of:
dmar_global_lock --> cpu_hotplug_lock --> cpuhp_state-down
[12861.914977] Possible unsafe locking scenario:
[12861.915753]        CPU0                    CPU1
[12861.916358]        ----                    ----
[12861.916963]   lock(cpuhp_state-down);
[12861.917460]                                lock(cpu_hotplug_lock);
[12861.918269]                                lock(cpuhp_state-down);
[12861.919077]   lock(dmar_global_lock);
[12861.919575]
*** DEADLOCK ***
[12861.920350] 2 locks held by cpuhp/0/19:
[12861.920865] #0: ffffffff8e26ebd0 (cpu_hotplug_lock){++++}-{0:0}, at: cpuhp_thread_fun+0x47/0x210
[12861.922014] #1: ffffffff8e26ed40 (cpuhp_state-down){+.+.}-{0:0}, at: cpuhp_thread_fun+0x4b/0x210
[12861.923162]
stack backtrace:
[12861.923747] CPU: 0 PID: 19 Comm: cpuhp/0 Not tainted 6.3.0-rc3-00033-g5ff8930d087a #491
[12861.924775] Hardware name: Intel Corporation BIRCHSTREAM/BIRCHSTREAM, BIOS BHSDREL1.86B.0018.D47.2211031932 11/03/2022
[12861.926133] Call Trace:
[12861.926478] <TASK>
[12861.926783] dump_stack_lvl+0x4a/0x80
[12861.927282] check_noncircular+0x102/0x120
[12861.927833] ? alloc_list_entry+0x13/0x130
[12861.928384] check_prevs_add+0x160/0xee0
[12861.928918] __lock_acquire+0x143b/0x18c0
[12861.929461] lock_acquire+0xc2/0x2a0
[12861.929950] ? iommu_pmu_cpu_offline+0x62/0xd0
[12861.930550] ? lock_is_held_type+0x9d/0x110
[12861.931113] down_write+0x3f/0xc0
[12861.931572] ? iommu_pmu_cpu_offline+0x62/0xd0
[12861.932171] iommu_pmu_cpu_offline+0x62/0xd0
[12861.932749] ? __pfx_iommu_pmu_cpu_offline+0x10/0x10
[12861.933408] cpuhp_invoke_callback+0x175/0x670
[12861.934007] ? lock_release+0xc3/0x250
[12861.934519] cpuhp_thread_fun+0x19d/0x210
[12861.935059] ? smpboot_thread_fn+0x2f/0x240
[12861.935618] ? smpboot_thread_fn+0x69/0x240
[12861.936177] ? smpboot_thread_fn+0x125/0x240
[12861.936757] smpboot_thread_fn+0x11d/0x240
[12861.937308] ? __pfx_smpboot_thread_fn+0x10/0x10
[12861.937927] kthread+0xeb/0x120
[12861.938365] ? __pfx_kthread+0x10/0x10
[12861.938874] ret_from_fork+0x29/0x50
[12861.939370] </TASK>
Best regards,
baolu