Message-ID: <cfb23a18-ee56-4d3d-b0cf-fd47c0dc6f4b@intel.com>
Date: Wed, 13 Mar 2024 17:37:18 -0700
From: Dave Jiang <dave.jiang@...el.com>
To: Fenghua Yu <fenghua.yu@...el.com>, Vinod Koul <vkoul@...nel.org>
Cc: dmaengine@...r.kernel.org, linux-kernel <linux-kernel@...r.kernel.org>,
Terrence Xu <terrence.xu@...el.com>, "Zanussi, Tom" <tom.zanussi@...el.com>
Subject: Re: [PATCH] dmaengine: idxd: Fix oops during rmmod on single-CPU
platforms
On 3/13/24 2:40 PM, Fenghua Yu wrote:
> During removal of the idxd driver, the registered CPU offline callback is
> invoked as part of the cleanup process. However, on systems with only
> one CPU online, no valid target is available to migrate the
> perf context to, resulting in a kernel oops:
>
> BUG: unable to handle page fault for address: 000000000002a2b8
> #PF: supervisor write access in kernel mode
> #PF: error_code(0x0002) - not-present page
> PGD 1470e1067 P4D 0
> Oops: 0002 [#1] PREEMPT SMP NOPTI
> CPU: 0 PID: 20 Comm: cpuhp/0 Not tainted 6.8.0-rc6-dsa+ #57
> Hardware name: Intel Corporation AvenueCity/AvenueCity, BIOS BHSDCRB1.86B.2492.D03.2307181620 07/18/2023
> RIP: 0010:mutex_lock+0x2e/0x50
> ...
> Call Trace:
> <TASK>
> __die+0x24/0x70
> page_fault_oops+0x82/0x160
> do_user_addr_fault+0x65/0x6b0
> __pfx___rdmsr_safe_on_cpu+0x10/0x10
> exc_page_fault+0x7d/0x170
> asm_exc_page_fault+0x26/0x30
> mutex_lock+0x2e/0x50
> mutex_lock+0x1e/0x50
> perf_pmu_migrate_context+0x87/0x1f0
> perf_event_cpu_offline+0x76/0x90 [idxd]
> cpuhp_invoke_callback+0xa2/0x4f0
> __pfx_perf_event_cpu_offline+0x10/0x10 [idxd]
> cpuhp_thread_fun+0x98/0x150
> smpboot_thread_fn+0x27/0x260
> smpboot_thread_fn+0x1af/0x260
> __pfx_smpboot_thread_fn+0x10/0x10
> kthread+0x103/0x140
> __pfx_kthread+0x10/0x10
> ret_from_fork+0x31/0x50
> __pfx_kthread+0x10/0x10
> ret_from_fork_asm+0x1b/0x30
> </TASK>
>
> Fix the issue by preventing the migration of the perf context to an
> invalid target.
>
> Fixes: 81dd4d4d6178 ("dmaengine: idxd: Add IDXD performance monitor support")
> Reported-by: Terrence Xu <terrence.xu@...el.com>
> Tested-by: Terrence Xu <terrence.xu@...el.com>
> Signed-off-by: Fenghua Yu <fenghua.yu@...el.com>
Cc: Tom Zanussi
> ---
> drivers/dma/idxd/perfmon.c | 9 +++------
> 1 file changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/dma/idxd/perfmon.c b/drivers/dma/idxd/perfmon.c
> index fdda6d604262..5e94247e1ea7 100644
> --- a/drivers/dma/idxd/perfmon.c
> +++ b/drivers/dma/idxd/perfmon.c
> @@ -528,14 +528,11 @@ static int perf_event_cpu_offline(unsigned int cpu, struct hlist_node *node)
>  		return 0;
>
>  	target = cpumask_any_but(cpu_online_mask, cpu);
> -
>  	/* migrate events if there is a valid target */
> -	if (target < nr_cpu_ids)
> +	if (target < nr_cpu_ids) {
>  		cpumask_set_cpu(target, &perfmon_dsa_cpu_mask);
> -	else
> -		target = -1;
> -
> -	perf_pmu_migrate_context(&idxd_pmu->pmu, cpu, target);
> +		perf_pmu_migrate_context(&idxd_pmu->pmu, cpu, target);
> +	}
>
>  	return 0;
>  }
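
For anyone following along, the control flow after the patch can be modelled in userspace roughly as below. This is only a sketch under stated assumptions, not the driver code: NR_CPU_IDS, any_online_but(), migrate_context(), cpu_offline(), migrate_calls and last_target are hypothetical stand-ins for nr_cpu_ids, cpumask_any_but(), perf_pmu_migrate_context() and perf_event_cpu_offline(), and the online mask is a plain bitmask rather than a struct cpumask. The assert() in migrate_context() models the requirement that the destination be a real CPU, which the old code violated by passing -1.

```c
#include <assert.h>

#define NR_CPU_IDS 4u	/* hypothetical CPU count for this sketch */

/* Bookkeeping so a caller can observe what the sketch did. */
static int migrate_calls;
static int last_target = -1;

/*
 * Stand-in for cpumask_any_but(): return some set bit in online_mask
 * other than cpu, or NR_CPU_IDS when no such bit exists.
 */
static unsigned int any_online_but(unsigned int online_mask, unsigned int cpu)
{
	for (unsigned int i = 0; i < NR_CPU_IDS; i++)
		if (i != cpu && (online_mask & (1u << i)))
			return i;
	return NR_CPU_IDS;
}

/*
 * Stand-in for perf_pmu_migrate_context(). The trace in the commit
 * message shows the real function faulting in mutex_lock() when it is
 * handed an invalid target, so this model simply refuses one.
 */
static void migrate_context(unsigned int src, int dst)
{
	(void)src;
	assert(dst >= 0 && (unsigned int)dst < NR_CPU_IDS);
	migrate_calls++;
	last_target = dst;
}

/* Post-patch shape of the offline callback: migrate only to a valid CPU. */
static int cpu_offline(unsigned int online_mask, unsigned int cpu)
{
	unsigned int target = any_online_but(online_mask, cpu);

	if (target < NR_CPU_IDS)
		migrate_context(cpu, (int)target);

	return 0;
}
```

With only CPU 0 online, cpu_offline(0x1, 0) returns 0 without touching migrate_context(), which is the single-CPU rmmod case from the report. With CPUs 0 and 2 online, cpu_offline(0x5, 0) migrates to CPU 2.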