Message-ID: <Zrul5kzUc-5BfWcT@google.com>
Date: Tue, 13 Aug 2024 11:28:54 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: Naresh Kamboju <naresh.kamboju@...aro.org>
Cc: pengfei.xu@...el.com, kan.liang@...ux.intel.com,
	linux-kernel@...r.kernel.org, linux-tip-commits@...r.kernel.org,
	peterz@...radead.org, syzkaller-bugs@...glegroups.com,
	x86@...nel.org, lkft-triage@...ts.linaro.org,
	dan.carpenter@...aro.org, anders.roxell@...aro.org, arnd@...db.de,
	Linux Kernel Functional Testing <lkft@...aro.org>,
	Andrii Nakryiko <andrii.nakryiko@...il.com>
Subject: Re: [tip: perf/core] perf: Fix event_function_call() locking

Hello,

On Tue, Aug 13, 2024 at 08:49:59PM +0530, Naresh Kamboju wrote:
> While running LTP test cases splice07 and perf_event_open01, we found the
> following kernel BUG on the arm64 device juno-r2 and on qemu-arm64 with the
> Linux next-20240812 and next-20240813 tags.
> 
>   GOOD: next-20240809
>   BAD: next-20240812
> 
> Reported-by: Linux Kernel Functional Testing <lkft@...aro.org>
> 
> Test log:
> --------
> [ 2278.760258] check_preemption_disabled: 15 callbacks suppressed
> [ 2278.760282] BUG: using smp_processor_id() in preemptible [00000000] code: perf_event_open/111076
> [ 2278.775032] caller is debug_smp_processor_id+0x20/0x30
> [ 2278.780270] CPU: 5 UID: 0 PID: 111076 Comm: perf_event_open Not tainted 6.11.0-rc3-next-20240812 #1
> [ 2278.789344] Hardware name: ARM Juno development board (r2) (DT)
> [ 2278.795276] Call trace:
> [ 2278.797724]  dump_backtrace+0x9c/0x128
> [ 2278.801487]  show_stack+0x20/0x38
> [ 2278.804812]  dump_stack_lvl+0xbc/0xd0
> [ 2278.808487]  dump_stack+0x18/0x28
> [ 2278.811811]  check_preemption_disabled+0xd8/0xf8
> [ 2278.816446]  debug_smp_processor_id+0x20/0x30
> [ 2278.820818]  event_function_call+0x54/0x168
> [ 2278.825015]  _perf_event_enable+0x78/0xa8
> [ 2278.829037]  perf_event_for_each_child+0x44/0xa0
> [ 2278.833672]  _perf_ioctl+0x1bc/0xae0
> [ 2278.837262]  perf_ioctl+0x58/0x90
> [ 2278.840590]  __arm64_sys_ioctl+0xb4/0x100
> [ 2278.844615]  invoke_syscall+0x50/0x120
> [ 2278.848381]  el0_svc_common.constprop.0+0x48/0xf0
> [ 2278.853103]  do_el0_svc+0x24/0x38
> [ 2278.856432]  el0_svc+0x3c/0x108
> [ 2278.859585]  el0t_64_sync_handler+0x120/0x130
> [ 2278.863956]  el0t_64_sync+0x190/0x198
> [ 2279.068732] BUG: using smp_processor_id() in preemptible [00000000] code: perf_event_open/111076
> [ 2279.077570] caller is debug_smp_processor_id+0x20/0x30
> [ 2279.082754] CPU: 1 UID: 0 PID: 111076 Comm: perf_event_open Not tainted 6.11.0-rc3-next-20240812 #1
> [ 2279.091823] Hardware name: ARM Juno development board (r2) (DT)
> 
> Full test log:
> ---------
>  - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240813/testrun/24833616/suite/log-parser-test/test/check-kernel-bug/log
>  - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240813/testrun/24833616/suite/log-parser-test/tests/
>  - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240812/testrun/24821160/suite/log-parser-test/test/check-kernel-bug-483bde618da4ec98e33eefb5e26adeb267f80cc2461569605f3166ce12b3fe82/log
> 
> metadata:
>   artifact-location: https://storage.tuxsuite.com/public/linaro/lkft/builds/2kXsz6nJO7pJ1nL4xGlKHYhiLx9/
>   build-url: https://storage.tuxsuite.com/public/linaro/lkft/builds/2kXsz6nJO7pJ1nL4xGlKHYhiLx9/
>   build_name: gcc-13-lkftconfig-debug
>   git_describe: next-20240812
>   git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
>   git_sha: 9e6869691724b12e1f43655eeedc35fade38120c
>   kernel-config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2kXsz6nJO7pJ1nL4xGlKHYhiLx9/config
>   kernel_version: 6.11.0-rc3
>   toolchain: gcc-13

Thanks for the report. Can you please check if the patch below solves the problem?

Thanks,
Namhyung

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 9893ba5e98aa..85204c2376fa 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -298,13 +298,14 @@ static int event_function(void *info)
 static void event_function_call(struct perf_event *event, event_f func, void *data)
 {
 	struct perf_event_context *ctx = event->ctx;
-	struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
+	struct perf_cpu_context *cpuctx;
 	struct task_struct *task = READ_ONCE(ctx->task); /* verified in event_function */
 	struct event_function_struct efs = {
 		.event = event,
 		.func = func,
 		.data = data,
 	};
+	unsigned long flags;
 
 	if (!event->parent) {
 		/*
@@ -327,22 +328,27 @@ static void event_function_call(struct perf_event *event, event_f func, void *da
 	if (!task_function_call(task, event_function, &efs))
 		return;
 
+	local_irq_save(flags);
+	cpuctx = this_cpu_ptr(&perf_cpu_context);
+
 	perf_ctx_lock(cpuctx, ctx);
 	/*
 	 * Reload the task pointer, it might have been changed by
 	 * a concurrent perf_event_context_sched_out().
 	 */
 	task = ctx->task;
-	if (task == TASK_TOMBSTONE) {
-		perf_ctx_unlock(cpuctx, ctx);
-		return;
-	}
+	if (task == TASK_TOMBSTONE)
+		goto out;
+
 	if (ctx->is_active) {
 		perf_ctx_unlock(cpuctx, ctx);
+		local_irq_restore(flags);
 		goto again;
 	}
 	func(event, NULL, ctx, data);
+out:
 	perf_ctx_unlock(cpuctx, ctx);
+	local_irq_restore(flags);
 }
 
 /*
