linux-kernel - Re: [PATCH] tracing/perf: disable preemption in perf

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250630143230.25b7a38a@batman.local.home>
Date: Mon, 30 Jun 2025 14:32:30 -0400
From: Steven Rostedt <rostedt@...dmis.org>
To: Aditya Chillara <quic_achillar@...cinc.com>
Cc: Ingo Molnar <mingo@...hat.com>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] tracing/perf: disable preemption in perf_trace calls

On Mon, 30 Jun 2025 22:35:28 +0530
Aditya Chillara <quic_achillar@...cinc.com> wrote:

> It is possible that perf_trace_sched_switch call is rescheduled causing
> NULL access in perf_trace_buf_alloc because of perf_trace_event_unreg
> called in between, leading to data abort:
> 
> [  363.202351][    T0] Unable to handle kernel paging request at virtual address ffffffba56d0a03c
> [  363.202355][    T0] Mem abort info:
> [  363.202355][    T0]   ESR = 0x0000000096000045
> [  363.202357][    T0]   EC = 0x25: DABT (current EL), IL = 32 bits
> [  363.202359][    T0]   SET = 0, FnV = 0
> [  363.202360][    T0]   EA = 0, S1PTW = 0
> [  363.202361][    T0]   FSC = 0x05: level 1 translation fault
> [  363.202363][    T0] Data abort info:
> [  363.202364][    T0]   ISV = 0, ISS = 0x00000045, ISS2 = 0x00000000
> [  363.202365][    T0]   CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> [  363.202367][    T0]   GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [  363.202369][    T0] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000c9795000
> [  363.202371][    T0] [ffffffba56d0a03c] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
> [  363.202377][    T0] Internal error: Oops: 0000000096000045 [#1] PREEMPT SMP
> [  363.202383][    T0] Dumping ftrace buffer:
> [  363.202385][    T0]    (ftrace buffer empty)
> .
> .
> [  363.203036][    T0] CPU: 4 UID: 0 PID: 0 Comm: swapper/4 Tainted: G W  O \
> 	6.12.23-android16-5-maybe-dirty-4k #1 6d534d25cf9b12caecb741d5004c181a6a432b53
> [  363.203042][    T0] Tainted: [W]=WARN, [O]=OOT_MODULE
> [  363.203045][    T0] pstate: 834000c5 (Nzcv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
> [  363.203048][    T0] pc : perf_trace_buf_alloc+0x90/0xe0
> [  363.203057][    T0] lr : perf_trace_buf_alloc+0x40/0xe0
> [  363.203060][    T0] sp : ffffffc08016bc90
> [  363.203062][    T0] x29: ffffffc08016bc90 x28: 0000000000000000 x27: ffffff88083d0000
> [  363.203066][    T0] x26: ffffffba56d0a000 x25: 00000044684e8bd8 x24: 1198f2d01457f4e4
> [  363.203070][    T0] x23: 0000000000000000 x22: ffffff88083d0000 x21: ffffffc08016bccc
> [  363.203073][    T0] x20: ffffffc08016bcd0 x19: 0000000000000044 x18: ffffffd01580bec0
> [  363.203077][    T0] x17: 00000000d2dc9ceb x16: 00000000d2dc9ceb x15: 0000000000000010
> [  363.203080][    T0] x14: 0000000000000100 x13: 0000000000000100 x12: 0000000000000100
> [  363.203083][    T0] x11: 0000000000ff0100 x10: 0000000000000001 x9 : ffffffba56d0a000
> [  363.203087][    T0] x8 : ffffffba56d0a044 x7 : 0000000000000001 x6 : 000000548fd2bf93
> [  363.203090][    T0] x5 : 00000054908be990 x4 : 0000000000000000 x3 : ffffff8808329640
> [  363.203094][    T0] x2 : ffffffc08016bccc x1 : ffffffc08016bcd0 x0 : ffffffba56d0a000
> [  363.203097][    T0] Call trace:
> [  363.203098][    T0]  perf_trace_buf_alloc+0x90/0xe0
> [  363.203102][    T0]  perf_trace_sched_switch+0x74/0x190
> [  363.203107][    T0]  __schedule+0xbb0/0xeac
> [  363.203112][    T0]  schedule_idle+0x24/0x48
> [  363.203114][    T0]  do_idle+0x214/0x25c
> [  363.203117][    T0]  cpu_startup_entry+0x34/0x3c
> [  363.203119][    T0]  secondary_start_kernel+0x130/0x150
> [  363.203124][    T0]  __secondary_switched+0xc0/0xc4
> [  363.203129][    T0] Code: f8605908 d538d089 8b080120 8b33c008 (f81f811f)
> [  363.203131][    T0] ---[ end trace 0000000000000000 ]---
> [  363.203133][    T0] Kernel panic - not syncing: Oops: Fatal exception
> [  363.203135][    T0] SMP: stopping secondary CPUs
> [  363.203276][    T0]
> [  363.203276][    T0] Total active clients count: 0
> [  363.203280][    T0] Dumping ftrace buffer:
> [  363.203281][    T0]    (ftrace buffer empty)
> 
> Fix the issue by disabling preemption in perf_trace calls.

Does it really fix the issue?

1. perf_trace_##call() gets called either by DECLARE_EVENT_CLASS or by
   DECLARE_EVENT_SYSCALL_CLASS. The latter already disables preemption
   but the former does not because it is called by the tracepoint logic
   that has:

#define __DECLARE_TRACE(name, proto, args, cond, data_proto)		\
	__DECLARE_TRACE_COMMON(name, PARAMS(proto), PARAMS(args), PARAMS(data_proto)) \
	static inline void __do_trace_##name(proto)			\
	{								\
		if (cond) {						\
			guard(preempt_notrace)();			\
			__DO_TRACE_CALL(name, TP_ARGS(args));		\
		}							\
	}

   Where preemption should already be disabled.

2. trace_sched_switch() is called with interrupts disabled. So there
   should be no scheduling out in this trace event. So how did it get
   scheduled out in that trace event?

I think there's something else going on.

-- Steve


> 
> Signed-off-by: Aditya Chillara <quic_achillar@...cinc.com>
> ---
> Change-Id: I787347a7db4554cfd2cb74e0fc985699f1824c92
> ---
>  include/trace/perf.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/include/trace/perf.h b/include/trace/perf.h
> index 5800d13146c3d0cdf4eb474a3bb895dd71753de4..76994a5d8c55f4ca071070ff551a53a84fb1d95c 100644
> --- a/include/trace/perf.h
> +++ b/include/trace/perf.h
> @@ -66,6 +66,8 @@ perf_trace_##call(void *__data, proto)					\
>  	int __data_size;						\
>  	int rctx;							\
>  									\
> +	guard(preempt_notrace)();		\
> +									\
>  	__data_size = trace_event_get_offsets_##call(&__data_offsets, args); \
>  									\
>  	head = this_cpu_ptr(event_call->perf_events);			\
> 
> ---
> base-commit: 70575e77839f4c5337ce2653b39b86bb365a870e
> change-id: 20250630-fix-perf-trace-null-access-9661f910df72
> 
> Best regards,