lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <5837f1b6-b111-4ab5-888a-7a00891f3a34@amd.com>
Date: Mon, 7 Oct 2024 12:13:56 +0530
From: Ravi Bangoria <ravi.bangoria@....com>
To: George Kennedy <george.kennedy@...cle.com>, peterz@...radead.org,
 mingo@...hat.com
Cc: harshit.m.mogalapalli@...cle.com, acme@...nel.org, namhyung@...nel.org,
 mark.rutland@....com, alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
 irogers@...gle.com, adrian.hunter@...el.com, kan.liang@...ux.intel.com,
 tglx@...utronix.de, bp@...en8.de, dave.hansen@...ux.intel.com,
 x86@...nel.org, hpa@...or.com, linux-perf-users@...r.kernel.org,
 linux-kernel@...r.kernel.org, dongli.zhang@...cle.com,
 Ravi Bangoria <ravi.bangoria@....com>
Subject: Re: [PATCH] [PATCH v2] perf/x86/amd: check event before enable to
 avoid GPF

On 01-Oct-24 1:22 AM, George Kennedy wrote:
> On AMD machines cpuc->events[idx] can become NULL in a subtle race
> condition with NMI->throttle->x86_pmu_stop().
> 
> Check event for NULL in amd_pmu_enable_all() before enable to avoid a GPF.
> This appears to be an AMD only issue.
> 
> Syzkaller reported a GPF in amd_pmu_enable_all.
> 
> INFO: NMI handler (perf_event_nmi_handler) took too long to run: 13.143
>     msecs
> Oops: general protection fault, probably for non-canonical address
>     0xdffffc0000000034: 0000  PREEMPT SMP KASAN NOPTI
> KASAN: null-ptr-deref in range [0x00000000000001a0-0x00000000000001a7]
> CPU: 0 UID: 0 PID: 328415 Comm: repro_36674776 Not tainted 6.12.0-rc1-syzk
> RIP: 0010:x86_pmu_enable_event (arch/x86/events/perf_event.h:1195
>     arch/x86/events/core.c:1430)
> RSP: 0018:ffff888118009d60 EFLAGS: 00010012
> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000034 RSI: 0000000000000000 RDI: 00000000000001a0
> RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> R13: ffff88811802a440 R14: ffff88811802a240 R15: ffff8881132d8601
> FS:  00007f097dfaa700(0000) GS:ffff888118000000(0000) GS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00000000200001c0 CR3: 0000000103d56000 CR4: 00000000000006f0
> Call Trace:
>  <IRQ>
> amd_pmu_enable_all (arch/x86/events/amd/core.c:760 (discriminator 2))
> x86_pmu_enable (arch/x86/events/core.c:1360)
> event_sched_out (kernel/events/core.c:1191 kernel/events/core.c:1186
>     kernel/events/core.c:2346)
> __perf_remove_from_context (kernel/events/core.c:2435)
> event_function (kernel/events/core.c:259)
> remote_function (kernel/events/core.c:92 (discriminator 1)
>     kernel/events/core.c:72 (discriminator 1))
> __flush_smp_call_function_queue (./arch/x86/include/asm/jump_label.h:27
>     ./include/linux/jump_label.h:207 ./include/trace/events/csd.h:64
>     kernel/smp.c:135 kernel/smp.c:540)
> __sysvec_call_function_single (./arch/x86/include/asm/jump_label.h:27
>     ./include/linux/jump_label.h:207
>     ./arch/x86/include/asm/trace/irq_vectors.h:99 arch/x86/kernel/smp.c:272)
> sysvec_call_function_single (arch/x86/kernel/smp.c:266 (discriminator 47)
>     arch/x86/kernel/smp.c:266 (discriminator 47))
>  </IRQ>
> 
> Reported-by: syzkaller <syzkaller@...glegroups.com>
> Signed-off-by: George Kennedy <george.kennedy@...cle.com>

Peter, Ingo,

I've been blocking this for quite some time without acting on it. Since
the patch is trivial and harmless, I'm giving an Ack here. However, the
underlying race condition is still unknown, so I will get back to it.

Acked-by: Ravi Bangoria <ravi.bangoria@....com>

> ---
> Ravi requested patch resend with the following:
>   /*
>    * FIXME: cpuc->events[idx] can become NULL in a subtle race
>    * condition with NMI->throttle->x86_pmu_stop().
>    */

George, the comment should be inside the code.

> 
>  arch/x86/events/amd/core.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/events/amd/core.c b/arch/x86/events/amd/core.c
> index 920e3a640cad..3a8b8db878f6 100644
> --- a/arch/x86/events/amd/core.c
> +++ b/arch/x86/events/amd/core.c
> @@ -762,7 +762,8 @@ static void amd_pmu_enable_all(int added)
>  		if (!test_bit(idx, cpuc->active_mask))
>  			continue;
>  
> -		amd_pmu_enable_event(cpuc->events[idx]);
> +		if (cpuc->events[idx])
> +			amd_pmu_enable_event(cpuc->events[idx]);
>  	}
>  }
>  

Thanks,
Ravi

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ