linux-kernel - Re: [PATCH] perf/x86/amd: check event before enable to avoid GPF

Message-ID: <1f3418d9-ae95-4caf-9a53-d763473ec69c@oracle.com>
Date: Tue, 4 Jun 2024 10:26:34 -0400
From: George Kennedy <george.kennedy@...cle.com>
To: Ravi Bangoria <ravi.bangoria@....com>
Cc: harshit.m.mogalapalli@...cle.com, peterz@...radead.org, mingo@...hat.com,
        acme@...nel.org, namhyung@...nel.org, mark.rutland@....com,
        alexander.shishkin@...ux.intel.com, jolsa@...nel.org,
        irogers@...gle.com, adrian.hunter@...el.com, kan.liang@...ux.intel.com,
        tglx@...utronix.de, bp@...en8.de, dave.hansen@...ux.intel.com,
        x86@...nel.org, hpa@...or.com, linux-perf-users@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] perf/x86/amd: check event before enable to avoid GPF



On 6/4/2024 9:40 AM, Ravi Bangoria wrote:
>> On 6/4/2024 9:16 AM, Ravi Bangoria wrote:
>>>>>>> Events can be deleted and the entry can be NULL.
>>>>>> Can you please also explain "how".
>>>>> It looks like x86_pmu_stop() is clearing the bit in active_mask and setting the events entry to NULL (and doing it in the correct order) for the same events index that amd_pmu_enable_all() is trying to enable.
>>>>>>> Check event for NULL in amd_pmu_enable_all() before enable to avoid a GPF.
>>>>>>> This appears to be an AMD only issue.
>>>>>>>
>>>>>>> Syzkaller reported a GPF in amd_pmu_enable_all.
>>>>>> Can you please provide a bug report link? Also, any reproducer?
>>>>> The Syzkaller reproducer can be found in this link:
>>>>> https://lore.kernel.org/netdev/CAMt6jhyec7-TSFpr3F+_ikjpu39WV3jnCBBGwpzpBrPx55w20g@mail.gmail.com/T/#u
>>>>>>> @@ -760,7 +760,8 @@ static void amd_pmu_enable_all(int added)
>>>>>>>             if (!test_bit(idx, cpuc->active_mask))
>>>>>>>                 continue;
>>>>>>> -        amd_pmu_enable_event(cpuc->events[idx]);
>>>>>>> +        if (cpuc->events[idx])
>>>>>>> +            amd_pmu_enable_event(cpuc->events[idx]);
>>>>>> What if cpuc->events[idx] becomes NULL after if (cpuc->events[idx]) but
>>>>>> before amd_pmu_enable_event(cpuc->events[idx])?
>>>>> Good question, but the crash has not reproduced with the proposed fix in hours of testing. It usually reproduces within minutes without the fix.
>>>> Also, a similar fix is done in __intel_pmu_enable_all() in arch/x86/events/intel/core.c except that a WARN_ON_ONCE is done as well.
>>>> See: https://elixir.bootlin.com/linux/v6.10-rc1/source/arch/x86/events/intel/core.c#L2256
>>> There are subtle differences between the Intel and AMD pmu implementations.
>>> __intel_pmu_enable_all() enables all events with a single WRMSR whereas
>>> amd_pmu_enable_all() loops over each PMC and enables it individually.
>>>
>>> The WARN_ON_ONCE() is important because it will warn about potential
>>> sw bug somewhere else.
>> We could add a similar WARN_ON_ONCE() to the proposed patch.
> Sure, that would help in future. But for current splat, can you please
> try to rootcause the underlying race condition?

Sure, I can keep trying to root cause, but will need help from the AMD 
perf experts.
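
For reference, a rough user-space sketch of the guard being discussed: skip
a NULL events[] slot and warn only the first time it is seen. This is not
the kernel code. struct event, struct cpu_state, MAX_COUNTERS,
warn_on_once() and enable_all() are hypothetical stand-ins for
struct perf_event, struct cpu_hw_events, WARN_ON_ONCE() and
amd_pmu_enable_all(); the real functions do considerably more.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_COUNTERS 8              /* hypothetical, not the kernel's value */

struct event {                      /* stand-in for struct perf_event */
	int enabled;
};

struct cpu_state {                  /* stand-in for struct cpu_hw_events */
	struct event *events[MAX_COUNTERS];
	unsigned long active_mask;
};

static int warn_count;              /* how many warnings were printed */

/* User-space imitation of WARN_ON_ONCE(): report once, return the condition. */
static bool warn_on_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		warn_count++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Enable every active counter; returns the number actually enabled. */
static int enable_all(struct cpu_state *cpuc)
{
	int enabled = 0;

	assert(cpuc != NULL);

	for (int idx = 0; idx < MAX_COUNTERS; idx++) {
		if (!(cpuc->active_mask & (1UL << idx)))
			continue;

		/* Read the slot once so the check and the use see the same value. */
		struct event *ev = cpuc->events[idx];

		/* Active bit set but no event: warn once and skip the slot. */
		if (warn_on_once(ev == NULL, "active counter has no event"))
			continue;

		ev->enabled = 1;
		enabled++;
	}
	return enabled;
}
```

Reading the slot into a local only keeps the check and the use consistent
within this sketch. In the kernel that would presumably need READ_ONCE(),
and it still would not explain how the active bit and the events[] entry
got out of sync in the first place.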

In the meantime, the proposed patch with the WARN_ON_ONCE() added like
the Intel version would avoid a GPF and would potentially hint at root
cause.

BTW, reproduced on Ubuntu 22.04.4 LTS on AMD bare metal.

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 1
model name      : AMD EPYC 7551 32-Core Processor
stepping        : 2
microcode       : 0x800126e
cpu MHz         : 1200.000
cache size      : 512 KB
physical id     : 0
siblings        : 64
core id         : 0
cpu cores       : 32
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf rapl pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate ssbd ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass retbleed smt_rsb srso div0
bogomips        : 3992.42
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

# nproc
128

[ 165.692858] perf: interrupt took too long (36007 > 35837), lowering kernel.perf_event_max_sample_rate to 5000
[ 188.226736] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000034: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 188.228803] KASAN: null-ptr-deref in range [0x00000000000001a0-0x00000000000001a7]
[ 188.230029] CPU: 0 PID: 20434 Comm: repro_x86_pmu_e Not tainted 6.10.0-rc1-21-ge0cce98fe279-syzk #1
[ 188.231472] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 188.232791] RIP: 0010:x86_pmu_enable_event+0x63/0x280
[ 188.233642] Code: 41 5c 41 5d 41 5e 41 5f e9 3a 84 99 00 e8 35 84 99 00 48 8d bb a0 01 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 65 4c 8b 25 a9 1e 01 7f 84 c0 74 08 3c 03 0f 8e ac 01
[ 188.236554] RSP: 0000:ffff888118209a38 EFLAGS: 00010012
[ 188.237406] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 188.238546] RDX: 0000000000000034 RSI: 0000000000000000 RDI: 00000000000001a0
[ 188.239691] RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
[ 188.240830] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88811822a230
[ 188.241959] R13: ffff88811822a420 R14: 0000000000000001 R15: fffffbfff32748b7
[ 188.243097] FS: 00007fa0554fb700(0000) GS:ffff888118200000(0000) knlGS:0000000000000000
[ 188.244374] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 188.245291] CR2: 00000000200001c0 CR3: 000000001c646000 CR4: 00000000000006f0
[ 188.246418] Call Trace:
[ 188.246848]  <IRQ>
[ 188.247199]  ? show_regs+0x91/0xa0
[ 188.247765]  ? die_addr+0x54/0xd0
[ 188.248334]  ? exc_general_protection+0x15c/0x240
[ 188.249123]  ? asm_exc_general_protection+0x26/0x30
[ 188.249922]  ? x86_pmu_enable_event+0x63/0x280
[ 188.250669]  ? x86_pmu_enable_event+0x4b/0x280
[ 188.251421]  amd_pmu_enable_all+0x109/0x180
[ 188.252108]  x86_pmu_enable+0x773/0xca0
[ 188.252739]  ? amd_pmu_del_event+0x42/0x70
[ 188.253415]  ? perf_event_update_time+0x294/0x3a0
[ 188.254190]  event_sched_out+0x7a1/0xd50
[ 188.254862]  __perf_remove_from_context+0xfa/0xe70
[ 188.255650]  event_function+0x275/0x450
[ 188.256293]  ? __pfx___perf_remove_from_context+0x10/0x10
[ 188.257174]  ? __pfx_event_function+0x10/0x10
[ 188.257891]  remote_function+0x12e/0x1c0
[ 188.258552]  __flush_smp_call_function_queue+0x1c6/0xcb0
[ 188.259433]  ? __pfx_remote_function+0x10/0x10
[ 188.260175]  __sysvec_call_function_single+0x2a/0x210
[ 188.261000]  sysvec_call_function_single+0x36/0x90
[ 188.261789]  asm_sysvec_call_function_single+0x1a/0x20

qemu-system-x86_64 -m 4096 -smp 4 \
    -net nic,model=virtio \
    -net user,host=10.0.2.10,hostfwd=tcp::1569-:22 \
    -display none -serial mon:stdio -no-reboot -enable-kvm \
    -initrd /var/opt/do_syzkaller_setup-1.0/fuzzer/images/initramfs.img \
    -hda images/syzk_8.img \
    -initrd images/initramfs.img \
    -kernel images/bzImage.UPSTREAM.v6.10-rc1-21-ge0cce98fe279 \
    -snapshot \
    -append 'console=ttyS0 earlyprintk=serial oops=panic nmi_watchdog=panic panic_on_warn=0 loglevel=8 panic=86400 ftrace_dump_on_oops=orig_cpu rodata=n vsyscall=native biosdevname=0 root=/dev/sda console=ttyS0 root=/dev/mapper/ol-root'

In your attempts at crash reproduction, you have all modules built-in
with KASAN config'd, correct?

Thanks,
George
>
> Thanks,
> Ravi