[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ZsvZ6rbIIDiQNNuk@xpf.sh.intel.com>
Date: Mon, 26 Aug 2024 09:27:06 +0800
From: Pengfei Xu <pengfei.xu@...el.com>
To: <linux-kernel@...r.kernel.org>
CC: <linux-tip-commits@...r.kernel.org>, Yang Jihong <yangjihong1@...wei.com>,
"Peter Zijlstra (Intel)" <peterz@...radead.org>, <x86@...nel.org>,
<pengfei.xu@...el.com>, <syzkaller-bugs@...glegroups.com>,
<stable@...r.kernel.org>
Subject: Re: [tip: perf/core] perf/core: Fix hardlockup failure caused by
perf throttle
Hi Jihong and perf experts,
Greetings!
There was "BUG: soft lockup in asm_sysvec_apic_timer_interrupt" in v6.11-rc4
mainline kernel by local syzkaller Intel internal kernel testing.
Bisected and found first bad commit:
"
15def34e2635 perf/core: Fix hardlockup failure caused by perf throttle
"
After reverted above commit on top of v6.11-rc4 mainline kernel, this
issue was gone.
And this issue could be reproduced in 1200s.
All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240823_212601_asm_sysvec_apic_timer_interrupt
Syzkaller repro code: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/repro.c
Syzkaller repro syscall steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/repro.prog
Syzkaller report: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/repro.report
Kconfig(make olddefconfig): https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/bisect_info.log
bzImage: https://github.com/xupengfe/syzkaller_logs/raw/main/240823_212601_asm_sysvec_apic_timer_interrupt/bzImage_47ac09b91befbb6a235ab620c32af719f8208399.tar.gz
Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240823_212601_asm_sysvec_apic_timer_interrupt/47ac09b91befbb6a235ab620c32af719f8208399_dmesg.log
"
[ 22.518698] hrtimer: interrupt took 13103 ns
[ 30.382700] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 2079936720 wd_nsec: 2079936828
[ 378.038693] clocksource: Long readout interval, skipping watchdog check: cs_nsec: 7719948786 wd_nsec: 7719948793
[ 736.508865] watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [repro:193160]
[ 736.509218] Modules linked in:
[ 736.509369] irq event stamp: 15405229
[ 736.509539] hardirqs last enabled at (15405228): [<ffffffff8579c9de>] irqentry_exit+0x3e/0xa0
[ 736.509947] hardirqs last disabled at (15405229): [<ffffffff8579ae14>] sysvec_apic_timer_interrupt+0x14/0xc0
[ 736.510383] softirqs last enabled at (9218742): [<ffffffff81289fb9>] __irq_exit_rcu+0xa9/0x120
[ 736.510776] softirqs last disabled at (9218745): [<ffffffff81289fb9>] __irq_exit_rcu+0xa9/0x120
[ 736.511167] CPU: 0 UID: 0 PID: 193160 Comm: repro Not tainted 6.11.0-rc4-47ac09b91bef+ #1
[ 736.511529] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[ 736.512039] RIP: 0010:__rcu_read_unlock+0x284/0x560
[ 736.512272] Code: 8f 00 00 00 84 d2 0f 84 87 00 00 00 bf 09 00 00 00 e8 d0 0b dc ff 4d 85 f6 0f 84 68 fe ff ff e8 f2 83 26 00 fb 0f 1f 44 00 00 <e9> 58 fe ff ff 0f 0b 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3
[ 736.513091] RSP: 0018:ffff88806c609938 EFLAGS: 00000212
[ 736.513330] RAX: 0000000000e956e0 RBX: ffff88806c649600 RCX: 1ffffffff14ae71b
[ 736.513648] RDX: 0000000000000000 RSI: 0000000000000101 RDI: 0000000000000000
[ 736.513974] RBP: ffff88806c609968 R08: 0000000000000001 R09: fffffbfff14a92b5
[ 736.514289] R10: ffffffff8a5495af R11: 0000000000000001 R12: ffff88801379ca00
[ 736.514606] R13: ffff88801379ca00 R14: 0000000000000200 R15: ffffffff86e619c0
[ 736.514935] FS: 00007f095e148740(0000) GS:ffff88806c600000(0000) knlGS:0000000000000000
[ 736.515293] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 736.515555] CR2: 0000000020000000 CR3: 0000000021944006 CR4: 0000000000770ef0
[ 736.515888] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 736.516215] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
[ 736.516539] PKRU: 55555554
[ 736.516672] Call Trace:
[ 736.516798] <IRQ>
[ 736.516907] ? show_regs+0xa8/0xc0
[ 736.517080] ? watchdog_timer_fn+0x52f/0x6b0
[ 736.517284] ? __pfx_watchdog_timer_fn+0x10/0x10
[ 736.517503] ? __hrtimer_run_queues+0x5d6/0xb10
[ 736.517733] ? __pfx___hrtimer_run_queues+0x10/0x10
[ 736.517975] ? hrtimer_interrupt+0x324/0x7a0
[ 736.518193] ? __sysvec_apic_timer_interrupt+0x10b/0x410
[ 736.518443] ? debug_smp_processor_id+0x20/0x30
[ 736.518663] ? sysvec_apic_timer_interrupt+0x4b/0xc0
[ 736.518901] ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
[ 736.519162] ? __rcu_read_unlock+0x284/0x560
[ 736.519370] ? __rcu_read_unlock+0x27e/0x560
[ 736.519579] __is_insn_slot_addr+0x14c/0x2a0
[ 736.519791] kernel_text_address+0x64/0xe0
[ 736.519991] __kernel_text_address+0x16/0x50
[ 736.520199] unwind_get_return_address+0x8c/0x100
[ 736.520424] ? __pfx_stack_trace_consume_entry+0x10/0x10
[ 736.520671] arch_stack_walk+0xa7/0x170
[ 736.520878] stack_trace_save+0x97/0xd0
[ 736.521064] ? __pfx_stack_trace_save+0x10/0x10
[ 736.521276] ? __pfx_mark_lock.part.0+0x10/0x10
[ 736.521499] kasan_save_stack+0x2c/0x60
[ 736.521681] ? kasan_save_stack+0x2c/0x60
[ 736.521870] ? kasan_save_track+0x18/0x40
[ 736.522057] ? kasan_save_free_info+0x3f/0x60
[ 736.522267] ? __kasan_slab_free+0x115/0x1a0
[ 736.522467] ? kfree+0xfe/0x330
[ 736.522622] ? free_ctx+0x22/0x30
[ 736.522788] ? rcu_core+0x877/0x18f0
[ 736.522967] ? rcu_core_si+0x12/0x20
[ 736.523142] ? handle_softirqs+0x1c7/0x870
[ 736.523334] ? __irq_exit_rcu+0xa9/0x120
[ 736.523519] ? irq_exit_rcu+0x12/0x30
[ 736.523693] ? sysvec_apic_timer_interrupt+0xa5/0xc0
[ 736.523922] ? asm_sysvec_apic_timer_interrupt+0x1f/0x30
[ 736.524165] ? lock_acquire+0x1ff/0x580
[ 736.524352] ? _raw_spin_lock+0x38/0x50
[ 736.524534] ? do_fcntl+0xa95/0x1400
[ 736.524709] ? __x64_sys_fcntl+0x179/0x210
[ 736.524906] ? x64_sys_call+0x5b9/0x20d0
[ 736.525094] ? do_syscall_64+0x6d/0x140
[ 736.525276] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 736.525529] ? __pfx___lock_acquire+0x10/0x10
[ 736.525736] ? do_raw_spin_trylock+0xbf/0x190
[ 736.525949] ? free_unref_page_commit+0x3c0/0xfb0
[ 736.526176] ? __this_cpu_preempt_check+0x21/0x30
[ 736.526399] ? lock_acquire+0x1de/0x580
[ 736.526587] ? free_ctx+0x22/0x30
[ 736.526753] kasan_save_track+0x18/0x40
[ 736.526944] kasan_save_free_info+0x3f/0x60
[ 736.527146] __kasan_slab_free+0x115/0x1a0
[ 736.527341] ? free_ctx+0x22/0x30
[ 736.527503] kfree+0xfe/0x330
[ 736.527655] ? rcu_core+0x875/0x18f0
[ 736.527833] free_ctx+0x22/0x30
[ 736.527991] rcu_core+0x877/0x18f0
[ 736.528165] ? __pfx_rcu_core+0x10/0x10
[ 736.528365] rcu_core_si+0x12/0x20
[ 736.528535] handle_softirqs+0x1c7/0x870
[ 736.528734] __irq_exit_rcu+0xa9/0x120
[ 736.528915] irq_exit_rcu+0x12/0x30
[ 736.529085] sysvec_apic_timer_interrupt+0xa5/0xc0
[ 736.529309] </IRQ>
[ 736.529414] <TASK>
[ 736.529522] asm_sysvec_apic_timer_interrupt+0x1f/0x30
[ 736.529762] RIP: 0010:lock_acquire+0x1ff/0x580
[ 736.529974] Code: 48 83 c4 28 e8 72 73 37 04 b8 ff ff ff ff 65 0f c1 05 fd b6 c0 7e 83 f8 01 0f 85 d9 02 00 00 4d 85 ff 74 06 fb 0f 1f 44 00 00 <48> b8 00 00 00 00 00 fc ff df 48 01 c3 48 c7 03 00 00 00 00 48 c7
[ 736.530786] RSP: 0018:ffff88801aa0fcd0 EFLAGS: 00000206
[ 736.531031] RAX: 0000000000000001 RBX: 1ffff11003541f9e RCX: 1ffff11003541f82
[ 736.531348] RDX: 1ffff110026f3b07 RSI: 0000000000000001 RDI: 0000000000000000
[ 736.531668] RBP: ffff88801aa0fdb8 R08: 0000000000000000 R09: fffffbfff14a92b5
[ 736.531994] R10: ffffffff8a5495af R11: 0000000000000001 R12: 0000000000000001
[ 736.532318] R13: 0000000000000000 R14: ffff88800f65a028 R15: 0000000000000200
[ 736.532660] ? __pfx_lock_acquire+0x10/0x10
[ 736.532860] ? _raw_spin_unlock+0x31/0x60
[ 736.533063] ? up_write+0x1c0/0x550
[ 736.533234] ? fasync_helper+0x77/0xc0
[ 736.533420] _raw_spin_lock+0x38/0x50
[ 736.533595] ? do_fcntl+0xa95/0x1400
[ 736.533775] do_fcntl+0xa95/0x1400
[ 736.533948] ? __pfx_do_fcntl+0x10/0x10
[ 736.534137] ? trace_hardirqs_on+0x51/0x60
[ 736.534336] ? seqcount_lockdep_reader_access.constprop.0+0xc0/0xd0
[ 736.534619] ? __sanitizer_cov_trace_cmp4+0x1a/0x20
[ 736.534853] ? __sanitizer_cov_trace_const_cmp4+0x1a/0x20
[ 736.535104] ? security_file_fcntl+0x9d/0xd0
[ 736.535315] __x64_sys_fcntl+0x179/0x210
[ 736.535508] x64_sys_call+0x5b9/0x20d0
[ 736.535687] do_syscall_64+0x6d/0x140
[ 736.535866] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 736.536099] RIP: 0033:0x7f095de3ee5d
[ 736.536273] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 93 af 1b 00 f7 d8 64 89 01 48
[ 736.537075] RSP: 002b:00007ffea8175048 EFLAGS: 00000217 ORIG_RAX: 0000000000000048
[ 736.537411] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f095de3ee5d
[ 736.537736] RDX: 0000000000002400 RSI: 0000000000000004 RDI: 0000000000000003
[ 736.538053] RBP: 00007ffea8175060 R08: 00007ffea8175060 R09: 00007ffea8175060
[ 736.538371] R10: 00007ffea8175060 R11: 0000000000000217 R12: 00007ffea81751d8
[ 736.538689] R13: 000000000040547a R14: 0000000000407df8 R15: 00007f095e193000
[ 736.539023] </TASK>
[ 736.539132] Kernel panic - not syncing: softlockup: hung tasks
"
Hope it's helpful.
Thanks!
---
If you don't need the following environment to reproduce the problem or if you
already have one reproduced environment, please ignore the following information.
How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh // it needs qemu-system-x86_64 and I used v7.1.0
// start3.sh will load bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
// You could change the bzImage_xxx as you want
// Maybe you need to remove line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for different qemu version
You could use below command to log in, there is no password for root.
ssh -p 10023 root@...alhost
After login vm(virtual machine) successfully, you could transfer reproduced
binary to the vm by below way, and reproduce the problem in vm:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@...alhost:/root/
Get the bzImage for target kernel:
Please use target kconfig and copy it to kernel_src/.config
make olddefconfig
make -jx bzImage //x should equal or less than cpu num your pc has
Fill the bzImage file into above start3.sh to load the target kernel in vm.
Tips:
If you already have qemu-system-x86_64, please ignore below info.
If you want to install qemu v7.1.0 version:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install
Best Regards,
Thanks!
On 2023-04-17 at 10:46:22 -0000, tip-bot2 for Yang Jihong wrote:
> The following commit has been merged into the perf/core branch of tip:
>
> Commit-ID: 15def34e2635ab7e0e96f1bc32e1b69609f14942
> Gitweb: https://git.kernel.org/tip/15def34e2635ab7e0e96f1bc32e1b69609f14942
> Author: Yang Jihong <yangjihong1@...wei.com>
> AuthorDate: Mon, 27 Feb 2023 10:35:08 +08:00
> Committer: Peter Zijlstra <peterz@...radead.org>
> CommitterDate: Fri, 14 Apr 2023 16:08:22 +02:00
>
> perf/core: Fix hardlockup failure caused by perf throttle
>
> commit e050e3f0a71bf ("perf: Fix broken interrupt rate throttling")
> introduces a change in throttling threshold judgment. Before this,
> compare hwc->interrupts and max_samples_per_tick, then increase
> hwc->interrupts by 1, but this commit reverses order of these two
> behaviors, causing the semantics of max_samples_per_tick to change.
> In literal sense of "max_samples_per_tick", if hwc->interrupts ==
> max_samples_per_tick, it should not be throttled, therefore, the judgment
> condition should be changed to "hwc->interrupts > max_samples_per_tick".
>
> In fact, this may cause the hardlockup to fail, The minimum value of
> max_samples_per_tick may be 1, in this case, the return value of
> __perf_event_account_interrupt function is 1.
> As a result, nmi_watchdog gets throttled, which would stop PMU (Use x86
> architecture as an example, see x86_pmu_handle_irq).
>
> Fixes: e050e3f0a71b ("perf: Fix broken interrupt rate throttling")
> Signed-off-by: Yang Jihong <yangjihong1@...wei.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
> Link: https://lkml.kernel.org/r/20230227023508.102230-1-yangjihong1@huawei.com
> ---
> kernel/events/core.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index fb3e436..82b95b8 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9433,8 +9433,8 @@ __perf_event_account_interrupt(struct perf_event *event, int throttle)
> hwc->interrupts = 1;
> } else {
> hwc->interrupts++;
> - if (unlikely(throttle
> - && hwc->interrupts >= max_samples_per_tick)) {
> + if (unlikely(throttle &&
> + hwc->interrupts > max_samples_per_tick)) {
> __this_cpu_inc(perf_throttled_count);
> tick_dep_set_cpu(smp_processor_id(), TICK_DEP_BIT_PERF_EVENTS);
> hwc->interrupts = MAX_INTERRUPTS;
Powered by blists - more mailing lists