Message-ID: <f244df99-a063-af9d-d4a5-f23f906c4b9a@redhat.com>
Date: Wed, 24 Mar 2021 13:19:09 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: Wanpeng Li <kernellwp@...il.com>,
syzbot <syzbot+b282b65c2c68492df769@...kaller.appspotmail.com>
Cc: Borislav Petkov <bp@...en8.de>, "H. Peter Anvin" <hpa@...or.com>,
Jim Mattson <jmattson@...gle.com>,
Joerg Roedel <joro@...tes.org>, kvm <kvm@...r.kernel.org>,
LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
Sean Christopherson <seanjc@...gle.com>,
syzkaller-bugs@...glegroups.com,
Thomas Gleixner <tglx@...utronix.de>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
the arch/x86 maintainers <x86@...nel.org>,
David Woodhouse <dwmw@...zon.co.uk>
Subject: Re: [syzbot] possible deadlock in scheduler_tick
On 24/03/21 12:34, Wanpeng Li wrote:
> Cc David Woodhouse,
> On Wed, 24 Mar 2021 at 18:11, syzbot
> <syzbot+b282b65c2c68492df769@...kaller.appspotmail.com> wrote:
>>
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit: 1c273e10 Merge tag 'zonefs-5.12-rc4' of git://git.kernel.o..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=13c0414ed00000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=6abda3336c698a07
>> dashboard link: https://syzkaller.appspot.com/bug?extid=b282b65c2c68492df769
>> userspace arch: i386
>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=17d86ad6d00000
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17b8497cd00000
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: syzbot+b282b65c2c68492df769@...kaller.appspotmail.com
>>
>> =====================================================
>> WARNING: HARDIRQ-safe -> HARDIRQ-unsafe lock order detected
>> 5.12.0-rc3-syzkaller #0 Not tainted
>> -----------------------------------------------------
>> syz-executor030/8435 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
>> ffffc90001a2a230 (&kvm->arch.pvclock_gtod_sync_lock){+.+.}-{2:2}, at: spin_lock include/linux/spinlock.h:354 [inline]
>> ffffc90001a2a230 (&kvm->arch.pvclock_gtod_sync_lock){+.+.}-{2:2}, at: get_kvmclock_ns+0x25/0x390 arch/x86/kvm/x86.c:2587
>>
>> and this task is already holding:
>> ffff8880b9d35198 (&rq->lock){-.-.}-{2:2}, at: rq_lock kernel/sched/sched.h:1321 [inline]
>> ffff8880b9d35198 (&rq->lock){-.-.}-{2:2}, at: __schedule+0x21c/0x21b0 kernel/sched/core.c:4990
>> which would create a new lock dependency:
>> (&rq->lock){-.-.}-{2:2} -> (&kvm->arch.pvclock_gtod_sync_lock){+.+.}-{2:2}
>>
>> but this new dependency connects a HARDIRQ-irq-safe lock:
>> (&rq->lock){-.-.}-{2:2}
>>
>> ... which became HARDIRQ-irq-safe at:
>> lock_acquire kernel/locking/lockdep.c:5510 [inline]
>> lock_acquire+0x1ab/0x740 kernel/locking/lockdep.c:5475
>> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>> _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
>> rq_lock kernel/sched/sched.h:1321 [inline]
>> scheduler_tick+0xa4/0x4b0 kernel/sched/core.c:4538
>> update_process_times+0x191/0x200 kernel/time/timer.c:1801
>> tick_periodic+0x79/0x230 kernel/time/tick-common.c:100
>> tick_handle_periodic+0x41/0x120 kernel/time/tick-common.c:112
>> timer_interrupt+0x3f/0x60 arch/x86/kernel/time.c:57
>> __handle_irq_event_percpu+0x303/0x8f0 kernel/irq/handle.c:156
>> handle_irq_event_percpu kernel/irq/handle.c:196 [inline]
>> handle_irq_event+0x102/0x290 kernel/irq/handle.c:213
>> handle_level_irq+0x256/0x6e0 kernel/irq/chip.c:650
>> generic_handle_irq_desc include/linux/irqdesc.h:158 [inline]
>> handle_irq arch/x86/kernel/irq.c:231 [inline]
>> __common_interrupt+0x9e/0x200 arch/x86/kernel/irq.c:250
>> common_interrupt+0x9f/0xd0 arch/x86/kernel/irq.c:240
>> asm_common_interrupt+0x1e/0x40 arch/x86/include/asm/idtentry.h:623
>> __raw_spin_unlock_irqrestore include/linux/spinlock_api_smp.h:161 [inline]
>> _raw_spin_unlock_irqrestore+0x38/0x70 kernel/locking/spinlock.c:191
>> __setup_irq+0xc72/0x1ce0 kernel/irq/manage.c:1737
>> request_threaded_irq+0x28a/0x3b0 kernel/irq/manage.c:2127
>> request_irq include/linux/interrupt.h:160 [inline]
>> setup_default_timer_irq arch/x86/kernel/time.c:70 [inline]
>> hpet_time_init+0x28/0x42 arch/x86/kernel/time.c:82
>> x86_late_time_init+0x58/0x94 arch/x86/kernel/time.c:94
>> start_kernel+0x3ee/0x496 init/main.c:1028
>> secondary_startup_64_no_verify+0xb0/0xbb
>>
>> to a HARDIRQ-irq-unsafe lock:
>> (&kvm->arch.pvclock_gtod_sync_lock){+.+.}-{2:2}
>>
>> ... which became HARDIRQ-irq-unsafe at:
>> ...
>> lock_acquire kernel/locking/lockdep.c:5510 [inline]
>> lock_acquire+0x1ab/0x740 kernel/locking/lockdep.c:5475
>> __raw_spin_lock include/linux/spinlock_api_smp.h:142 [inline]
>> _raw_spin_lock+0x2a/0x40 kernel/locking/spinlock.c:151
>> spin_lock include/linux/spinlock.h:354 [inline]
>> kvm_synchronize_tsc+0x459/0x1230 arch/x86/kvm/x86.c:2332
>> kvm_arch_vcpu_postcreate+0x73/0x180 arch/x86/kvm/x86.c:10183
>> kvm_vm_ioctl_create_vcpu arch/x86/kvm/../../../virt/kvm/kvm_main.c:3239 [inline]
>> kvm_vm_ioctl+0x1b2d/0x2800 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3839
>> kvm_vm_compat_ioctl+0x125/0x230 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4052
>> __do_compat_sys_ioctl+0x1d3/0x230 fs/ioctl.c:842
>> do_syscall_32_irqs_on arch/x86/entry/common.c:77 [inline]
>> __do_fast_syscall_32+0x56/0x90 arch/x86/entry/common.c:140
>> do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:165
>> entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
>>
>> other info that might help us debug this:
>>
>> Possible interrupt unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(&kvm->arch.pvclock_gtod_sync_lock);
>>                                local_irq_disable();
>>                                lock(&rq->lock);
>>                                lock(&kvm->arch.pvclock_gtod_sync_lock);
>>   <Interrupt>
>>     lock(&rq->lock);
>>
>
> The offender is get_kvmclock_ns() which is called in the context
> switch process. The bad commit is 30b5c851af7991ad0 ("KVM: x86/xen:
> Add support for vCPU runstate information").
>
I'll send a patch, thanks.
Paolo