Message-ID: <202301161108.4c2174c6-yujie.liu@intel.com>
Date: Mon, 16 Jan 2023 12:14:02 +0800
From: kernel test robot <yujie.liu@...el.com>
To: Vipin Sharma <vipinsh@...gle.com>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>,
Sean Christopherson <seanjc@...gle.com>, <kvm@...r.kernel.org>,
<pbonzini@...hat.com>, <bgardon@...gle.com>, <dmatlack@...gle.com>,
<linux-kernel@...r.kernel.org>, Vipin Sharma <vipinsh@...gle.com>
Subject: Re: [Patch v3 1/9] KVM: x86/mmu: Repurpose KVM MMU shrinker to purge
shadow page caches
Greetings,

FYI, we noticed a "BUG: sleeping function called from invalid context at include/linux/sched/mm.h" due to the following commit (built with gcc-11):
commit: 99e2853d906a7593e6a3f0e5bc7ecc503b6b9462 ("[Patch v3 1/9] KVM: x86/mmu: Repurpose KVM MMU shrinker to purge shadow page caches")
url: https://github.com/intel-lab-lkp/linux/commits/Vipin-Sharma/NUMA-aware-page-table-s-pages-allocation/20221222-104911
base: https://git.kernel.org/cgit/virt/kvm/kvm.git queue
patch link: https://lore.kernel.org/all/20221222023457.1764-2-vipinsh@google.com/
patch subject: [Patch v3 1/9] KVM: x86/mmu: Repurpose KVM MMU shrinker to purge shadow page caches
in testcase: kvm-unit-tests-qemu
version: kvm-unit-tests-x86_64-e11a0e2-1_20230106
on test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
it caused the changes below (please refer to the attached dmesg/kmsg for the full log/backtrace):
[ 159.416792][T16345] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
[ 159.426638][T16345] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 16345, name: qemu-system-x86
[ 159.426641][T16345] preempt_count: 1, expected: 0
[ 159.426644][T16345] CPU: 122 PID: 16345 Comm: qemu-system-x86 Not tainted 6.1.0-rc8-00451-g99e2853d906a #1
[ 159.426647][T16345] Call Trace:
[ 159.426649][T16345] <TASK>
[  159.426650][T16345] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
[  159.445592][T16345] __might_resched.cold (kernel/sched/core.c:9909)
[  159.459683][T16345] ? __kvm_mmu_topup_memory_cache (arch/x86/kvm/../../../virt/kvm/kvm_main.c:411) kvm
[  159.472465][T16345] __kmem_cache_alloc_node (include/linux/sched/mm.h:274 mm/slab.h:710 mm/slub.c:3318 mm/slub.c:3437)
[  159.479626][T16345] ? kasan_set_track (mm/kasan/common.c:52)
[  159.486869][T16345] ? __kvm_mmu_topup_memory_cache (arch/x86/kvm/../../../virt/kvm/kvm_main.c:411) kvm
[  159.503129][T16345] __kmalloc_node (include/linux/kasan.h:211 mm/slab_common.c:955 mm/slab_common.c:962)
[  159.510635][T16345] __kvm_mmu_topup_memory_cache (arch/x86/kvm/../../../virt/kvm/kvm_main.c:411) kvm
[  159.525074][T16345] ? _raw_write_lock_irq (kernel/locking/spinlock.c:153)
[  159.533706][T16345] ? down_read (arch/x86/include/asm/atomic64_64.h:34 include/linux/atomic/atomic-long.h:41 include/linux/atomic/atomic-instrumented.h:1280 kernel/locking/rwsem.c:176 kernel/locking/rwsem.c:181 kernel/locking/rwsem.c:249 kernel/locking/rwsem.c:1259 kernel/locking/rwsem.c:1269 kernel/locking/rwsem.c:1511)
[  159.533710][T16345] mmu_topup_memory_caches (arch/x86/kvm/mmu/mmu.c:670 arch/x86/kvm/mmu/mmu.c:686) kvm
[  159.547875][T16345] kvm_mmu_load (arch/x86/kvm/mmu/mmu.c:5436) kvm
[  159.556325][T16345] vcpu_enter_guest+0x1ad7/0x30f0 kvm
[  159.571283][T16345] ? ttwu_queue_wakelist (kernel/sched/core.c:3844 kernel/sched/core.c:3839)
[  159.577747][T16345] ? vmx_prepare_switch_to_guest (arch/x86/kvm/vmx/vmx.c:1322) kvm_intel
[  159.593219][T16345] ? kvm_check_and_inject_events (arch/x86/kvm/x86.c:10215) kvm
[  159.600193][T16345] ? try_to_wake_up (include/linux/sched.h:2239 kernel/sched/core.c:4197)
[  159.600197][T16345] ? kernel_fpu_begin_mask (arch/x86/kernel/fpu/core.c:137)
[  159.616366][T16345] vcpu_run (arch/x86/kvm/x86.c:10687) kvm
[  159.623697][T16345] ? fpu_swap_kvm_fpstate (arch/x86/kernel/fpu/core.c:368)
[  159.623700][T16345] kvm_arch_vcpu_ioctl_run (arch/x86/kvm/x86.c:10908) kvm
[  159.640555][T16345] kvm_vcpu_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4107) kvm
[  159.649090][T16345] ? vfs_fileattr_set (fs/ioctl.c:774)
[  159.649094][T16345] ? kvm_dying_cpu (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4063) kvm
[  159.659190][T16345] ? do_futex (kernel/futex/syscalls.c:111)
[  159.673538][T16345] ? __x64_sys_get_robust_list (kernel/futex/syscalls.c:87)
[  159.673542][T16345] ? __x64_sys_rt_sigaction (kernel/signal.c:4242)
[  159.680957][T16345] ? _raw_spin_lock_bh (kernel/locking/spinlock.c:169)
[  159.680960][T16345] ? __x64_sys_futex (kernel/futex/syscalls.c:183 kernel/futex/syscalls.c:164 kernel/futex/syscalls.c:164)
[  159.697043][T16345] ? __fget_files (arch/x86/include/asm/atomic64_64.h:22 include/linux/atomic/atomic-arch-fallback.h:2363 include/linux/atomic/atomic-arch-fallback.h:2388 include/linux/atomic/atomic-arch-fallback.h:2404 include/linux/atomic/atomic-long.h:497 include/linux/atomic/atomic-instrumented.h:1854 fs/file.c:882 fs/file.c:913)
[  159.697047][T16345] __x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856)
[  159.704290][T16345] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[  159.714133][T16345] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[ 159.714136][T16345] RIP: 0033:0x7f1ca20ffcc7
[ 159.728227][T16345] Code: 00 00 00 48 8b 05 c9 91 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 99 91 0c 00 f7 d8 64 89 01 48
All code
========
0: 00 00 add %al,(%rax)
2: 00 48 8b add %cl,-0x75(%rax)
5: 05 c9 91 0c 00 add $0xc91c9,%eax
a: 64 c7 00 26 00 00 00 movl $0x26,%fs:(%rax)
11: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
18: c3 retq
19: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
20: 00 00 00
23: b8 10 00 00 00 mov $0x10,%eax
28: 0f 05 syscall
2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
30: 73 01 jae 0x33
32: c3 retq
33: 48 8b 0d 99 91 0c 00 mov 0xc9199(%rip),%rcx # 0xc91d3
3a: f7 d8 neg %eax
3c: 64 89 01 mov %eax,%fs:(%rcx)
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
6: 73 01 jae 0x9
8: c3 retq
9: 48 8b 0d 99 91 0c 00 mov 0xc9199(%rip),%rcx # 0xc91a9
10: f7 d8 neg %eax
12: 64 89 01 mov %eax,%fs:(%rcx)
15: 48 rex.W
[ 159.728230][T16345] RSP: 002b:00007f1ca11ea848 EFLAGS: 00000246
[ 159.736078][T16345] ORIG_RAX: 0000000000000010
[ 159.736080][T16345] RAX: ffffffffffffffda RBX: 000000000000ae80 RCX: 00007f1ca20ffcc7
[ 159.736082][T16345] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 000000000000000e
[ 159.751470][T16345] RBP: 0000555803999500 R08: 0000000000000000 R09: 0000555801cd6d80
[ 159.751472][T16345] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[ 159.761055][T16345] R13: 0000555801cdd060 R14: 00007f1ca11eab00 R15: 0000000000802000
[ 159.761058][T16345] </TASK>
[ 159.780317][T16345] x86/split lock detection: #AC: qemu-system-x86/16345 took a split_lock trap at address: 0x1e3
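
For context, the trace shows __kvm_mmu_topup_memory_cache() reaching a sleepable
__kmalloc_node() allocation while preempt_count is 1, i.e. inside an atomic
(spinlocked) section. A minimal sketch of that pattern follows; this is
illustrative kernel-style code only, not the actual patch (the function and
lock names are taken from the trace, the surrounding code is hypothetical):

	/* Sleepable allocation reached under a spinlock: */
	spin_lock(&kvm->mmu_lock);              /* preempt_count -> 1      */
	r = kvm_mmu_topup_memory_cache(cache, min);
	                                        /* -> __kmalloc_node() with
	                                         *    GFP_KERNEL may sleep,
	                                         *    so might_sleep() fires
	                                         *    the splat above       */
	spin_unlock(&kvm->mmu_lock);

	/* Usual remedies: top up the cache *before* taking the lock, or
	 * allocate with a non-sleeping flag (GFP_ATOMIC / GFP_NOWAIT)
	 * while the lock is held. */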
If you fix the issue, kindly add the following tags:
| Reported-by: kernel test robot <yujie.liu@...el.com>
| Link: https://lore.kernel.org/oe-lkp/202301161108.4c2174c6-yujie.liu@intel.com
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
Attachments:
- config-6.1.0-rc8-00451-g99e2853d906a (text/plain, 165863 bytes)
- job-script (text/plain, 6166 bytes)
- dmesg.xz (application/x-xz, 122212 bytes)
- kvm-unit-tests-qemu (text/plain, 228903 bytes)
- job.yaml (text/plain, 4957 bytes)
- reproduce (text/plain, 150 bytes)