Date:   Sun, 5 Dec 2021 21:30:39 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     David Matlack <dmatlack@...gle.com>
Cc:     0day robot <lkp@...el.com>, LKML <linux-kernel@...r.kernel.org>,
        lkp@...ts.01.org, Paolo Bonzini <pbonzini@...hat.com>,
        kvm@...r.kernel.org, Ben Gardon <bgardon@...gle.com>,
        Joerg Roedel <joro@...tes.org>,
        Jim Mattson <jmattson@...gle.com>,
        Wanpeng Li <wanpengli@...cent.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Sean Christopherson <seanjc@...gle.com>,
        Janis Schoetterl-Glausch <scgl@...ux.vnet.ibm.com>,
        Junaid Shahid <junaids@...gle.com>,
        Oliver Upton <oupton@...gle.com>,
        Harish Barathvajasankar <hbarath@...gle.com>,
        Peter Xu <peterx@...hat.com>, Peter Shier <pshier@...gle.com>,
        David Matlack <dmatlack@...gle.com>
Subject: [KVM]  d3750a0923:
 WARNING:possible_circular_locking_dependency_detected



Greetings,

FYI, we noticed the following commit (built with gcc-9):

commit: d3750a09232a9af1e8c6bb3b06a6609d921eb506 ("[RFC PATCH 13/15] KVM: x86/mmu: Split large pages during CLEAR_DIRTY_LOG")
url: https://github.com/0day-ci/linux/commits/David-Matlack/KVM-x86-mmu-Eager-Page-Splitting-for-the-TDP-MMU/20211120-080051
base: https://git.kernel.org/cgit/virt/kvm/kvm.git queue
patch link: https://lore.kernel.org/kvm/20211119235759.1304274-14-dmatlack@google.com

in testcase: kernel-selftests
version: kernel-selftests-x86_64-a21458fc-1_20211128
with the following parameters:

	group: kvm
	ucode: 0xe2

test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt


on test machine: 8 threads Intel(R) Core(TM) i7-6770HQ CPU @ 2.60GHz with 32G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


[  280.691224][T10825] WARNING: possible circular locking dependency detected
[  280.698458][T10825] 5.15.0-12443-gd3750a09232a #1 Tainted: G          I
[  280.705780][T10825] ------------------------------------------------------
[  280.712843][T10825] dirty_log_test/10825 is trying to acquire lock:
[280.719317][T10825] ffffffff859d97c0 (fs_reclaim){+.+.}-{0:0}, at: kmem_cache_alloc (include/linux/sched/mm.h:228 mm/slab.h:492 mm/slub.c:3148 mm/slub.c:3242 mm/slub.c:3247) 
[  280.728159][T10825]
[  280.728159][T10825] but task is already holding lock:
[280.735565][T10825] ffffc90009b61018 (&(kvm)->mmu_lock){++++}-{2:2}, at: kvm_clear_dirty_log_protect (arch/x86/kvm/../../../virt/kvm/kvm_main.c:2176) 
[  280.745919][T10825]
[  280.745919][T10825] which lock already depends on the new lock.
[  280.745919][T10825]
[  280.756398][T10825]
[  280.756398][T10825] the existing dependency chain (in reverse order) is:
[  280.765486][T10825]
[  280.765486][T10825] -> #2 (&(kvm)->mmu_lock){++++}-{2:2}:
[280.773296][T10825] lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5639 kernel/locking/lockdep.c:5602) 
[280.778318][T10825] _raw_write_lock (include/linux/rwlock_api_smp.h:209 kernel/locking/spinlock.c:300) 
[280.783422][T10825] kvm_mmu_notifier_invalidate_range_start (arch/x86/kvm/../../../virt/kvm/kvm_main.c:576 arch/x86/kvm/../../../virt/kvm/kvm_main.c:714) 
[280.790879][T10825] __mmu_notifier_invalidate_range_start (mm/mmu_notifier.c:494 mm/mmu_notifier.c:548) 
[280.798089][T10825] wp_page_copy (include/linux/mmu_notifier.h:459 mm/memory.c:3017) 
[280.803304][T10825] __handle_mm_fault (mm/memory.c:4569 mm/memory.c:4686) 
[280.808964][T10825] handle_mm_fault (mm/memory.c:4784) 
[280.814276][T10825] do_user_addr_fault (arch/x86/mm/fault.c:1397) 
[280.819805][T10825] exc_page_fault (arch/x86/include/asm/irqflags.h:29 arch/x86/include/asm/irqflags.h:70 arch/x86/include/asm/irqflags.h:132 arch/x86/mm/fault.c:1493 arch/x86/mm/fault.c:1541) 
[280.824952][T10825] asm_exc_page_fault (arch/x86/include/asm/idtentry.h:568) 
[  280.830336][T10825]
[  280.830336][T10825] -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
[280.839807][T10825] lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5639 kernel/locking/lockdep.c:5602) 
[280.844868][T10825] fs_reclaim_acquire (mm/page_alloc.c:4552) 
[280.850354][T10825] __kmalloc_node (include/linux/sched/mm.h:228 mm/slab.h:492 mm/slub.c:3148 mm/slub.c:4467) 
[280.855538][T10825] alloc_cpumask_var_node (lib/cpumask.c:115) 
[280.861284][T10825] native_smp_prepare_cpus (arch/x86/kernel/smpboot.c:1373) 
[280.867316][T10825] kernel_init_freeable (include/linux/compiler.h:252 include/linux/init.h:124 init/main.c:1414 init/main.c:1599) 
[280.873070][T10825] kernel_init (init/main.c:1501) 
[280.877992][T10825] ret_from_fork (arch/x86/entry/entry_64.S:301) 
[  280.882914][T10825]
[  280.882914][T10825] -> #0 (fs_reclaim){+.+.}-{0:0}:
[280.890200][T10825] check_prev_add (kernel/locking/lockdep.c:3064) 
[280.895486][T10825] __lock_acquire (kernel/locking/lockdep.c:3187 kernel/locking/lockdep.c:3801 kernel/locking/lockdep.c:5027) 
[280.900838][T10825] lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5639 kernel/locking/lockdep.c:5602) 
[280.905873][T10825] fs_reclaim_acquire (mm/page_alloc.c:4536 mm/page_alloc.c:4549) 
[280.911412][T10825] kmem_cache_alloc (include/linux/sched/mm.h:228 mm/slab.h:492 mm/slub.c:3148 mm/slub.c:3242 mm/slub.c:3247) 
[280.916733][T10825] kvm_mmu_topup_memory_cache (arch/x86/kvm/../../../virt/kvm/kvm_main.c:383) 
[280.922982][T10825] mmu_topup_split_caches (arch/x86/kvm/mmu/mmu.c:765) 
[280.928842][T10825] kvm_mmu_try_split_large_pages (arch/x86/kvm/mmu/mmu.c:5897) 
[280.935333][T10825] kvm_arch_mmu_enable_log_dirty_pt_masked (arch/x86/kvm/mmu/mmu.c:1457) 
[280.942762][T10825] kvm_clear_dirty_log_protect (arch/x86/kvm/../../../virt/kvm/kvm_main.c:2193) 
[280.949182][T10825] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:2215 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4511) 
[280.954337][T10825] __x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:874 fs/ioctl.c:860 fs/ioctl.c:860) 
[280.959633][T10825] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) 
[280.964580][T10825] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) 
[  280.971044][T10825]
[  280.971044][T10825] other info that might help us debug this:
[  280.971044][T10825]
[  280.981447][T10825] Chain exists of:
[  280.981447][T10825]   fs_reclaim --> mmu_notifier_invalidate_range_start --> &(kvm)->mmu_lock
[  280.981447][T10825]
[  280.996135][T10825]  Possible unsafe locking scenario:
[  280.996135][T10825]
[  281.003699][T10825]        CPU0                    CPU1
[  281.009067][T10825]        ----                    ----
[  281.014443][T10825]   lock(&(kvm)->mmu_lock);
[  281.018989][T10825]                                lock(mmu_notifier_invalidate_range_start);
[  281.027803][T10825]                                lock(&(kvm)->mmu_lock);
[  281.034897][T10825]   lock(fs_reclaim);
[  281.038853][T10825]
[  281.038853][T10825]  *** DEADLOCK ***
[  281.038853][T10825]
[  281.047128][T10825] 2 locks held by dirty_log_test/10825:
[281.052687][T10825] #0: ffffc90009b610a8 (&kvm->slots_lock){+.+.}-{3:3}, at: kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:2213 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4511) 
[281.062371][T10825] #1: ffffc90009b61018 (&(kvm)->mmu_lock){++++}-{2:2}, at: kvm_clear_dirty_log_protect (arch/x86/kvm/../../../virt/kvm/kvm_main.c:2176) 
[  281.073672][T10825]
[  281.073672][T10825] stack backtrace:
[  281.079747][T10825] CPU: 5 PID: 10825 Comm: dirty_log_test Tainted: G          I       5.15.0-12443-gd3750a09232a #1
[  281.090909][T10825] Hardware name:  /NUC6i7KYB, BIOS KYSKLi70.86A.0041.2016.0817.1130 08/17/2016
[  281.099876][T10825] Call Trace:
[  281.103142][T10825]  <TASK>
[281.106036][T10825] dump_stack_lvl (lib/dump_stack.c:107) 
[281.110529][T10825] check_noncircular (kernel/locking/lockdep.c:2143) 
[281.115472][T10825] ? print_circular_bug+0x480/0x480 
[281.121341][T10825] ? mark_lock_irq (kernel/locking/lockdep.c:4564) 
[281.126269][T10825] ? is_bpf_text_address (kernel/bpf/core.c:713) 
[281.131478][T10825] ? mark_lock+0xca/0x1400 
[281.136540][T10825] ? mark_lock+0xca/0x1400 
[281.141578][T10825] check_prev_add (kernel/locking/lockdep.c:3064) 
[281.146353][T10825] __lock_acquire (kernel/locking/lockdep.c:3187 kernel/locking/lockdep.c:3801 kernel/locking/lockdep.c:5027) 
[281.151214][T10825] ? lock_is_held_type (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5681) 
[281.156272][T10825] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4885) 
[281.162278][T10825] ? lock_is_held_type (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5681) 
[281.167324][T10825] ? rcu_read_lock_sched_held (include/linux/lockdep.h:283 kernel/rcu/update.c:125) 
[281.173091][T10825] lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5639 kernel/locking/lockdep.c:5602) 
[281.177732][T10825] ? kmem_cache_alloc (include/linux/sched/mm.h:228 mm/slab.h:492 mm/slub.c:3148 mm/slub.c:3242 mm/slub.c:3247) 
[281.182856][T10825] ? rcu_read_unlock (include/linux/rcupdate.h:717 (discriminator 5)) 
[281.187846][T10825] ? lock_is_held_type (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5681) 
[281.193099][T10825] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4885) 
[281.199356][T10825] ? lock_is_held_type (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5681) 
[281.204642][T10825] fs_reclaim_acquire (mm/page_alloc.c:4536 mm/page_alloc.c:4549) 
[281.209675][T10825] ? kmem_cache_alloc (include/linux/sched/mm.h:228 mm/slab.h:492 mm/slub.c:3148 mm/slub.c:3242 mm/slub.c:3247) 
[281.214625][T10825] ? kvm_mmu_topup_memory_cache (arch/x86/kvm/../../../virt/kvm/kvm_main.c:383) 
[281.220440][T10825] kmem_cache_alloc (include/linux/sched/mm.h:228 mm/slab.h:492 mm/slub.c:3148 mm/slub.c:3242 mm/slub.c:3247) 
[281.225225][T10825] ? lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5639 kernel/locking/lockdep.c:5602) 
[281.230012][T10825] kvm_mmu_topup_memory_cache (arch/x86/kvm/../../../virt/kvm/kvm_main.c:383) 
[281.235876][T10825] ? lock_is_held_type (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5681) 
[281.241148][T10825] mmu_topup_split_caches (arch/x86/kvm/mmu/mmu.c:765) 
[281.246492][T10825] kvm_mmu_try_split_large_pages (arch/x86/kvm/mmu/mmu.c:5897) 
[281.252407][T10825] kvm_arch_mmu_enable_log_dirty_pt_masked (arch/x86/kvm/mmu/mmu.c:1457) 
[281.259324][T10825] kvm_clear_dirty_log_protect (arch/x86/kvm/../../../virt/kvm/kvm_main.c:2193) 
[281.265165][T10825] kvm_vm_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:2215 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4511) 
[281.269769][T10825] ? __lock_acquire (arch/x86/include/asm/bitops.h:214 (discriminator 9) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 9) kernel/locking/lockdep.c:199 (discriminator 9) kernel/locking/lockdep.c:5024 (discriminator 9)) 
[281.274716][T10825] ? kvm_unregister_device_ops (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4464) 
[281.280453][T10825] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4885) 
[281.286752][T10825] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4885) 
[281.292869][T10825] ? lock_is_held_type (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5681) 
[281.297878][T10825] ? fiemap_prep (fs/ioctl.c:778) 
[281.302467][T10825] ? rcu_read_lock_sched_held (include/linux/lockdep.h:283 kernel/rcu/update.c:125) 
[281.308131][T10825] ? rcu_read_lock_bh_held (kernel/rcu/update.c:120) 
[281.313458][T10825] ? lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5639 kernel/locking/lockdep.c:5602) 
[281.318131][T10825] ? find_held_lock (kernel/locking/lockdep.c:5130) 
[281.322907][T10825] ? lock_release (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5659) 
[281.327606][T10825] ? lock_downgrade (kernel/locking/lockdep.c:5645) 
[281.332467][T10825] ? rcu_read_lock_sched_held (kernel/rcu/update.c:306) 
[281.338192][T10825] ? __fget_files (fs/file.c:865) 
[281.342877][T10825] __x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:874 fs/ioctl.c:860 fs/ioctl.c:860) 
[281.347634][T10825] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80) 
[281.352016][T10825] ? irqentry_exit_to_user_mode (kernel/entry/common.c:127 kernel/entry/common.c:315) 
[281.357667][T10825] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:568) 
[281.362599][T10825] ? rcu_read_lock_sched_held (include/linux/lockdep.h:283 kernel/rcu/update.c:125) 
[281.368259][T10825] ? rcu_read_lock_bh_held (kernel/rcu/update.c:120) 
[281.373557][T10825] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:568) 
[281.378614][T10825] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:568) 
[281.383562][T10825] ? lockdep_hardirqs_on (kernel/locking/lockdep.c:4356) 
[281.388772][T10825] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113) 
[  281.394893][T10825] RIP: 0033:0x7f9d40646427
[ 281.399484][T10825] Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 f7 d8 64 89 01 48
All code
========
   0:	00 00                	add    %al,(%rax)
   2:	90                   	nop
   3:	48 8b 05 69 aa 0c 00 	mov    0xcaa69(%rip),%rax        # 0xcaa73
   a:	64 c7 00 26 00 00 00 	movl   $0x26,%fs:(%rax)
  11:	48 c7 c0 ff ff ff ff 	mov    $0xffffffffffffffff,%rax
  18:	c3                   	retq   
  19:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
  20:	00 00 00 
  23:	b8 10 00 00 00       	mov    $0x10,%eax
  28:	0f 05                	syscall 
  2a:*	48 3d 01 f0 ff ff    	cmp    $0xfffffffffffff001,%rax		<-- trapping instruction
  30:	73 01                	jae    0x33
  32:	c3                   	retq   
  33:	48 8b 0d 39 aa 0c 00 	mov    0xcaa39(%rip),%rcx        # 0xcaa73
  3a:	f7 d8                	neg    %eax
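
For readers less familiar with lockdep output, the "Chain exists of" and "Possible unsafe locking scenario" sections above can be read as a small directed graph. The following is a toy model only, not lockdep's implementation: the function names find_path and would_deadlock are made up for illustration, and "mmu_lock" stands in for &(kvm)->mmu_lock. An edge A -> B records that B was acquired while A was held; taking fs_reclaim while holding mmu_lock is a problem because fs_reclaim already reaches mmu_lock.

```python
from collections import defaultdict

def find_path(graph, src, dst, seen=None):
    """Return a list of locks leading from src to dst, or None."""
    if seen is None:
        seen = set()
    if src == dst:
        return [src]
    seen.add(src)
    for nxt in graph[src]:
        if nxt not in seen:
            path = find_path(graph, nxt, dst, seen)
            if path:
                return [src] + path
    return None

def would_deadlock(graph, held, wanted):
    """A new edge held -> wanted closes a cycle if wanted already reaches held.

    Returns the existing path from wanted to held, or None if there is none.
    """
    return find_path(graph, wanted, held)

graph = defaultdict(list)
# Existing dependencies from the report (entries #1 and #2).
graph["fs_reclaim"].append("mmu_notifier_invalidate_range_start")
graph["mmu_notifier_invalidate_range_start"].append("mmu_lock")

# Entry #0: dirty_log_test holds mmu_lock and asks for fs_reclaim.
cycle = would_deadlock(graph, held="mmu_lock", wanted="fs_reclaim")
print(" --> ".join(cycle))
```

In this model the opposite order (holding fs_reclaim, then taking mmu_lock) reports no cycle, which matches the ordering the existing chain already establishes.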


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.
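
Once a run has produced a dmesg, the dependency entries can be pulled out of the log with a few lines of script. This is a minimal sketch that assumes the splat keeps the "-> #N (lock){...}" format shown in this report; the helper name dependency_chain is hypothetical and the sample lines are copied from the log above.

```python
import re

# Matches dependency entries such as "-> #2 (&(kvm)->mmu_lock){++++}-{2:2}:".
ENTRY = re.compile(r"-> #(\d+) \((.+?)\)\{")

def dependency_chain(dmesg_text):
    """Return {index: lock name} for each '-> #N (lock)' entry in a splat."""
    return {int(n): name for n, name in ENTRY.findall(dmesg_text)}

sample = """\
[  280.765486][T10825] -> #2 (&(kvm)->mmu_lock){++++}-{2:2}:
[  280.830336][T10825] -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}:
[  280.882914][T10825] -> #0 (fs_reclaim){+.+.}-{0:0}:
"""

chain = dependency_chain(sample)
for idx in sorted(chain):
    print(idx, chain[idx])
```

Entry #0 is the acquisition that triggered the warning; the higher-numbered entries are the dependencies that were already recorded.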



---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.15.0-12443-gd3750a09232a" of type "text/plain" (177443 bytes)

View attachment "job-script" of type "text/plain" (5970 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (20460 bytes)

View attachment "job.yaml" of type "text/plain" (4760 bytes)
