Message-ID: <202210211215.9dc6efb5-yujie.liu@intel.com>
Date: Fri, 21 Oct 2022 12:10:17 +0800
From: kernel test robot <yujie.liu@...el.com>
To: Matthew Wilcox <willy@...radead.org>
CC: <lkp@...ts.01.org>, <lkp@...el.com>,
Andrew Morton <akpm@...ux-foundation.org>,
<linux-kernel@...r.kernel.org>, <linux-mm@...ck.org>
Subject: [shmem] 7a7256d5f5: WARNING:possible_recursive_locking_detected
Greetings,
FYI, we noticed WARNING:possible_recursive_locking_detected caused by the following commit (built with gcc-11):
commit: 7a7256d5f512b6c17957df7f59cf5e281b3ddba3 ("shmem: convert shmem_mfill_atomic_pte() to use a folio")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
in testcase: kernel-selftests
version: kernel-selftests-x86_64-9313ba54-1_20221017
with the following parameters:
sc_nr_hugepages: 2
group: vm
test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt
on test machine: 12 threads, 1 socket, Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz (Coffee Lake) with 16G memory
The commit caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
[ 86.886825][ T5512] WARNING: possible recursive locking detected
[ 86.886826][ T5512] 6.0.0-rc3-00323-g7a7256d5f512 #1 Tainted: G S
[ 86.886843][ T501]
[ 86.887428][ T5512] --------------------------------------------
[ 86.887429][ T5512] userfaultfd/5512 is trying to acquire lock:
[ 86.887431][ T5512] ffff888436345f98 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault (arch/x86/include/asm/current.h:15 mm/memory.c:5630 mm/memory.c:5623)
[ 86.887457][ T5512]
[ 86.887457][ T5512] but task is already holding lock:
[ 86.887458][ T5512] ffff888436345f98 (&mm->mmap_lock#2){++++}-{3:3}, at: mcopy_atomic (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/linux/mmap_lock.h:35 include/linux/mmap_lock.h:118 mm/userfaultfd.c:543 mm/userfaultfd.c:688)
[ 86.887486][ T5512]
[ 86.887486][ T5512] other info that might help us debug this:
[ 86.887487][ T5512] Possible unsafe locking scenario:
[ 86.887487][ T5512]
[ 86.887488][ T5512] CPU0
[ 86.887488][ T5512] ----
[ 86.887489][ T5512] lock(&mm->mmap_lock#2);
[ 86.896241][ T5512] lock(&mm->mmap_lock#2);
[ 86.896691][ T5512]
[ 86.896691][ T5512] *** DEADLOCK ***
[ 86.896691][ T5512]
[ 86.897494][ T5512] May be due to missing lock nesting notation
[ 86.897494][ T5512]
[ 86.898311][ T5512] 1 lock held by userfaultfd/5512:
[ 86.898815][ T5512] #0: ffff888436345f98 (&mm->mmap_lock#2){++++}-{3:3}, at: mcopy_atomic (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/linux/mmap_lock.h:35 include/linux/mmap_lock.h:118 mm/userfaultfd.c:543 mm/userfaultfd.c:688)
[ 86.899759][ T5512]
[ 86.899759][ T5512] stack backtrace:
[ 86.900343][ T5512] CPU: 5 PID: 5512 Comm: userfaultfd Tainted: G S 6.0.0-rc3-00323-g7a7256d5f512 #1
[ 86.901389][ T5512] Hardware name: Dell Inc. Vostro 3670/0HVPDY, BIOS 1.5.11 12/24/2018
[ 86.902193][ T5512] Call Trace:
[ 86.902523][ T5512] <TASK>
[ 86.902815][ T5512] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
[ 86.903270][ T5512] validate_chain.cold (kernel/locking/lockdep.c:2988 kernel/locking/lockdep.c:3031 kernel/locking/lockdep.c:3816)
[ 86.903777][ T5512] ? check_prev_add (kernel/locking/lockdep.c:3785)
[ 86.904276][ T5512] ? check_prev_add (kernel/locking/lockdep.c:3785)
[ 86.904775][ T5512] ? pte_alloc_one (include/linux/mm.h:2336 include/linux/mm.h:2363 include/asm-generic/pgalloc.h:66 arch/x86/mm/pgtable.c:33)
[ 86.905243][ T5512] ? __alloc_pages_slowpath+0x1a80/0x1a80
[ 86.905910][ T5512] ? __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856)
[ 86.906406][ T5512] __lock_acquire (kernel/locking/lockdep.c:5053)
[ 86.906879][ T5512] lock_acquire (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5668 kernel/locking/lockdep.c:5631)
[ 86.907330][ T5512] ? __might_fault (arch/x86/include/asm/current.h:15 mm/memory.c:5630 mm/memory.c:5623)
[ 86.907797][ T5512] ? rcu_read_unlock (include/linux/rcupdate.h:735 (discriminator 5))
[ 86.908276][ T5512] ? lock_is_held_type (kernel/locking/lockdep.c:5407 kernel/locking/lockdep.c:5709)
[ 86.908777][ T5512] __might_fault (mm/memory.c:5630 mm/memory.c:5623)
[ 86.909231][ T5512] ? __might_fault (arch/x86/include/asm/current.h:15 mm/memory.c:5630 mm/memory.c:5623)
[ 86.909695][ T5512] _copy_from_user (arch/x86/include/asm/preempt.h:27 lib/usercopy.c:14)
[ 86.910165][ T5512] shmem_mfill_atomic_pte (mm/shmem.c:2422)
[ 86.910705][ T5512] mcopy_atomic (mm/userfaultfd.c:503 mm/userfaultfd.c:637 mm/userfaultfd.c:688)
[ 86.911158][ T5512] ? mcopy_atomic_pte (mm/userfaultfd.c:687)
[ 86.911657][ T5512] ? lock_is_held_type (kernel/locking/lockdep.c:5407 kernel/locking/lockdep.c:5709)
[ 86.912160][ T5512] ? __might_fault (mm/memory.c:5630 mm/memory.c:5623)
[ 86.912625][ T5512] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688)
[ 86.913086][ T5512] userfaultfd_copy (fs/userfaultfd.c:1739)
[ 86.913604][ T5512] ? __wake_userfault (fs/userfaultfd.c:1704)
[ 86.914088][ T5512] ? kernel_read (fs/read_write.c:451)
[ 86.914581][ T5512] ? vfs_fileattr_set (fs/ioctl.c:774)
[ 86.915086][ T5512] ? __fget_files (include/linux/rcupdate.h:285 include/linux/rcupdate.h:739 fs/file.c:914)
[ 86.915606][ T5512] ? __lock_release (kernel/locking/lockdep.c:5342)
[ 86.916106][ T5512] userfaultfd_ioctl (fs/userfaultfd.c:2023)
[ 86.916608][ T5512] ? lock_is_held_type (kernel/locking/lockdep.c:5407 kernel/locking/lockdep.c:5709)
[ 86.917121][ T5512] ? userfaultfd_continue (fs/userfaultfd.c:1990)
[ 86.917669][ T5512] ? __fget_files (include/linux/rcupdate.h:285 include/linux/rcupdate.h:739 fs/file.c:914)
[ 86.918146][ T5512] ? lock_release (kernel/locking/lockdep.c:466 kernel/locking/lockdep.c:5688)
[ 86.918615][ T5512] ? __fget_files (fs/file.c:917)
[ 86.919097][ T5512] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:870 fs/ioctl.c:856 fs/ioctl.c:856)
[ 86.919585][ T5512] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 86.920038][ T5512] ? do_syscall_64 (arch/x86/entry/common.c:87)
[ 86.920508][ T5512] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4526)
[ 86.921174][ T5512] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:120)
[ 86.921771][ T5512] RIP: 0033:0x7fe912f00e9b
[ 86.922223][ T5512] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1b 48 8b 44 24 18 64 48 2b 04 25 28 00
All code
========
0: 00 48 89 add %cl,-0x77(%rax)
3: 44 24 18 rex.R and $0x18,%al
6: 31 c0 xor %eax,%eax
8: 48 8d 44 24 60 lea 0x60(%rsp),%rax
d: c7 04 24 10 00 00 00 movl $0x10,(%rsp)
14: 48 89 44 24 08 mov %rax,0x8(%rsp)
19: 48 8d 44 24 20 lea 0x20(%rsp),%rax
1e: 48 89 44 24 10 mov %rax,0x10(%rsp)
23: b8 10 00 00 00 mov $0x10,%eax
28: 0f 05 syscall
2a:* 41 89 c0 mov %eax,%r8d <-- trapping instruction
2d: 3d 00 f0 ff ff cmp $0xfffff000,%eax
32: 77 1b ja 0x4f
34: 48 8b 44 24 18 mov 0x18(%rsp),%rax
39: 64 fs
3a: 48 rex.W
3b: 2b .byte 0x2b
3c: 04 25 add $0x25,%al
3e: 28 00 sub %al,(%rax)
Code starting with the faulting instruction
===========================================
0: 41 89 c0 mov %eax,%r8d
3: 3d 00 f0 ff ff cmp $0xfffff000,%eax
8: 77 1b ja 0x25
a: 48 8b 44 24 18 mov 0x18(%rsp),%rax
f: 64 fs
10: 48 rex.W
11: 2b .byte 0x2b
12: 04 25 add $0x25,%al
14: 28 00 sub %al,(%rax)
[ 86.924185][ T5512] RSP: 002b:00007fe90cbffc80 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 86.925036][ T5512] RAX: ffffffffffffffda RBX: 00007fe90cc00640 RCX: 00007fe912f00e9b
[ 86.925842][ T5512] RDX: 00007fe90cbffcf0 RSI: 00000000c028aa03 RDI: 0000000000000006
[ 86.926650][ T5512] RBP: 00007fe90cbffd30 R08: 0000000000000000 R09: 00007ffdc1acdd8f
[ 86.927457][ T5512] R10: 00007fe912e075d8 R11: 0000000000000246 R12: 00007fe90cc00640
[ 86.928260][ T5512] R13: 0000000000000000 R14: 00007fe912e88580 R15: 0000000000000000
[ 86.929070][ T5512] </TASK>
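The splat above can be triaged mechanically. What follows is only a minimal sketch, not part of the LKP tooling: the names LOCK_LINE, SAMPLE, parse_recursive_lock_report and is_same_lock_instance are made up for illustration, and SAMPLE is an excerpt of this report with the source locations trimmed. The script compares the lock address that lockdep prints under "is trying to acquire lock" with the one under "but task is already holding lock". In this report both are ffff888436345f98, that is, the same mmap_lock instance: it is taken in mcopy_atomic and requested again in __might_fault, which is reached through _copy_from_user.

```python
import re

# Matches a lockdep lock line such as:
#   ffff888436345f98 (&mm->mmap_lock#2){++++}-{3:3}, at: mcopy_atomic (...)
LOCK_LINE = re.compile(
    r"(?P<addr>[0-9a-f]{16}) \((?P<name>[^)]+)\)\{[^}]*\}-\{[^}]*\}, at: (?P<func>\S+)"
)

# Excerpt of the report above; file:line annotations trimmed for brevity.
SAMPLE = """\
[ 86.887429][ T5512] userfaultfd/5512 is trying to acquire lock:
[ 86.887431][ T5512] ffff888436345f98 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault
[ 86.887457][ T5512]
[ 86.887457][ T5512] but task is already holding lock:
[ 86.887458][ T5512] ffff888436345f98 (&mm->mmap_lock#2){++++}-{3:3}, at: mcopy_atomic
"""


def parse_recursive_lock_report(dmesg_text):
    """Return (acquiring, held) from a recursive-locking splat.

    Each element is a dict with 'addr', 'name' and 'func', or None when the
    corresponding section was not found in the text.
    """
    acquiring = None
    held = None
    expect = None  # which section the next lock line belongs to
    for line in dmesg_text.splitlines():
        if "is trying to acquire lock:" in line:
            expect = "acquiring"
        elif "but task is already holding lock:" in line:
            expect = "held"
        elif expect is not None:
            match = LOCK_LINE.search(line)
            if match:
                if expect == "acquiring":
                    acquiring = match.groupdict()
                else:
                    held = match.groupdict()
                expect = None
    return acquiring, held


def is_same_lock_instance(acquiring, held):
    """True when both sections name the same lock address."""
    return bool(acquiring and held and acquiring["addr"] == held["addr"])


if __name__ == "__main__":
    acq, held = parse_recursive_lock_report(SAMPLE)
    print("acquiring:", acq)
    print("held:     ", held)
    print("same instance:", is_same_lock_instance(acq, held))
```

A matching address only shows that the same lock instance is involved, not merely two locks of the same class. Whether the nested acquisition can really deadlock still has to be decided by reading the code path.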
If you fix the issue, kindly add the following tags:
| Reported-by: kernel test robot <yujie.liu@...el.com>
| Link: https://lore.kernel.org/r/202210211215.9dc6efb5-yujie.liu@intel.com
To reproduce:
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
View attachment "config-6.0.0-rc3-00323-g7a7256d5f512" of type "text/plain" (168257 bytes)
View attachment "job-script" of type "text/plain" (5976 bytes)
Download attachment "dmesg.xz" of type "application/x-xz" (270844 bytes)
View attachment "job.yaml" of type "text/plain" (4982 bytes)
View attachment "reproduce" of type "text/plain" (273 bytes)