Message-ID: <20220525071433.GA17614@xsang-OptiPlex-9020>
Date: Wed, 25 May 2022 15:14:33 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Muchun Song <songmuchun@...edance.com>
Cc: 0day robot <lkp@...el.com>, LKML <linux-kernel@...r.kernel.org>,
linux-mm@...ck.org, cgroups@...r.kernel.org, lkp@...ts.01.org,
hannes@...xchg.org, mhocko@...nel.org, roman.gushchin@...ux.dev,
shakeelb@...gle.com, duanxiongchun@...edance.com,
longman@...hat.com, Muchun Song <songmuchun@...edance.com>
Subject: [mm] bec0ae1210: WARNING:possible_recursive_locking_detected
Greetings,
FYI, we noticed the following commit (built with gcc-11):
commit: bec0ae12106e0cf12dd4e0e21eb0754b99be0ba2 ("[PATCH v4 09/11] mm: memcontrol: use obj_cgroup APIs to charge the LRU pages")
url: https://github.com/intel-lab-lkp/linux/commits/Muchun-Song/Use-obj_cgroup-APIs-to-charge-the-LRU-pages/20220524-143056
patch link: https://lore.kernel.org/linux-mm/20220524060551.80037-10-songmuchun@bytedance.com
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>
[ 41.024908][ T135] WARNING: possible recursive locking detected
[ 41.025923][ T135] 5.18.0-00009-gbec0ae12106e #1 Not tainted
[ 41.026805][ T135] --------------------------------------------
[ 41.027780][ T135] kworker/1:2/135 is trying to acquire lock:
[ 41.028743][ T135] ffff88815b545068 (&lruvec->lru_lock){....}-{2:2}, at: lruvec_reparent_lock (include/linux/nodemask.h:271 mm/memcontrol.c:376)
[ 41.030324][ T135]
[ 41.030324][ T135] but task is already holding lock:
[ 41.031629][ T135] ffff8881a1c43068 (&lruvec->lru_lock){....}-{2:2}, at: lruvec_reparent_lock (mm/memcontrol.c:378)
[ 41.033231][ T135]
[ 41.033231][ T135] other info that might help us debug this:
[ 41.034551][ T135] Possible unsafe locking scenario:
[ 41.034551][ T135]
[ 41.035818][ T135] CPU0
[ 41.036409][ T135] ----
[ 41.037045][ T135] lock(&lruvec->lru_lock);
[ 41.037866][ T135] lock(&lruvec->lru_lock);
[ 41.039123][ T135]
[ 41.039123][ T135] *** DEADLOCK ***
[ 41.039123][ T135]
[ 41.040984][ T135] May be due to missing lock nesting notation
[ 41.040984][ T135]
[ 41.042567][ T135] 5 locks held by kworker/1:2/135:
[ 41.043472][ T135] #0: ffff88839d54b538 ((wq_completion)cgroup_destroy){+.+.}-{0:0}, at: process_one_work (arch/x86/include/asm/atomic64_64.h:34 include/linux/atomic/atomic-long.h:41 include/linux/atomic/atomic-instrumented.h:1280 kernel/workqueue.c:636 kernel/workqueue.c:663 kernel/workqueue.c:2260)
[ 41.045556][ T135] #1: ffffc90000e9fdb8 ((work_completion)(&css->destroy_work)){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:2264)
[ 41.047649][ T135] #2: ffffffffa46931c8 (cgroup_mutex){+.+.}-{3:3}, at: css_killed_work_fn (kernel/cgroup/cgroup.c:5271 kernel/cgroup/cgroup.c:5554)
[ 41.049171][ T135] #3: ffffffffa47fe2d8 (objcg_lock){....}-{2:2}, at: mem_cgroup_css_offline (mm/memcontrol.c:453 mm/memcontrol.c:463 mm/memcontrol.c:5382)
[ 41.050617][ T135] #4: ffff8881a1c43068 (&lruvec->lru_lock){....}-{2:2}, at: lruvec_reparent_lock (mm/memcontrol.c:378)
[ 41.052031][ T135]
[ 41.052031][ T135] stack backtrace:
[ 41.052926][ T135] CPU: 1 PID: 135 Comm: kworker/1:2 Not tainted 5.18.0-00009-gbec0ae12106e #1
[ 41.054190][ T135] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[ 41.055742][ T135] Workqueue: cgroup_destroy css_killed_work_fn
[ 41.056645][ T135] Call Trace:
[ 41.057138][ T135] <TASK>
[ 41.057628][ T135] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
[ 41.058392][ T135] validate_chain.cold (kernel/locking/lockdep.c:2958 kernel/locking/lockdep.c:3001 kernel/locking/lockdep.c:3790)
[ 41.059117][ T135] ? check_prev_add (kernel/locking/lockdep.c:3759)
[ 41.059888][ T135] __lock_acquire (kernel/locking/lockdep.c:5029)
[ 41.060579][ T135] lock_acquire (kernel/locking/lockdep.c:436 kernel/locking/lockdep.c:5643 kernel/locking/lockdep.c:5606)
[ 41.061280][ T135] ? lruvec_reparent_lock (include/linux/nodemask.h:271 mm/memcontrol.c:376)
[ 41.062081][ T135] ? rcu_read_unlock (include/linux/rcupdate.h:723 (discriminator 5))
[ 41.062915][ T135] ? lock_acquire (kernel/locking/lockdep.c:436 kernel/locking/lockdep.c:5643 kernel/locking/lockdep.c:5606)
[ 41.063653][ T135] ? mem_cgroup_css_offline (mm/memcontrol.c:453 mm/memcontrol.c:463 mm/memcontrol.c:5382)
[ 41.064504][ T135] ? do_raw_spin_lock (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 kernel/locking/spinlock_debug.c:115)
[ 41.065190][ T135] ? rwlock_bug+0xc0/0xc0
[ 41.065923][ T135] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154)
[ 41.066676][ T135] ? lruvec_reparent_lock (include/linux/nodemask.h:271 mm/memcontrol.c:376)
[ 41.067455][ T135] lruvec_reparent_lock (include/linux/nodemask.h:271 mm/memcontrol.c:376)
[ 41.068227][ T135] mem_cgroup_css_offline (mm/memcontrol.c:453 mm/memcontrol.c:463 mm/memcontrol.c:5382)
[ 41.069103][ T135] ? lock_is_held_type (kernel/locking/lockdep.c:5382 kernel/locking/lockdep.c:5684)
[ 41.069858][ T135] css_killed_work_fn (kernel/cgroup/cgroup.c:5279 kernel/cgroup/cgroup.c:5554)
[ 41.070637][ T135] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2294)
[ 41.071459][ T135] ? rcu_read_unlock (include/linux/rcupdate.h:723 (discriminator 5))
[ 41.072308][ T135] ? pwq_dec_nr_in_flight (kernel/workqueue.c:2184)
[ 41.073231][ T135] ? rwlock_bug+0xc0/0xc0
[ 41.073922][ T135] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2437)
[ 41.074572][ T135] ? __kthread_parkme (arch/x86/include/asm/bitops.h:207 (discriminator 4) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 4) kernel/kthread.c:270 (discriminator 4))
[ 41.075220][ T135] ? schedule (arch/x86/include/asm/bitops.h:207 (discriminator 1) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 1) include/linux/thread_info.h:118 (discriminator 1) include/linux/sched.h:2154 (discriminator 1) kernel/sched/core.c:6462 (discriminator 1))
[ 41.075942][ T135] ? process_one_work (kernel/workqueue.c:2379)
[ 41.076755][ T135] ? process_one_work (kernel/workqueue.c:2379)
[ 41.077600][ T135] kthread (kernel/kthread.c:376)
[ 41.078174][ T135] ? kthread_complete_and_exit (kernel/kthread.c:331)
[ 41.078951][ T135] ret_from_fork (arch/x86/entry/entry_64.S:304)
[ 41.079668][ T135] </TASK>
[ OK ] Started Load Kernel Modules.
[ OK ] Mounted RPC Pipe File System.
[ OK ] Started Remount Root and Kernel File Systems.
[ OK ] Mounted Kernel Debug File System.
[ OK ] Mounted Huge Pages File System.
Starting Load/Save Random Seed...
Starting Create System Users...
Starting Apply Kernel Variables...
Mounting Kernel Configuration File System...
[ OK ] Started Load/Save Random Seed.
[ OK ] Started Create System Users.
[ OK ] Started Apply Kernel Variables.
[ OK ] Mounted Kernel Configuration File System.
Starting Create Static Device Nodes in /dev...
[ OK ] Started Create Static Device Nodes in /dev.
[ OK ] Reached target Local File Systems (Pre).
[ OK ] Reached target Local File Systems.
Starting Preprocess NFS configuration...
Starting udev Kernel Device Manager...
[ OK ] Started Journal Service.
[ OK ] Started Preprocess NFS configuration.
[ OK ] Reached target NFS client services.
Starting Flush Journal to Persistent Storage...
[ OK ] Started udev Kernel Device Manager.
[ OK ] Started Flush Journal to Persistent Storage.
Starting Create Volatile Files and Directories...
[ OK ] Started Create Volatile Files and Directories.
Starting Network Time Synchronization...
Starting RPC bind portmap service...
Starting Update UTMP about System Boot/Shutdown...
[ OK ] Started RPC bind portmap service.
[ OK ] Reached target RPC Port Mapper.
[ OK ] Reached target Remote File Systems (Pre).
[ OK ] Reached target Remote File Systems.
[ OK ] Started Update UTMP about System Boot/Shutdown.
[ OK ] Started Network Time Synchronization.
[ OK ] Reached target System Time Synchronized.
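Note on the splat above: both acquisitions reported by lockdep belong to the same lock class (&lruvec->lru_lock) and both are taken in lruvec_reparent_lock, at different addresses (ffff8881a1c43068 is held, ffff88815b545068 is being acquired). Lockdep reasons about lock classes rather than individual lock instances, which is why it adds "May be due to missing lock nesting notation". The kernel provides spin_lock_nested() with a subclass such as SINGLE_DEPTH_NESTING for this kind of case. Whether that annotation is the right fix for this patch is not something the report establishes. The following is a minimal userspace sketch of the class-plus-subclass idea only. It is a toy model, not kernel code, and the names toy_acquire, toy_release_all and struct held_lock are hypothetical.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy model of a lockdep-style "same class already held" check.
 * Hypothetical illustration only: real lockdep tracks far more state
 * (dependency chains, read locks, nest_lock, IRQ context, ...).
 */
#define MAX_HELD 8

struct held_lock {
    const char *class_name; /* e.g. "&lruvec->lru_lock" */
    int subclass;           /* 0 = default, 1 = one level of nesting */
};

static struct held_lock held[MAX_HELD];
static int nr_held;

/*
 * Try to record an acquisition of (class_name, subclass).
 * Returns 1 if a lock with the same class and subclass is already
 * held (the case reported as "possible recursive locking"),
 * otherwise records the lock and returns 0.
 */
static int toy_acquire(const char *class_name, int subclass)
{
    for (int i = 0; i < nr_held; i++) {
        if (strcmp(held[i].class_name, class_name) == 0 &&
            held[i].subclass == subclass)
            return 1;
    }
    assert(nr_held < MAX_HELD);
    held[nr_held].class_name = class_name;
    held[nr_held].subclass = subclass;
    nr_held++;
    return 0;
}

/* Forget every held lock, as if all of them had been released. */
static void toy_release_all(void)
{
    nr_held = 0;
}
```

In this model, calling toy_acquire("&lruvec->lru_lock", 0) twice in a row flags the second call, which mirrors the two lock(&lruvec->lru_lock) lines in the scenario above. Taking the second lock with subclass 1 is accepted, which is roughly what a nesting annotation tells lockdep. An annotation only silences the class-level check; it remains the caller's job to guarantee a consistent order between the two instances.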
To reproduce:
# build kernel
cd linux
cp config-5.18.0-00009-gbec0ae12106e .config
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-11 CC=gcc-11 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
# if you come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
View attachment "config-5.18.0-00009-gbec0ae12106e" of type "text/plain" (167146 bytes)
View attachment "job-script" of type "text/plain" (4950 bytes)
Download attachment "dmesg.xz" of type "application/x-xz" (15336 bytes)