Message-ID: <20220222134629.GC21028@xsang-OptiPlex-9020>
Date:   Tue, 22 Feb 2022 21:46:30 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Muchun Song <songmuchun@...edance.com>
Cc:     0day robot <lkp@...el.com>, LKML <linux-kernel@...r.kernel.org>,
        lkp@...ts.01.org, guro@...com, hannes@...xchg.org,
        mhocko@...nel.org, akpm@...ux-foundation.org, shakeelb@...gle.com,
        vdavydov.dev@...il.com, linux-mm@...ck.org,
        duanxiongchun@...edance.com, fam.zheng@...edance.com,
        bsingharora@...il.com, shy828301@...il.com, alexs@...nel.org,
        smuchun@...il.com, zhengqi.arch@...edance.com,
        Muchun Song <songmuchun@...edance.com>
Subject: [mm]  edd4aa55af: WARNING:possible_recursive_locking_detected



Greetings,

FYI, we noticed the following commit (built with gcc-9):

commit: edd4aa55af23c4e9844bb798bc2cd121c673a2b3 ("[PATCH v3 09/12] mm: memcontrol: use obj_cgroup APIs to charge the LRU pages")
url: https://github.com/0day-ci/linux/commits/Muchun-Song/Use-obj_cgroup-APIs-to-charge-the-LRU-pages/20220216-195348
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git c5d9ae265b105d9a67575fb67bd4650a6fc08e25
patch link: https://lore.kernel.org/lkml/20220216115132.52602-10-songmuchun@bytedance.com

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


+---------------------------------------------+------------+------------+
|                                             | c5af5b5543 | edd4aa55af |
+---------------------------------------------+------------+------------+
| boot_successes                              | 10         | 0          |
| boot_failures                               | 0          | 6          |
| WARNING:possible_recursive_locking_detected | 0          | 6          |
+---------------------------------------------+------------+------------+


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


[   63.584982][   T11] WARNING: possible recursive locking detected
[   63.586007][   T11] 5.17.0-rc4-00060-gedd4aa55af23 #1 Not tainted
[   63.587040][   T11] --------------------------------------------
[   63.588039][   T11] kworker/0:1/11 is trying to acquire lock:
[ 63.589045][ T11] ffff8881cb531068 (&lruvec->lru_lock){....}-{2:2}, at: memcg_reparent_lruvec_lock (include/linux/nodemask.h:271 mm/memcontrol.c:350) 
[   63.590703][   T11]
[   63.590703][   T11] but task is already holding lock:
[ 63.591989][ T11] ffff8881cc37a068 (&lruvec->lru_lock){....}-{2:2}, at: memcg_reparent_lruvec_lock (mm/memcontrol.c:352) 
[   63.593742][   T11]
[   63.593742][   T11] other info that might help us debug this:
[   63.595138][   T11]  Possible unsafe locking scenario:
[   63.595138][   T11]
[   63.596442][   T11]        CPU0
[   63.597075][   T11]        ----
[   63.597674][   T11]   lock(&lruvec->lru_lock);
[   63.598504][   T11]   lock(&lruvec->lru_lock);
[   63.599286][   T11]
[   63.599286][   T11]  *** DEADLOCK ***
[   63.599286][   T11]
[   63.600771][   T11]  May be due to missing lock nesting notation
[   63.600771][   T11]
[   63.602178][   T11] 4 locks held by kworker/0:1/11:
[ 63.603020][ T11] #0: ffff8881105a4938 ((wq_completion)cgroup_destroy){+.+.}-{0:0}, at: process_one_work (arch/x86/include/asm/atomic64_64.h:34 include/linux/atomic/atomic-long.h:41 include/linux/atomic/atomic-instrumented.h:1280 kernel/workqueue.c:631 kernel/workqueue.c:658 kernel/workqueue.c:2278) 
[ 63.604843][ T11] #1: ffffc900000bfdd8 ((work_completion)(&css->destroy_work)){+.+.}-{0:0}, at: process_one_work (kernel/workqueue.c:2282) 
[ 63.606798][ T11] #2: ffffffff9608d888 (cgroup_mutex){+.+.}-{3:3}, at: css_killed_work_fn (kernel/cgroup/cgroup.c:5271 kernel/cgroup/cgroup.c:5554) 
[ 63.608622][ T11] #3: ffff8881cc37a068 (&lruvec->lru_lock){....}-{2:2}, at: memcg_reparent_lruvec_lock (mm/memcontrol.c:352) 
[   63.610444][   T11]
[   63.610444][   T11] stack backtrace:
[   63.611453][   T11] CPU: 0 PID: 11 Comm: kworker/0:1 Not tainted 5.17.0-rc4-00060-gedd4aa55af23 #1
[   63.612981][   T11] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[   63.614550][   T11] Workqueue: cgroup_destroy css_killed_work_fn
[   63.615589][   T11] Call Trace:
[   63.616200][   T11]  <TASK>
[ 63.616754][ T11] dump_stack_lvl (lib/dump_stack.c:107) 
[ 63.617569][ T11] validate_chain.cold (kernel/locking/lockdep.c:2956 kernel/locking/lockdep.c:2999 kernel/locking/lockdep.c:3788) 
[ 63.618445][ T11] ? check_prev_add (kernel/locking/lockdep.c:3757) 
[ 63.619297][ T11] ? lock_is_held_type (kernel/locking/lockdep.c:5380 kernel/locking/lockdep.c:5682) 
[ 63.620165][ T11] ? ida_free (lib/idr.c:521) 
[ 63.620972][ T11] __lock_acquire (kernel/locking/lockdep.c:5027) 
[ 63.621830][ T11] ? rcu_read_lock_bh_held (kernel/rcu/update.c:120) 
[ 63.622753][ T11] lock_acquire (kernel/locking/lockdep.c:438 kernel/locking/lockdep.c:5641 kernel/locking/lockdep.c:5604) 
[ 63.623533][ T11] ? memcg_reparent_lruvec_lock (include/linux/nodemask.h:271 mm/memcontrol.c:350) 
[ 63.624542][ T11] ? rcu_read_unlock (include/linux/rcupdate.h:723 (discriminator 5)) 
[ 63.625359][ T11] ? _raw_spin_unlock_irqrestore (arch/x86/include/asm/irqflags.h:45 arch/x86/include/asm/irqflags.h:80 arch/x86/include/asm/irqflags.h:138 include/linux/spinlock_api_smp.h:151 kernel/locking/spinlock.c:194) 
[ 63.626341][ T11] ? ida_free (lib/idr.c:521) 
[ 63.627103][ T11] ? do_raw_spin_lock (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 kernel/locking/spinlock_debug.c:115) 
[ 63.627927][ T11] ? rwlock_bug+0xc0/0xc0 
[ 63.628741][ T11] _raw_spin_lock (include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) 
[ 63.629502][ T11] ? memcg_reparent_lruvec_lock (include/linux/nodemask.h:271 mm/memcontrol.c:350) 
[ 63.630548][ T11] memcg_reparent_lruvec_lock (include/linux/nodemask.h:271 mm/memcontrol.c:350) 
[ 63.631513][ T11] mem_cgroup_css_offline (mm/memcontrol.c:427 mm/memcontrol.c:458 mm/memcontrol.c:5456) 
[ 63.632408][ T11] ? lock_is_held_type (kernel/locking/lockdep.c:5380 kernel/locking/lockdep.c:5682) 
[ 63.635239][ T11] css_killed_work_fn (kernel/cgroup/cgroup.c:5277 kernel/cgroup/cgroup.c:5554) 
[ 63.636106][ T11] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:212 include/trace/events/workqueue.h:108 kernel/workqueue.c:2312) 
[ 63.636963][ T11] ? rcu_read_unlock (include/linux/rcupdate.h:723 (discriminator 5)) 
[ 63.637781][ T11] ? pwq_dec_nr_in_flight (kernel/workqueue.c:2202) 
[ 63.638743][ T11] ? rwlock_bug+0xc0/0xc0 
[ 63.639722][ T11] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2455) 
[ 63.640451][ T11] ? __kthread_parkme (arch/x86/include/asm/bitops.h:207 (discriminator 4) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 4) kernel/kthread.c:271 (discriminator 4)) 
[ 63.641202][ T11] ? schedule (arch/x86/include/asm/bitops.h:207 (discriminator 1) include/asm-generic/bitops/instrumented-non-atomic.h:135 (discriminator 1) include/linux/thread_info.h:118 (discriminator 1) include/linux/sched.h:2127 (discriminator 1) kernel/sched/core.c:6371 (discriminator 1)) 
[ 63.641868][ T11] ? process_one_work (kernel/workqueue.c:2397) 
[ 63.642720][ T11] ? process_one_work (kernel/workqueue.c:2397) 
[ 63.643540][ T11] kthread (kernel/kthread.c:377) 
[ 63.644218][ T11] ? kthread_complete_and_exit (kernel/kthread.c:332) 
[ 63.645073][ T11] ret_from_fork (arch/x86/entry/entry_64.S:301) 
[   63.645776][   T11]  </TASK>
[  OK  ] Started Load Kernel Modules.
[  OK  ] Started Remount Root and Kernel File Systems.
[  OK  ] Mounted RPC Pipe File System.
[  OK  ] Mounted Huge Pages File System.
[  OK  ] Mounted Kernel Debug File System.
Starting Load/Save Random Seed...
Starting Create System Users...
Starting Apply Kernel Variables...
Mounting Kernel Configuration File System...
[  OK  ] Started Load/Save Random Seed.
[  OK  ] Started Apply Kernel Variables.
[  OK  ] Started Create System Users.
[  OK  ] Mounted Kernel Configuration File System.
Starting Create Static Device Nodes in /dev...
[  OK  ] Started Create Static Device Nodes in /dev.
Starting udev Kernel Device Manager...
[  OK  ] Reached target Local File Systems (Pre).
[  OK  ] Reached target Local File Systems.
Starting Preprocess NFS configuration...
[  OK  ] Started Journal Service.
Starting Flush Journal to Persistent Storage...
[  OK  ] Started udev Kernel Device Manager.
[   64.066657][    C0] random: fast init done
[  OK  ] Started Preprocess NFS configuration.
[  OK  ] Reached target NFS client services.
[  OK  ] Started Flush Journal to Persistent Storage.
Starting Create Volatile Files and Directories...
[  OK  ] Started Create Volatile Files and Directories.
Starting RPC bind portmap service...
Starting Network Time Synchronization...
Starting Update UTMP about System Boot/Shutdown...
[  OK  ] Started RPC bind portmap service.
[  OK  ] Reached target RPC Port Mapper.
[  OK  ] Reached target Remote File Systems (Pre).
[  OK  ] Reached target Remote File Systems.
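
A note on reading the report above: lockdep tracks lock classes rather than individual lock instances. The lock being acquired (ffff8881cb531068) and the lock already held (ffff8881cc37a068) are two different objects of the same class, &lruvec->lru_lock, which is consistent with the "May be due to missing lock nesting notation" hint. The Python sketch below only illustrates that comparison on the two lines copied from the log. The regular expression and the helper names (parse_locks, same_class_different_instance) are assumptions made for this example and are not part of lockdep or the LKP tooling.

```python
import re

# Matches lockdep lock lines of the form:
#   <address> (<class>){<usage bits>}-{<n>:<n>}, at: <function> (...)
LOCK_RE = re.compile(
    r"\b(?P<addr>[0-9a-f]{16}) "          # lock instance address
    r"\((?P<cls>.+?)\)\{[^}]*\}"          # lock class name, then usage-state bits
    r"-\{\d+:\d+\}, at: (?P<func>\S+)"    # wait-type pair, then acquiring function
)


def parse_locks(text):
    """Return (address, lock_class, function) for every lock line in text."""
    return [(m["addr"], m["cls"], m["func"]) for m in LOCK_RE.finditer(text)]


def same_class_different_instance(wanted, held):
    """True when both locks share a class name but have different addresses."""
    return wanted[1] == held[1] and wanted[0] != held[0]


# The two lines below are copied from the report in this email.
report = """\
[ 63.589045][ T11] ffff8881cb531068 (&lruvec->lru_lock){....}-{2:2}, at: memcg_reparent_lruvec_lock (include/linux/nodemask.h:271 mm/memcontrol.c:350)
[ 63.591989][ T11] ffff8881cc37a068 (&lruvec->lru_lock){....}-{2:2}, at: memcg_reparent_lruvec_lock (mm/memcontrol.c:352)
"""

wanted, held = parse_locks(report)
print(wanted[1], wanted[0], held[0], same_class_different_instance(wanted, held))
```

Whether the right change is a nesting annotation such as spin_lock_nested() or a different locking order is left to the patch author; the sketch only shows why lockdep treats the two acquisitions as the same class.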


To reproduce:

        # build kernel
        cd linux
        cp config-5.17.0-rc4-00060-gedd4aa55af23 .config
        make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
        make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
        cd <mod-install-dir>
        find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.



---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation

Thanks,
Oliver Sang


View attachment "config-5.17.0-rc4-00060-gedd4aa55af23" of type "text/plain" (178643 bytes)

View attachment "job-script" of type "text/plain" (4973 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (15760 bytes)
