Message-ID: <202210072241.2b3cc734-oliver.sang@intel.com>
Date: Fri, 7 Oct 2022 22:42:57 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Valentin Schneider <vschneid@...hat.com>
CC: <lkp@...ts.01.org>, <lkp@...el.com>,
<linux-kernel@...r.kernel.org>, <linux-block@...r.kernel.org>,
Jens Axboe <axboe@...nel.dk>,
Yury Norov <yury.norov@...il.com>,
Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
Rasmus Villemoes <linux@...musvillemoes.dk>
Subject: [lib/cpumask] e5ad41dae2: BUG:workqueue_lockup-pool
Greetings,
FYI, we noticed the following commit (built with gcc-11):
commit: e5ad41dae251946ecdcdc38bb8f639cd55a8eae1 ("[RFC PATCH bitmap-for-next 2/4] lib/cpumask: Fix cpumask_check() warning in cpumask_next_wrap*()")
url: https://github.com/intel-lab-lkp/linux/commits/Valentin-Schneider/lib-cpumask-blk_mq-Fix-blk_mq_hctx_next_cpu-vs-cpumask_check/20221006-202402
base: https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-next
patch link: https://lore.kernel.org/linux-block/20221006122112.663119-3-vschneid@redhat.com
in testcase: boot
on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
+----------------------------------------------+------------+------------+
|                                              | d8e0ef5a1d | e5ad41dae2 |
+----------------------------------------------+------------+------------+
| boot_successes                               | 10         | 0          |
| boot_failures                                | 0          | 10         |
| BUG:workqueue_lockup-pool                    | 0          | 10         |
| INFO:rcu_sched_detected_stalls_on_CPUs/tasks | 0          | 10         |
| BUG:kernel_hang_in_boot_stage                | 0          | 10         |
+----------------------------------------------+------------+------------+
If you fix the issue, kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Link: https://lore.kernel.org/r/202210072241.2b3cc734-oliver.sang@intel.com
[ 60.568059][ C0] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 58s!
[ 60.569057][ C0] Showing busy workqueues and worker pools:
[ 60.569663][ C0] workqueue events: flags=0x0
[ 60.570057][ C0] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[ 60.570064][ C0] pending: vmstat_shepherd
[ 90.776058][ C0] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 88s!
[ 90.777057][ C0] Showing busy workqueues and worker pools:
[ 90.777819][ C0] workqueue events: flags=0x0
[ 90.778056][ C0] pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256 refcnt=2
[ 90.778065][ C0] pending: vmstat_shepherd
[ 105.234045][ C0] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 105.234045][ C0] (detected by 0, t=105002 jiffies, g=-1195, q=1 ncpus=2)
[ 105.234045][ C0] rcu: All QSes seen, last rcu_sched kthread activity 105002 (-194950--299952), jiffies_till_next_fqs=3, root ->qsmask 0x0
[ 105.234045][ C0] rcu: rcu_sched kthread starved for 105002 jiffies! g-1195 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 105.234045][ C0] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior.
[ 105.234045][ C0] rcu: RCU grace-period kthread stack dump:
[ 105.234045][ C0] task:rcu_sched state:R running task stack: 7484 pid: 11 ppid: 2 flags:0x00004000
[ 105.234045][ C0] Call Trace:
[ 105.234045][ C0] ? __schedule+0x58a/0x5b8
[ 105.234045][ C0] ? schedule+0x83/0xba
[ 105.234045][ C0] ? schedule_timeout+0x88/0xa5
[ 105.234045][ C0] ? del_timer_sync+0x7d/0x7d
[ 105.234045][ C0] ? rcu_gp_fqs_loop+0xef/0x294
[ 105.234045][ C0] ? rcu_gp_kthread+0xd4/0xf0
[ 105.234045][ C0] ? kthread+0xc0/0xc5
[ 105.234045][ C0] ? rcu_gp_init+0x4c4/0x4c4
[ 105.234045][ C0] ? kthread_complete_and_exit+0x1b/0x1b
[ 105.234045][ C0] ? ret_from_fork+0x19/0x24
[ 105.234045][ C0] rcu: Stack dump where RCU GP kthread last ran:
[ 105.234045][ C0] NMI backtrace for cpu 0
[ 105.234045][ C0] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.0.0-rc7-00395-ge5ad41dae251 #1
[ 105.234045][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[ 105.234045][ C0] Call Trace:
[ 105.234045][ C0] ? dump_stack_lvl+0x42/0x54
[ 105.234045][ C0] ? dump_stack+0xd/0x10
[ 105.234045][ C0] ? nmi_cpu_backtrace+0x96/0xb8
[ 105.234045][ C0] ? lapic_can_unplug_cpu+0x87/0x87
[ 105.234045][ C0] ? nmi_trigger_cpumask_backtrace+0x49/0xac
[ 105.234045][ C0] ? arch_trigger_cpumask_backtrace+0x15/0x17
[ 105.234045][ C0] ? rcu_check_gp_kthread_starvation+0x122/0x131
[ 105.234045][ C0] ? print_other_cpu_stall+0x264/0x2a9
[ 105.234045][ C0] ? print_other_cpu_stall+0x297/0x2a9
[ 105.234045][ C0] ? check_cpu_stall+0x174/0x1bd
[ 105.234045][ C0] ? rcu_sched_clock_irq+0xd7/0x186
[ 105.234045][ C0] ? update_process_times+0x45/0x60
[ 105.234045][ C0] ? tick_periodic+0xc0/0xcc
[ 105.234045][ C0] ? tick_handle_periodic+0x22/0x66
[ 105.234045][ C0] ? sysvec_call_function_single+0x2c/0x2c
[ 105.234045][ C0] ? __sysvec_apic_timer_interrupt+0xe4/0x182
[ 105.234045][ C0] ? sysvec_apic_timer_interrupt+0x1b/0x2e
[ 105.234045][ C0] ? handle_exception+0x133/0x133
[ 105.234045][ C0] ? rmi_firmware_update+0x3ab/0x3f7
[ 105.234045][ C0] ? sysvec_call_function_single+0x2c/0x2c
[ 105.234045][ C0] ? build_sched_domains+0x1e5/0x71c
[ 105.234045][ C0] ? sysvec_call_function_single+0x2c/0x2c
[ 105.234045][ C0] ? build_sched_domains+0x1e5/0x71c
[ 105.234045][ C0] ? sched_init_domains+0x73/0x77
[ 105.234045][ C0] ? sched_init_smp+0x26/0x6c
[ 105.234045][ C0] ? kernel_init_freeable+0x143/0x195
[ 105.234045][ C0] ? rest_init+0x13a/0x13a
[ 105.234045][ C0] ? kernel_init+0x17/0xf3
[ 105.234045][ C0] ? ret_from_fork+0x19/0x24
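A rough way to estimate when the pool stopped making progress is to subtract the reported "stuck for" time from the timestamp of each lockup line in the log above. The sketch below does that arithmetic on the two lockup lines quoted in this report. The regular expression, the function name `lockup_onsets`, and the `SAMPLE` string are illustrative assumptions, not part of lkp-tests or the kernel. The "stuck for" value appears to be reported in whole seconds, so each estimate is only accurate to about a second.

```python
import re

# Matches lines such as:
# [ 60.568059][ C0] BUG: workqueue lockup - pool cpus=0 ... stuck for 58s!
# (hypothetical pattern written for this report's log format)
LOCKUP_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\[\s*C\d+\]\s+"
    r"BUG: workqueue lockup - pool .* stuck for (?P<stuck>\d+)s!"
)


def lockup_onsets(dmesg_text):
    """Return (timestamp, stuck_seconds, estimated_onset) per lockup line."""
    results = []
    for line in dmesg_text.splitlines():
        m = LOCKUP_RE.search(line)
        if m:
            ts = float(m.group("ts"))
            stuck = int(m.group("stuck"))
            # Onset estimate: log time minus how long the pool was stuck.
            results.append((ts, stuck, ts - stuck))
    return results


# The two lockup lines from the log above, copied verbatim.
SAMPLE = """\
[ 60.568059][ C0] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 58s!
[ 90.776058][ C0] BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 88s!
"""

if __name__ == "__main__":
    for ts, stuck, onset in lockup_onsets(SAMPLE):
        print(f"t={ts:.3f}s stuck={stuck}s -> last progress near t={onset:.1f}s")
```

Both lines point to roughly the same moment, a few seconds into boot, which is consistent with the hang being reported in the boot stage. To run it on the full log, decompress the attached dmesg.xz and pass its text to `lockup_onsets()`.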
To reproduce:
# build kernel
cd linux
cp config-6.0.0-rc7-00395-ge5ad41dae251 .config
make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
make HOSTCC=gcc-11 CC=gcc-11 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
cd <mod-install-dir>
find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz
git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
# if you come across any failure that blocks the test,
# please remove the ~/.lkp and /lkp directories to run from a clean state.
--
0-DAY CI Kernel Test Service
https://01.org/lkp
View attachment "config-6.0.0-rc7-00395-ge5ad41dae251" of type "text/plain" (165111 bytes)
View attachment "job-script" of type "text/plain" (5089 bytes)
Download attachment "dmesg.xz" of type "application/x-xz" (8640 bytes)