lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20220519143252.GD1907@xsang-OptiPlex-9020>
Date:   Thu, 19 May 2022 22:32:52 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Jchao Sun <sunjunchao2870@...il.com>
Cc:     0day robot <lkp@...el.com>, LKML <linux-kernel@...r.kernel.org>,
        lkp@...ts.01.org, linux-fsdevel@...r.kernel.org,
        viro@...iv.linux.org.uk, jack@...e.cz,
        Jchao Sun <sunjunchao2870@...il.com>
Subject: [writeback]  b47372dac9:
 BUG:soft_lockup-CPU##stuck_for#s![kworker/u16:#:#]



Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: b47372dac98f419466a5459b4eb98f9f88b7aa38 ("[PATCH v4] writeback: Fix inode->i_io_list not be protected by inode->i_lock error")
url: https://github.com/intel-lab-lkp/linux/commits/Jchao-Sun/writeback-Fix-inode-i_io_list-not-be-protected-by-inode-i_lock-error/20220511-221801
base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git feb9c5e19e913b53cb536a7aa7c9f20107bb51ec
patch link: https://lore.kernel.org/linux-fsdevel/20220511141518.1895-1-sunjunchao2870@gmail.com

in testcase: xfstests
version: xfstests-x86_64-46e1b83-1_20220414
with following parameters:

	disk: 4HDD
	fs: f2fs
	test: generic-group-03
	ucode: 0x21

test-description: xfstests is a regression test suite for xfs and other files ystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git


on test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-3770K CPU @ 3.50GHz with 16G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@...el.com>


[  116.563986][    C1] watchdog: BUG: soft lockup - CPU#1 stuck for 26s! [kworker/u16:3:56]
[  116.564176][    C1] Modules linked in: dm_mod f2fs crc32_generic netconsole btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c intel_rapl_msr intel_rapl_common i915 sd_mod t10_pi intel_gtt x86_pkg_temp_thermal drm_buddy intel_powerclamp crc64_rocksoft_generic drm_dp_helper crc64_rocksoft crc64 ttm sg coretemp ipmi_devintf kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ipmi_msghandler crc32c_intel ghash_clmulni_intel rapl drm_kms_helper intel_cstate syscopyarea firewire_ohci sysfillrect sysimgblt ahci fb_sys_fops firewire_core libahci intel_uncore crc_itu_t drm libata mxm_wmi wmi video ip_tables
[  116.565104][    C1] CPU: 1 PID: 56 Comm: kworker/u16:3 Not tainted 5.18.0-rc6-00010-gb47372dac98f #1
[  116.565265][    C1] Hardware name:  /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012
[  116.565439][    C1] Workqueue: writeback wb_workfn (flush-8:16)
[ 116.565547][ C1] RIP: 0010:native_queued_spin_lock_slowpath (kernel/locking/qspinlock.c:382) 
[ 116.565669][ C1] Code: 37 48 b8 00 00 00 00 00 fc ff df 49 89 dc 48 89 dd 49 c1 ec 03 83 e5 07 49 01 c4 83 c5 03 f3 90 41 0f b6 04 24 40 38 c5 7c 08 <84> c0 0f 85 9a 06 00 00 8b 03 84 c0 75 e6 48 b8 00 00 00 00 00 fc
All code
========
   0:	37                   	(bad)  
   1:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
   8:	fc ff df 
   b:	49 89 dc             	mov    %rbx,%r12
   e:	48 89 dd             	mov    %rbx,%rbp
  11:	49 c1 ec 03          	shr    $0x3,%r12
  15:	83 e5 07             	and    $0x7,%ebp
  18:	49 01 c4             	add    %rax,%r12
  1b:	83 c5 03             	add    $0x3,%ebp
  1e:	f3 90                	pause  
  20:	41 0f b6 04 24       	movzbl (%r12),%eax
  25:	40 38 c5             	cmp    %al,%bpl
  28:	7c 08                	jl     0x32
  2a:*	84 c0                	test   %al,%al		<-- trapping instruction
  2c:	0f 85 9a 06 00 00    	jne    0x6cc
  32:	8b 03                	mov    (%rbx),%eax
  34:	84 c0                	test   %al,%al
  36:	75 e6                	jne    0x1e
  38:	48                   	rex.W
  39:	b8 00 00 00 00       	mov    $0x0,%eax
  3e:	00 fc                	add    %bh,%ah

Code starting with the faulting instruction
===========================================
   0:	84 c0                	test   %al,%al
   2:	0f 85 9a 06 00 00    	jne    0x6a2
   8:	8b 03                	mov    (%rbx),%eax
   a:	84 c0                	test   %al,%al
   c:	75 e6                	jne    0xfffffffffffffff4
   e:	48                   	rex.W
   f:	b8 00 00 00 00       	mov    $0x0,%eax
  14:	00 fc                	add    %bh,%ah
[  116.565997][    C1] RSP: 0018:ffffc900004477b0 EFLAGS: 00000206
[  116.566111][    C1] RAX: 0000000000000000 RBX: ffff8884095d7218 RCX: ffffffff812ee01d
[  116.566240][    C1] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff8884095d7218
[  116.566370][    C1] RBP: 0000000000000003 R08: 0000000000000000 R09: ffff8884095d721b
[  116.566499][    C1] R10: ffffed10812bae43 R11: 0000000000000001 R12: ffffed10812bae43
[  116.566628][    C1] R13: ffff888413631180 R14: 0000000000000001 R15: 1ffff92000088ef7
[  116.566758][    C1] FS:  0000000000000000(0000) GS:ffff888361e80000(0000) knlGS:0000000000000000
[  116.566903][    C1] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  116.567031][    C1] CR2: 00007f76e3ce3180 CR3: 000000041d60e004 CR4: 00000000001706e0
[  116.567163][    C1] Call Trace:
[  116.567223][    C1]  <TASK>
[ 116.567279][ C1] ? .slowpath (kernel/locking/qspinlock.c:316) 
[ 116.567357][ C1] ? __get_node_page (fs/f2fs/node.c:35 fs/f2fs/node.c:1410) f2fs
[ 116.567495][ C1] ? _raw_spin_lock (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) 
[ 116.567584][ C1] _raw_spin_lock (arch/x86/include/asm/paravirt.h:591 arch/x86/include/asm/qspinlock.h:51 include/asm-generic/qspinlock.h:85 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) 
[ 116.567664][ C1] ? _raw_write_lock_irq (kernel/locking/spinlock.c:153) 
[ 116.567753][ C1] ? __inode_add_lru (arch/x86/include/asm/atomic.h:29 include/linux/atomic/atomic-instrumented.h:28 fs/inode.c:458) 
[ 116.567839][ C1] writeback_sb_inodes (fs/fs-writeback.c:1812) 
[ 116.567927][ C1] ? sync_inode_metadata (fs/fs-writeback.c:1768) 
[ 116.568035][ C1] ? wb_io_lists_populated (arch/x86/include/asm/bitops.h:207 include/asm-generic/bitops/instrumented-non-atomic.h:135 include/linux/backing-dev.h:55 fs/fs-writeback.c:87) 
[ 116.568129][ C1] ? queue_io (fs/fs-writeback.c:1427) 
[ 116.568205][ C1] wb_writeback (fs/fs-writeback.c:2029) 
[ 116.568283][ C1] ? __writeback_inodes_wb (fs/fs-writeback.c:1980) 
[ 116.568376][ C1] ? _raw_spin_lock_bh (arch/x86/include/asm/atomic.h:202 include/linux/atomic/atomic-instrumented.h:543 include/asm-generic/qspinlock.h:82 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:127 kernel/locking/spinlock.c:178) 
[ 116.568463][ C1] ? _raw_spin_lock (kernel/locking/spinlock.c:177) 
[ 116.568548][ C1] ? pointer (lib/vsprintf.c:2737) 
[ 116.568624][ C1] wb_do_writeback (fs/fs-writeback.c:2173) 
[ 116.568706][ C1] ? set_worker_desc (kernel/workqueue.c:4645) 
[ 116.568791][ C1] ? wb_writeback (fs/fs-writeback.c:2165) 
[ 116.568872][ C1] ? update_load_avg (kernel/sched/fair.c:3647 kernel/sched/fair.c:3902) 
[ 116.568958][ C1] wb_workfn (fs/fs-writeback.c:2212 (discriminator 1)) 
[ 116.569052][ C1] ? inode_wait_for_writeback (fs/fs-writeback.c:2196) 
[ 116.569147][ C1] ? __switch_to (arch/x86/include/asm/bitops.h:55 include/asm-generic/bitops/instrumented-atomic.h:29 include/linux/thread_info.h:89 arch/x86/include/asm/fpu/sched.h:65 arch/x86/kernel/process_64.c:622) 
[ 116.569228][ C1] ? __schedule (kernel/sched/core.c:6267) 
[ 116.569311][ C1] ? read_word_at_a_time (include/asm-generic/rwonce.h:86) 
[ 116.569398][ C1] ? strscpy (lib/string.c:203) 
[ 116.569473][ C1] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2294) 
[ 116.569558][ C1] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2437) 
[ 116.569638][ C1] ? process_one_work (kernel/workqueue.c:2379) 
[ 116.569727][ C1] kthread (kernel/kthread.c:376) 
[ 116.569798][ C1] ? kthread_complete_and_exit (kernel/kthread.c:331) 
[ 116.569895][ C1] ret_from_fork (arch/x86/entry/entry_64.S:304) 
[  116.569984][    C1]  </TASK>
[  116.570050][    C1] Kernel panic - not syncing: softlockup: hung tasks
[  116.570161][    C1] CPU: 1 PID: 56 Comm: kworker/u16:3 Tainted: G             L    5.18.0-rc6-00010-gb47372dac98f #1
[  116.570332][    C1] Hardware name:  /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012
[  116.570477][    C1] Workqueue: writeback wb_workfn (flush-8:16)
[  116.570582][    C1] Call Trace:
[  116.570640][    C1]  <IRQ>
[ 116.570694][ C1] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1)) 
[ 116.570773][ C1] panic (kernel/panic.c:268) 
[ 116.570844][ C1] ? panic_print_sys_info+0x60/0x60 
[ 116.570946][ C1] ? add_taint (arch/x86/include/asm/bitops.h:60 include/asm-generic/bitops/instrumented-atomic.h:29 kernel/panic.c:464) 
[  116.570986][    C7] watchdog: BUG: soft lockup - CPU#7 stuck for 26s! [kworker/u16:5:58]
[  116.570991][    C7] Modules linked in: dm_mod f2fs crc32_generic netconsole btrfs blake2b_generic xor raid6_pq zstd_compress libcrc32c intel_rapl_msr intel_rapl_common i915 sd_mod t10_pi intel_gtt x86_pkg_temp_thermal drm_buddy intel_powerclamp crc64_rocksoft_generic drm_dp_helper crc64_rocksoft crc64 ttm sg coretemp ipmi_devintf kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ipmi_msghandler crc32c_intel ghash_clmulni_intel rapl drm_kms_helper intel_cstate syscopyarea firewire_ohci sysfillrect sysimgblt ahci fb_sys_fops firewire_core libahci intel_uncore crc_itu_t drm libata mxm_wmi wmi video ip_tables
[  116.571041][    C7] CPU: 7 PID: 58 Comm: kworker/u16:5 Tainted: G             L    5.18.0-rc6-00010-gb47372dac98f #1
[  116.571044][    C7] Hardware name:  /DZ77BH-55K, BIOS BHZ7710H.86A.0097.2012.1228.1346 12/28/2012
[  116.571054][    C7] Workqueue: writeback wb_update_bandwidth_workfn
[ 116.571059][ C7] RIP: 0010:native_queued_spin_lock_slowpath (kernel/locking/qspinlock.c:382) 
[ 116.571063][ C7] Code: 37 48 b8 00 00 00 00 00 fc ff df 49 89 dc 48 89 dd 49 c1 ec 03 83 e5 07 49 01 c4 83 c5 03 f3 90 41 0f b6 04 24 40 38 c5 7c 08 <84> c0 0f 85 9a 06 00 00 8b 03 84 c0 75 e6 48 b8 00 00 00 00 00 fc
All code
========
   0:	37                   	(bad)  
   1:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
   8:	fc ff df 
   b:	49 89 dc             	mov    %rbx,%r12
   e:	48 89 dd             	mov    %rbx,%rbp
  11:	49 c1 ec 03          	shr    $0x3,%r12
  15:	83 e5 07             	and    $0x7,%ebp
  18:	49 01 c4             	add    %rax,%r12
  1b:	83 c5 03             	add    $0x3,%ebp
  1e:	f3 90                	pause  
  20:	41 0f b6 04 24       	movzbl (%r12),%eax
  25:	40 38 c5             	cmp    %al,%bpl
  28:	7c 08                	jl     0x32
  2a:*	84 c0                	test   %al,%al		<-- trapping instruction
  2c:	0f 85 9a 06 00 00    	jne    0x6cc
  32:	8b 03                	mov    (%rbx),%eax
  34:	84 c0                	test   %al,%al
  36:	75 e6                	jne    0x1e
  38:	48                   	rex.W
  39:	b8 00 00 00 00       	mov    $0x0,%eax
  3e:	00 fc                	add    %bh,%ah

Code starting with the faulting instruction
===========================================
   0:	84 c0                	test   %al,%al
   2:	0f 85 9a 06 00 00    	jne    0x6a2
   8:	8b 03                	mov    (%rbx),%eax
   a:	84 c0                	test   %al,%al
   c:	75 e6                	jne    0xfffffffffffffff4
   e:	48                   	rex.W
   f:	b8 00 00 00 00       	mov    $0x0,%eax
  14:	00 fc                	add    %bh,%ah
[  116.571065][    C7] RSP: 0018:ffffc90000467b60 EFLAGS: 00000206
[  116.571068][    C7] RAX: 0000000000000000 RBX: ffff888418e4e0b8 RCX: ffffffff812ee01d
[  116.571069][    C7] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffff888418e4e0b8
[  116.571071][    C7] RBP: 0000000000000003 R08: 0000000000000000 R09: ffff888418e4e0bb
[  116.571072][    C7] R10: ffffed10831c9c17 R11: 0000000000000001 R12: ffffed10831c9c17
[  116.571074][    C7] R13: ffff888418e4e060 R14: 0000000000000001 R15: 1ffff9200008cf6d
[  116.571075][    C7] FS:  0000000000000000(0000) GS:ffff888362180000(0000) knlGS:0000000000000000
[  116.571077][    C7] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  116.571079][    C7] CR2: 00007f8c1d498710 CR3: 000000041d60e003 CR4: 00000000001706e0
[  116.571080][    C7] Call Trace:
[  116.571082][    C7]  <TASK>
[ 116.571083][ C7] ? .slowpath (kernel/locking/qspinlock.c:316) 
[ 116.571086][ C7] ? out_of_line_wait_on_bit_timeout (kernel/sched/completion.c:100) 
[ 116.571089][ C7] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161) 
[ 116.571092][ C7] ? newidle_balance (kernel/sched/fair.c:10172 kernel/sched/fair.c:10183 kernel/sched/fair.c:10945) 
[ 116.571095][ C7] _raw_spin_lock (arch/x86/include/asm/paravirt.h:591 arch/x86/include/asm/qspinlock.h:51 include/asm-generic/qspinlock.h:85 include/linux/spinlock.h:185 include/linux/spinlock_api_smp.h:134 kernel/locking/spinlock.c:154) 
[ 116.571098][ C7] ? _raw_write_lock_irq (kernel/locking/spinlock.c:153) 
[ 116.571100][ C7] ? update_curr (include/linux/cgroup.h:482 include/linux/cgroup.h:551 include/linux/cgroup.h:795 kernel/sched/fair.c:909) 
[ 116.571102][ C7] ? start_poll_synchronize_srcu (kernel/rcu/srcutree.c:913) 
[ 116.571106][ C7] ? load_balance (kernel/sched/fair.c:10902) 
[ 116.571109][ C7] __wb_update_bandwidth (mm/page-writeback.c:1339) 
[ 116.571112][ C7] wb_update_bandwidth (mm/page-writeback.c:1365) 
[ 116.571115][ C7] ? wb_calc_thresh (mm/page-writeback.c:1365) 
[ 116.571117][ C7] ? strscpy (lib/string.c:203) 
[ 116.571120][ C7] process_one_work (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/workqueue.h:108 kernel/workqueue.c:2294) 
[ 116.571124][ C7] worker_thread (include/linux/list.h:292 kernel/workqueue.c:2437) 
[ 116.571126][ C7] ? process_one_work (kernel/workqueue.c:2379) 
[ 116.571129][ C7] kthread (kernel/kthread.c:376) 
[ 116.571131][ C7] ? kthread_complete_and_exit (kernel/kthread.c:331) 
[ 116.571133][ C7] ret_from_fork (arch/x86/entry/entry_64.S:304) 
[  116.571137][    C7]  </TASK>
[ 116.575850][ C1] watchdog_timer_fn.cold (kernel/watchdog.c:435) 
[ 116.575943][ C1] ? lockup_detector_update_enable (kernel/watchdog.c:355) 
[ 116.576066][ C1] __hrtimer_run_queues (kernel/time/hrtimer.c:1685 kernel/time/hrtimer.c:1749) 
[ 116.576157][ C1] ? __remove_hrtimer (kernel/time/hrtimer.c:1719) 
[ 116.576245][ C1] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161) 
[ 116.576343][ C1] ? recalibrate_cpu_khz (arch/x86/include/asm/msr.h:224 arch/x86/kernel/tsc.c:1095) 
[ 116.576431][ C1] ? ktime_get_update_offsets_now (kernel/time/timekeeping.c:290 (discriminator 3) kernel/time/timekeeping.c:386 (discriminator 3) kernel/time/timekeeping.c:2299 (discriminator 3)) 
[ 116.576533][ C1] hrtimer_interrupt (kernel/time/hrtimer.c:1814) 
[ 116.576618][ C1] __sysvec_apic_timer_interrupt (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 arch/x86/include/asm/trace/irq_vectors.h:41 arch/x86/kernel/apic/apic.c:1104) 
[ 116.576721][ C1] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1097 (discriminator 14)) 
[  116.576819][    C1]  </IRQ>
[  116.576874][    C1]  <TASK>
[ 116.576929][ C1] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:645) 
[ 116.577050][ C1] RIP: 0010:native_queued_spin_lock_slowpath (kernel/locking/qspinlock.c:382) 
[ 116.577171][ C1] Code: 37 48 b8 00 00 00 00 00 fc ff df 49 89 dc 48 89 dd 49 c1 ec 03 83 e5 07 49 01 c4 83 c5 03 f3 90 41 0f b6 04 24 40 38 c5 7c 08 <84> c0 0f 85 9a 06 00 00 8b 03 84 c0 75 e6 48 b8 00 00 00 00 00 fc
All code
========
   0:	37                   	(bad)  
   1:	48 b8 00 00 00 00 00 	movabs $0xdffffc0000000000,%rax
   8:	fc ff df 
   b:	49 89 dc             	mov    %rbx,%r12
   e:	48 89 dd             	mov    %rbx,%rbp
  11:	49 c1 ec 03          	shr    $0x3,%r12
  15:	83 e5 07             	and    $0x7,%ebp
  18:	49 01 c4             	add    %rax,%r12
  1b:	83 c5 03             	add    $0x3,%ebp
  1e:	f3 90                	pause  
  20:	41 0f b6 04 24       	movzbl (%r12),%eax
  25:	40 38 c5             	cmp    %al,%bpl
  28:	7c 08                	jl     0x32
  2a:*	84 c0                	test   %al,%al		<-- trapping instruction
  2c:	0f 85 9a 06 00 00    	jne    0x6cc
  32:	8b 03                	mov    (%rbx),%eax
  34:	84 c0                	test   %al,%al
  36:	75 e6                	jne    0x1e
  38:	48                   	rex.W
  39:	b8 00 00 00 00       	mov    $0x0,%eax
  3e:	00 fc                	add    %bh,%ah

Code starting with the faulting instruction
===========================================
   0:	84 c0                	test   %al,%al
   2:	0f 85 9a 06 00 00    	jne    0x6a2
   8:	8b 03                	mov    (%rbx),%eax
   a:	84 c0                	test   %al,%al
   c:	75 e6                	jne    0xfffffffffffffff4
   e:	48                   	rex.W
   f:	b8 00 00 00 00       	mov    $0x0,%eax
  14:	00 fc                	add    %bh,%ah


To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.



-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-5.18.0-rc6-00010-gb47372dac98f" of type "text/plain" (166090 bytes)

View attachment "job-script" of type "text/plain" (6022 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (5832 bytes)

View attachment "xfstests" of type "text/plain" (20315 bytes)

View attachment "job.yaml" of type "text/plain" (4712 bytes)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ