Date:   Wed, 30 Mar 2022 18:15:54 +0800
From:   kernel test robot <oliver.sang@...el.com>
To:     Nicolas Saenz Julienne <nsaenzju@...hat.com>
Cc:     lkp@...ts.01.org, lkp@...el.com,
        LKML <linux-kernel@...r.kernel.org>
Subject: [mm/page_alloc]  d74b7e8fef: BUG:spinlock_trylock_failure_on_UP_on_CPU



Greetings,

FYI, we noticed the following commit (built with clang-15):

commit: d74b7e8fef5d9d1c6d7aba7b2ac898f77081a18a ("mm/page_alloc: Avoid disabling interruptions on hot paths")
https://git.kernel.org/cgit/linux/kernel/git/nsaenz/linux-rpi.git pcpdrain-sl-v3r1

in testcase: boot

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):



If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <oliver.sang@...el.com>


[  595.240111][    C0] BUG: spinlock trylock failure on UP on CPU#0, boot-1-aliyun-x/3533
[  595.241444][    C0]  lock: 0xffff88843ffdcc70, .magic: dead4ead, .owner: boot-1-aliyun-x/3533, .owner_cpu: 0
[  595.243114][    C0] CPU: 0 PID: 3533 Comm: boot-1-aliyun-x Not tainted 5.17.0-04447-gd74b7e8fef5d #1
[  595.244782][    C0] Call Trace:
[  595.245358][    C0]  <IRQ>
[ 595.245884][ C0] dump_stack_lvl (lib/dump_stack.c:108) 
[ 595.246661][ C0] dump_stack (lib/dump_stack.c:114) 
[ 595.247387][ C0] spin_bug (kernel/locking/spinlock_debug.c:? kernel/locking/spinlock_debug.c:77) 
[ 595.248093][ C0] do_raw_spin_trylock (include/linux/spinlock_up.h:42 kernel/locking/spinlock_debug.c:122) 
[ 595.249033][ C0] _raw_spin_trylock (include/linux/spinlock_api_smp.h:89 kernel/locking/spinlock.c:138) 
[ 595.249903][ C0] __rmqueue_pcplist (include/linux/spinlock.h:359 mm/page_alloc.c:3547) 
[ 595.250816][ C0] ? tcp_data_ready (net/ipv4/tcp_input.c:4980) 
[ 595.251730][ C0] ? tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2127) 
[ 595.252576][ C0] ? rcu_read_lock_sched_held (include/linux/lockdep.h:283 kernel/rcu/update.c:125) 
[ 595.253620][ C0] ? zone_watermark_fast (mm/page_alloc.c:3763 mm/page_alloc.c:3876) 
[ 595.254595][ C0] get_page_from_freelist (mm/page_alloc.c:3603 mm/page_alloc.c:3631 mm/page_alloc.c:4096) 
[ 595.255550][ C0] ? validate_chain (kernel/locking/lockdep.c:3696 kernel/locking/lockdep.c:3716 kernel/locking/lockdep.c:3771) 
[ 595.256470][ C0] ? fs_reclaim_release (include/linux/sched/mm.h:190 mm/page_alloc.c:4516) 
[ 595.257434][ C0] ? prepare_alloc_pages (include/linux/mmzone.h:1194 include/linux/mmzone.h:1220 mm/page_alloc.c:5115) 
[ 595.258394][ C0] __alloc_pages (mm/page_alloc.c:?) 
[ 595.259246][ C0] ? slob_alloc (mm/slob.c:358) 
[ 595.260089][ C0] ? rcu_read_lock_sched_held (include/linux/lockdep.h:283 kernel/rcu/update.c:125) 
[ 595.261077][ C0] ? lockdep_hardirqs_on_prepare (kernel/locking/lockdep.c:4208 kernel/locking/lockdep.c:4226 kernel/locking/lockdep.c:4294) 
[ 595.262178][ C0] slob_new_pages (mm/slob.c:202) 
[ 595.263008][ C0] slob_alloc (mm/slob.c:360) 
[ 595.263786][ C0] ? __napi_alloc_skb (net/core/skbuff.c:570) 
[ 595.264769][ C0] __kmalloc_track_caller (mm/slob.c:503 mm/slob.c:533) 
[ 595.265774][ C0] ? __napi_alloc_skb (net/core/skbuff.c:570) 
[ 595.266726][ C0] ? __napi_alloc_skb (net/core/skbuff.c:570) 
[ 595.267561][ C0] __alloc_skb (net/core/skbuff.c:354 net/core/skbuff.c:426) 
[ 595.268373][ C0] __napi_alloc_skb (net/core/skbuff.c:570) 
[ 595.269275][ C0] e1000_clean_rx_irq (include/linux/skbuff.h:3005 drivers/net/ethernet/intel/e1000/e1000_main.c:4112 drivers/net/ethernet/intel/e1000/e1000_main.c:4331 drivers/net/ethernet/intel/e1000/e1000_main.c:4383) 
[ 595.270213][ C0] ? e1000_alloc_jumbo_rx_buffers (drivers/net/ethernet/intel/e1000/e1000_main.c:4353) 
[ 595.271290][ C0] e1000_clean (drivers/net/ethernet/intel/e1000/e1000_main.c:3933 drivers/net/ethernet/intel/e1000/e1000_main.c:3801) 
[ 595.272132][ C0] ? __lock_acquire (kernel/locking/lockdep.c:?) 
[ 595.273066][ C0] __napi_poll (net/core/dev.c:6365) 
[ 595.273844][ C0] net_rx_action (net/core/dev.c:6432 net/core/dev.c:6519) 
[ 595.274692][ C0] __do_softirq (arch/x86/include/asm/atomic.h:29 include/linux/atomic/atomic-instrumented.h:28 include/linux/jump_label.h:261 include/linux/jump_label.h:271 include/trace/events/irq.h:142 kernel/softirq.c:559) 
[ 595.275537][ C0] ? handle_fasteoi_irq (kernel/irq/chip.c:722) 
[ 595.276450][ C0] __irq_exit_rcu (kernel/softirq.c:640) 
[ 595.277246][ C0] irq_exit_rcu (kernel/softirq.c:651) 
[ 595.278011][ C0] common_interrupt (arch/x86/kernel/irq.c:240) 
[  595.278890][    C0]  </IRQ>
[  595.279428][    C0]  <TASK>
[ 595.279974][ C0] asm_common_interrupt (??:?) 
[ 595.280922][ C0] RIP: 0010:kcsan_setup_watchpoint (kernel/kcsan/core.c:357 kernel/kcsan/core.c:693) 
[ 595.282042][ C0] Code: 95 fb ff 48 c7 45 c8 00 00 00 00 9c 8f 45 c8 f7 45 c8 00 02 00 00 0f 85 90 00 00 00 f7 45 98 00 02 00 00 74 01 fb 48 8b 43 30 <49> 89 47 30 48 8b 43 28 49 89 47 28 48 8b 43 20 49 89 47 20 48 8b
All code
========
   0:	95                   	xchg   %eax,%ebp
   1:	fb                   	sti    
   2:	ff 48 c7             	decl   -0x39(%rax)
   5:	45 c8 00 00 00       	rex.RB enterq $0x0,$0x0
   a:	00 9c 8f 45 c8 f7 45 	add    %bl,0x45f7c845(%rdi,%rcx,4)
  11:	c8 00 02 00          	enterq $0x200,$0x0
  15:	00 0f                	add    %cl,(%rdi)
  17:	85 90 00 00 00 f7    	test   %edx,-0x9000000(%rax)
  1d:	45 98                	rex.RB cwtl 
  1f:	00 02                	add    %al,(%rdx)
  21:	00 00                	add    %al,(%rax)
  23:	74 01                	je     0x26
  25:	fb                   	sti    
  26:	48 8b 43 30          	mov    0x30(%rbx),%rax
  2a:*	49 89 47 30          	mov    %rax,0x30(%r15)		<-- trapping instruction
  2e:	48 8b 43 28          	mov    0x28(%rbx),%rax
  32:	49 89 47 28          	mov    %rax,0x28(%r15)
  36:	48 8b 43 20          	mov    0x20(%rbx),%rax
  3a:	49 89 47 20          	mov    %rax,0x20(%r15)
  3e:	48                   	rex.W
  3f:	8b                   	.byte 0x8b

Code starting with the faulting instruction
===========================================
   0:	49 89 47 30          	mov    %rax,0x30(%r15)
   4:	48 8b 43 28          	mov    0x28(%rbx),%rax
   8:	49 89 47 28          	mov    %rax,0x28(%r15)
   c:	48 8b 43 20          	mov    0x20(%rbx),%rax
  10:	49 89 47 20          	mov    %rax,0x20(%r15)
  14:	48                   	rex.W
  15:	8b                   	.byte 0x8b
[  595.285406][    C0] RSP: 0000:ffffc90001a378a0 EFLAGS: 00000206
[  595.286469][    C0] RAX: 0000231c0000231a RBX: ffff88816bcc60e0 RCX: 0000000000000000
[  595.287838][    C0] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[  595.289253][    C0] RBP: ffffc90001a37918 R08: 0000000000000000 R09: 0000000000292c74
[  595.290514][    C0] R10: 0001ffffffffff00 R11: 0000000000000000 R12: 0000000000000008
[  595.291885][    C0] R13: 0000000000000000 R14: ffff88816bcc60b0 R15: ffff88816bcc4798
[ 595.293348][ C0] ? __list_del_entry_valid (lib/list_debug.c:45) 
[ 595.294280][ C0] __tsan_read_write8 (kernel/kcsan/core.c:? kernel/kcsan/core.c:1014) 
[ 595.295129][ C0] __list_del_entry_valid (lib/list_debug.c:45) 
[ 595.296047][ C0] __rmqueue_pcplist (include/linux/list.h:134 include/linux/list.h:148 mm/page_alloc.c:3574) 
[ 595.296969][ C0] ? validate_chain (kernel/locking/lockdep.c:3696 kernel/locking/lockdep.c:3716 kernel/locking/lockdep.c:3771) 
[ 595.297812][ C0] get_page_from_freelist (mm/page_alloc.c:3603 mm/page_alloc.c:3631 mm/page_alloc.c:4096) 
[ 595.298786][ C0] ? __cond_resched (arch/x86/include/asm/preempt.h:103 kernel/sched/core.c:8153) 
[ 595.299605][ C0] ? prepare_alloc_pages (include/linux/mmzone.h:1194 include/linux/mmzone.h:1220 mm/page_alloc.c:5115) 
[ 595.300540][ C0] __alloc_pages (mm/page_alloc.c:?) 
[ 595.301395][ C0] ? wp_page_copy (include/linux/mmu_notifier.h:491 mm/memory.c:3125) 
[ 595.302244][ C0] ? rcu_read_lock_sched_held (include/linux/lockdep.h:283 kernel/rcu/update.c:125) 
[ 595.303370][ C0] ? __lock_acquire (kernel/locking/lockdep.c:?) 
[ 595.304245][ C0] wp_page_copy (include/linux/gfp.h:? include/linux/gfp.h:595 include/linux/gfp.h:609 mm/memory.c:3018) 
[ 595.305098][ C0] ? rcu_read_lock_sched_held (include/linux/lockdep.h:283 kernel/rcu/update.c:125) 
[ 595.306107][ C0] ? handle_mm_fault (include/linux/spinlock.h:? mm/memory.c:3317 mm/memory.c:4586 mm/memory.c:4704 mm/memory.c:4802) 
[ 595.307071][ C0] ? do_raw_spin_unlock (include/linux/spinlock_up.h:48 kernel/locking/spinlock_debug.c:141) 
[ 595.308065][ C0] handle_mm_fault (mm/memory.c:3318 mm/memory.c:4586 mm/memory.c:4704 mm/memory.c:4802) 
[ 595.308924][ C0] do_user_addr_fault (arch/x86/mm/fault.c:?) 
[ 595.309774][ C0] exc_page_fault (arch/x86/include/asm/irqflags.h:22 arch/x86/include/asm/irqflags.h:70 arch/x86/include/asm/irqflags.h:130 arch/x86/mm/fault.c:1492 arch/x86/mm/fault.c:1540) 
[ 595.310587][ C0] ? asm_exc_page_fault (??:?) 
[ 595.311378][ C0] asm_exc_page_fault (??:?) 
[  595.312204][    C0] RIP: 0033:0x42e32f
[ 595.312952][ C0] Code: 2e 0f 1f 84 00 00 00 00 00 66 90 53 48 89 fb 48 8b 3f 48 85 ff 74 05 e8 ff cb fe ff 8b 05 31 9c 2b 00 39 05 2f 9c 2b 00 7d 61 <c6> 03 df c6 43 01 df c6 43 02 df c6 43 03 df c6 43 04 df c6 43 05
All code
========
   0:	2e 0f 1f 84 00 00 00 	nopl   %cs:0x0(%rax,%rax,1)
   7:	00 00 
   9:	66 90                	xchg   %ax,%ax
   b:	53                   	push   %rbx
   c:	48 89 fb             	mov    %rdi,%rbx
   f:	48 8b 3f             	mov    (%rdi),%rdi
  12:	48 85 ff             	test   %rdi,%rdi
  15:	74 05                	je     0x1c
  17:	e8 ff cb fe ff       	callq  0xfffffffffffecc1b
  1c:	8b 05 31 9c 2b 00    	mov    0x2b9c31(%rip),%eax        # 0x2b9c53
  22:	39 05 2f 9c 2b 00    	cmp    %eax,0x2b9c2f(%rip)        # 0x2b9c57
  28:	7d 61                	jge    0x8b
  2a:*	c6 03 df             	movb   $0xdf,(%rbx)		<-- trapping instruction
  2d:	c6 43 01 df          	movb   $0xdf,0x1(%rbx)
  31:	c6 43 02 df          	movb   $0xdf,0x2(%rbx)
  35:	c6 43 03 df          	movb   $0xdf,0x3(%rbx)
  39:	c6 43 04 df          	movb   $0xdf,0x4(%rbx)
  3d:	c6                   	.byte 0xc6
  3e:	43                   	rex.XB
  3f:	05                   	.byte 0x5

Code starting with the faulting instruction
===========================================
   0:	c6 03 df             	movb   $0xdf,(%rbx)
   3:	c6 43 01 df          	movb   $0xdf,0x1(%rbx)
   7:	c6 43 02 df          	movb   $0xdf,0x2(%rbx)
   b:	c6 43 03 df          	movb   $0xdf,0x3(%rbx)
   f:	c6 43 04 df          	movb   $0xdf,0x4(%rbx)
  13:	c6                   	.byte 0xc6
  14:	43                   	rex.XB
  15:	05                   	.byte 0x5
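
The trace above can be read as a re-entrancy report: the allocator frames appear once inside the <IRQ> section and once more in the interrupted <TASK> section. The sketch below is only an illustration of that reading, not part of the robot's tooling. It parses an abridged excerpt of the log lines quoted above (trailing whitespace dropped, most frames omitted), skips the speculative "?" frames, and lists the functions present in both contexts. The helper name `frames_by_context` and both regular expressions are assumptions made for this example.

```python
import re

# Abridged excerpt of the trace quoted above; only a few frames are kept.
TRACE = """\
[  595.245358][    C0]  <IRQ>
[ 595.249033][ C0] _raw_spin_trylock (include/linux/spinlock_api_smp.h:89 kernel/locking/spinlock.c:138)
[ 595.249903][ C0] __rmqueue_pcplist (include/linux/spinlock.h:359 mm/page_alloc.c:3547)
[ 595.253620][ C0] ? zone_watermark_fast (mm/page_alloc.c:3763 mm/page_alloc.c:3876)
[ 595.254595][ C0] get_page_from_freelist (mm/page_alloc.c:3603 mm/page_alloc.c:3631 mm/page_alloc.c:4096)
[ 595.258394][ C0] __alloc_pages (mm/page_alloc.c:?)
[  595.278890][    C0]  </IRQ>
[  595.279428][    C0]  <TASK>
[ 595.295129][ C0] __list_del_entry_valid (lib/list_debug.c:45)
[ 595.296047][ C0] __rmqueue_pcplist (include/linux/list.h:134 include/linux/list.h:148 mm/page_alloc.c:3574)
[ 595.297812][ C0] get_page_from_freelist (mm/page_alloc.c:3603 mm/page_alloc.c:3631 mm/page_alloc.c:4096)
[ 595.300540][ C0] __alloc_pages (mm/page_alloc.c:?)
"""

# "[ time][ Cn] rest" prefix used by every log line in the excerpt.
LINE = re.compile(r"^\[\s*[\d.]+\]\[\s*C\d+\]\s+(.*)$")
# "function (location location ...)" as printed for a decoded frame.
FRAME = re.compile(r"([\w.]+) \((.*)\)$")


def frames_by_context(log):
    """Map each section marker ('IRQ', 'TASK') to its (function, location) frames."""
    contexts = {}
    current = None
    for raw in log.splitlines():
        line = LINE.match(raw.strip())
        if not line:
            continue
        body = line.group(1).strip()
        marker = re.fullmatch(r"<(/?)(\w+)>", body)
        if marker:
            closing, name = marker.groups()
            current = None if closing else name
            if current:
                contexts.setdefault(current, [])
            continue
        if current is None or body.startswith("?"):
            continue  # outside a marked section, or a speculative "?" frame
        frame = FRAME.match(body)
        if frame:
            func, locations = frame.groups()
            # Keep the last listed location for each frame.
            contexts[current].append((func, locations.split()[-1]))
    return contexts


contexts = frames_by_context(TRACE)
irq = dict(contexts["IRQ"])
task = dict(contexts["TASK"])
reentered = [func for func in irq if func in task]

for func in reentered:
    print(f"{func}: IRQ at {irq[func]}, TASK at {task[func]}")
```

On this excerpt, `__rmqueue_pcplist` is reported in both contexts: at mm/page_alloc.c:3547 (the trylock) under <IRQ> and at mm/page_alloc.c:3574 (the list deletion) under <TASK>.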

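One plausible reading of the report, given that .owner matches the current task, .owner_cpu is 0, and the saved EFLAGS (00000206) has the interrupt flag set, is that the task was interrupted while it held the lock and the softirq then tried to take the same lock. The toy model below sketches that sequence in user space. It is an assumption-laden illustration, not kernel code: the class `UPDebugSpinlock`, the exception `SpinBug`, and the helper `take_page` are hypothetical names, and the real kernel prints a report instead of raising an exception.

```python
class SpinBug(Exception):
    """Stands in for the report printed by the kernel's spinlock debugging."""


class UPDebugSpinlock:
    """Toy model of a spinlock on a uniprocessor build with lock debugging."""

    def __init__(self):
        self.owner = None

    def trylock(self, who):
        if self.owner is not None:
            # With a single CPU no other CPU can hold the lock, so a failed
            # trylock means the current holder was interrupted mid-section.
            raise SpinBug(
                f"trylock failure on UP: held by {self.owner}, wanted by {who}"
            )
        self.owner = who
        return True

    def unlock(self):
        self.owner = None


def take_page(lock, who, interrupt=None):
    """Hypothetical stand-in for a per-CPU list operation guarded by the lock."""
    lock.trylock(who)
    try:
        if interrupt is not None:
            interrupt()  # models an interrupt arriving while the lock is held
        return "page"
    finally:
        lock.unlock()


lock = UPDebugSpinlock()


def softirq_alloc():
    # Models the network receive softirq allocating memory from interrupt context.
    return take_page(lock, "softirq")


try:
    take_page(lock, "task", interrupt=softirq_alloc)
except SpinBug as bug:
    report = str(bug)
else:
    report = None

print(report)
```

In this model the nested attempt fails only because interrupts stay enabled while the lock is held; without the `interrupt` callback, `take_page` completes normally.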

To reproduce:

        # build kernel
        cd linux
        cp config-5.17.0-04447-gd74b7e8fef5d .config
        make HOSTCC=clang-15 CC=clang-15 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage modules
        make HOSTCC=clang-15 CC=clang-15 ARCH=x86_64 INSTALL_MOD_PATH=<mod-install-dir> modules_install
        cd <mod-install-dir>
        find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached to this email

        # if you come across any failure that blocks the test,
        # please remove the ~/.lkp and /lkp directories to run from a clean state.



-- 
0-DAY CI Kernel Test Service
https://01.org/lkp



View attachment "config-5.17.0-04447-gd74b7e8fef5d" of type "text/plain" (132157 bytes)

View attachment "job-script" of type "text/plain" (4657 bytes)

Download attachment "dmesg.xz" of type "application/x-xz" (15800 bytes)
