lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <202407291328.761441a7-oliver.sang@intel.com>
Date: Mon, 29 Jul 2024 14:14:52 +0800
From: kernel test robot <oliver.sang@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
CC: <oe-lkp@...ts.linux.dev>, <lkp@...el.com>, <linux-kernel@...r.kernel.org>,
	<aubrey.li@...ux.intel.com>, <yu.c.chen@...el.com>, <oliver.sang@...el.com>
Subject: [peterz-queue:sched/prep] [sched/fair]  124c8f4374:
 WARNING:at_kernel/sched/sched.h:#update_load_avg



Hello,

kernel test robot noticed "WARNING:at_kernel/sched/sched.h:#update_load_avg" on:

commit: 124c8f43740f3f1e3e2aeecd976bcf5ddd7bd435 ("sched/fair: Implement delayed dequeue")
https://git.kernel.org/cgit/linux/kernel/git/peterz/queue.git sched/prep

in testcase: boot

compiler: clang-18
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


we noticed the issue does not always happen, 60 times out of 150 runs as below.
but the parent keeps clean.

a1133ac6dc7c6bde 124c8f43740f3f1e3e2aeecd976
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :150         40%          60:150   dmesg.WARNING:at_kernel/sched/sched.h:#update_load_avg



If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@...el.com>
| Closes: https://lore.kernel.org/oe-lkp/202407291328.761441a7-oliver.sang@intel.com


[   13.522527][    C0] ------------[ cut here ]------------
[   13.523046][    C0] rq->clock_update_flags < RQCF_ACT_SKIP
[ 13.523055][ C0] WARNING: CPU: 0 PID: 158 at kernel/sched/sched.h:1550 update_load_avg (kernel/sched/sched.h:1550 kernel/sched/pelt.h:67 kernel/sched/pelt.h:174 kernel/sched/fair.c:4747) 
[   13.524265][    C0] Modules linked in: drm fuse ip_tables
[   13.524722][    C0] CPU: 0 UID: 0 PID: 158 Comm: openipmi Not tainted 6.10.0-12745-g124c8f43740f #1
[   13.525464][    C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 13.526307][ C0] RIP: 0010:update_load_avg (kernel/sched/sched.h:1550 kernel/sched/pelt.h:67 kernel/sched/pelt.h:174 kernel/sched/fair.c:4747) 
[ 13.526793][ C0] Code: 00 4c 2b bb 88 01 00 00 40 f6 c5 02 0f 84 b7 fa ff ff e9 ca fa ff ff c6 05 b9 1b e4 01 01 48 c7 c7 34 74 d6 a6 e8 a1 c7 fa ff <0f> 0b e9 78 fa ff ff 65 8b 0d 8f 83 91 5a 89 c9 48 0f a3 0d e9 7b
All code
========
   0:	00 4c 2b bb          	add    %cl,-0x45(%rbx,%rbp,1)
   4:	88 01                	mov    %al,(%rcx)
   6:	00 00                	add    %al,(%rax)
   8:	40 f6 c5 02          	test   $0x2,%bpl
   c:	0f 84 b7 fa ff ff    	je     0xfffffffffffffac9
  12:	e9 ca fa ff ff       	jmpq   0xfffffffffffffae1
  17:	c6 05 b9 1b e4 01 01 	movb   $0x1,0x1e41bb9(%rip)        # 0x1e41bd7
  1e:	48 c7 c7 34 74 d6 a6 	mov    $0xffffffffa6d67434,%rdi
  25:	e8 a1 c7 fa ff       	callq  0xfffffffffffac7cb
  2a:*	0f 0b                	ud2    		<-- trapping instruction
  2c:	e9 78 fa ff ff       	jmpq   0xfffffffffffffaa9
  31:	65 8b 0d 8f 83 91 5a 	mov    %gs:0x5a91838f(%rip),%ecx        # 0x5a9183c7
  38:	89 c9                	mov    %ecx,%ecx
  3a:	48                   	rex.W
  3b:	0f                   	.byte 0xf
  3c:	a3                   	.byte 0xa3
  3d:	0d                   	.byte 0xd
  3e:	e9                   	.byte 0xe9
  3f:	7b                   	.byte 0x7b

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2    
   2:	e9 78 fa ff ff       	jmpq   0xfffffffffffffa7f
   7:	65 8b 0d 8f 83 91 5a 	mov    %gs:0x5a91838f(%rip),%ecx        # 0x5a91839d
   e:	89 c9                	mov    %ecx,%ecx
  10:	48                   	rex.W
  11:	0f                   	.byte 0xf
  12:	a3                   	.byte 0xa3
  13:	0d                   	.byte 0xd
  14:	e9                   	.byte 0xe9
  15:	7b                   	.byte 0x7b
[   13.528234][    C0] RSP: 0000:ffffbaef40003d90 EFLAGS: 00010046
[   13.528723][    C0] RAX: b53834d2850b7b00 RBX: ffff9ef2efd34640 RCX: 0000000000000027
[   13.529379][    C0] RDX: 0000000000000002 RSI: 00000000ffff7fff RDI: ffff9ef2efc20b88
[   13.530029][    C0] RBP: 0000000000000001 R08: 0000000000007fff R09: ffffffffa7053610
[   13.530685][    C0] R10: 0000000000017ffd R11: 0000000000000004 R12: ffff9ef2efd345c0
[   13.531344][    C0] R13: ffff9ef2efd34640 R14: ffff9eefc0e26400 R15: 0000000000000000
[   13.532001][    C0] FS:  0000000000000000(0000) GS:ffff9ef2efc00000(0063) knlGS:00000000f7f5f180
[   13.532720][    C0] CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
[   13.533242][    C0] CR2: 00000000577c51ac CR3: 0000000162e6c000 CR4: 00000000000406f0
[   13.533891][    C0] Call Trace:
[   13.534215][    C0]  <IRQ>
[ 13.534504][ C0] ? __warn (kernel/panic.c:240 kernel/panic.c:735) 
[ 13.534863][ C0] ? update_load_avg (kernel/sched/sched.h:1550 kernel/sched/pelt.h:67 kernel/sched/pelt.h:174 kernel/sched/fair.c:4747) 
[ 13.535278][ C0] ? report_bug (lib/bug.c:? lib/bug.c:219) 
[ 13.535695][ C0] ? handle_bug (arch/x86/kernel/traps.c:239) 
[ 13.536079][ C0] ? exc_invalid_op (arch/x86/kernel/traps.c:260) 
[ 13.536498][ C0] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621) 
[ 13.536920][ C0] ? update_load_avg (kernel/sched/sched.h:1550 kernel/sched/pelt.h:67 kernel/sched/pelt.h:174 kernel/sched/fair.c:4747) 
[ 13.537334][ C0] dequeue_entity (kernel/sched/sched.h:817 kernel/sched/fair.c:5429) 
[ 13.537729][ C0] dequeue_entities (kernel/sched/fair.c:6972) 
[ 13.538130][ C0] unregister_fair_sched_group (kernel/sched/fair.c:379 kernel/sched/fair.c:13159) 
[ 13.554647][ C0] ? __pfx_sched_unregister_group_rcu (kernel/sched/core.c:8759) 
[ 13.555167][ C0] sched_unregister_group_rcu (kernel/sched/core.c:8707 kernel/sched/core.c:8761) 
[ 13.555621][ C0] rcu_do_batch (arch/x86/include/asm/preempt.h:26 kernel/rcu/tree.c:2576) 
[ 13.556008][ C0] rcu_core (kernel/rcu/tree.c:2845) 
[ 13.556373][ C0] handle_softirqs (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 include/trace/events/irq.h:142 kernel/softirq.c:555) 
[ 13.556775][ C0] __irq_exit_rcu (kernel/softirq.c:617 kernel/softirq.c:639) 
[ 13.557166][ C0] sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1043 arch/x86/kernel/apic/apic.c:1043) 
[   13.557638][    C0]  </IRQ>
[   13.557920][    C0]  <TASK>
[ 13.560463][ C0] asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:702) 
[ 13.560957][ C0] RIP: 0010:irqentry_exit_to_user_mode (kernel/entry/common.c:? include/linux/entry-common.h:328 kernel/entry/common.c:231) 
[ 13.561465][ C0] Code: 90 41 57 41 56 53 48 89 fb e8 63 90 1f ff 65 4c 8b 34 25 40 37 03 00 49 8b 3e f7 c7 0e 30 02 00 74 7c 49 89 ff e8 97 8f 1f ff <4c> 89 f8 a8 08 75 38 a9 00 10 00 00 75 40 a9 00 20 00 00 75 4b a9
All code
========
   0:	90                   	nop
   1:	41 57                	push   %r15
   3:	41 56                	push   %r14
   5:	53                   	push   %rbx
   6:	48 89 fb             	mov    %rdi,%rbx
   9:	e8 63 90 1f ff       	callq  0xffffffffff1f9071
   e:	65 4c 8b 34 25 40 37 	mov    %gs:0x33740,%r14
  15:	03 00 
  17:	49 8b 3e             	mov    (%r14),%rdi
  1a:	f7 c7 0e 30 02 00    	test   $0x2300e,%edi
  20:	74 7c                	je     0x9e
  22:	49 89 ff             	mov    %rdi,%r15
  25:	e8 97 8f 1f ff       	callq  0xffffffffff1f8fc1
  2a:*	4c 89 f8             	mov    %r15,%rax		<-- trapping instruction
  2d:	a8 08                	test   $0x8,%al
  2f:	75 38                	jne    0x69
  31:	a9 00 10 00 00       	test   $0x1000,%eax
  36:	75 40                	jne    0x78
  38:	a9 00 20 00 00       	test   $0x2000,%eax
  3d:	75 4b                	jne    0x8a
  3f:	a9                   	.byte 0xa9

Code starting with the faulting instruction
===========================================
   0:	4c 89 f8             	mov    %r15,%rax
   3:	a8 08                	test   $0x8,%al
   5:	75 38                	jne    0x3f
   7:	a9 00 10 00 00       	test   $0x1000,%eax
   c:	75 40                	jne    0x4e
   e:	a9 00 20 00 00       	test   $0x2000,%eax
  13:	75 4b                	jne    0x60
  15:	a9                   	.byte 0xa9
[   13.562938][    C0] RSP: 0000:ffffbaef40613f38 EFLAGS: 00000202
[   13.563444][    C0] RAX: ffff9ef0287dd080 RBX: ffffbaef40613f58 RCX: 000000000000080b
[   13.564125][    C0] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000020000008
[   13.564796][    C0] RBP: 0000000000000000 R08: 000000000000002b R09: 000000000000002b
[   13.565456][    C0] R10: 0000000000000296 R11: 00000000ff8772ac R12: 0000000000000000
[   13.566115][    C0] R13: 0000000000000000 R14: ffff9ef0287dd080 R15: 0000000020000008
[ 13.566782][ C0] ? irqentry_exit_to_user_mode (kernel/entry/common.c:? include/linux/entry-common.h:328 kernel/entry/common.c:231) 
[ 13.567254][ C0] asm_sysvec_reschedule_ipi (arch/x86/include/asm/idtentry.h:707) 
[   13.567709][    C0] RIP: 0023:0x56652630
[ 13.568072][ C0] Code: ff ff 83 c4 10 85 c0 74 05 83 c4 08 5b c3 83 ec 0c 8d 83 8d 6b ff ff 50 e8 cd 7c ff ff 8d b4 26 00 00 00 00 8d b6 00 00 00 00 <55> 57 56 53 e8 e7 61 ff ff 81 c3 13 38 01 00 83 ec 0c 8b 44 24 20
All code
========
   0:	ff                   	(bad)  
   1:	ff 83 c4 10 85 c0    	incl   -0x3f7aef3c(%rbx)
   7:	74 05                	je     0xe
   9:	83 c4 08             	add    $0x8,%esp
   c:	5b                   	pop    %rbx
   d:	c3                   	retq   
   e:	83 ec 0c             	sub    $0xc,%esp
  11:	8d 83 8d 6b ff ff    	lea    -0x9473(%rbx),%eax
  17:	50                   	push   %rax
  18:	e8 cd 7c ff ff       	callq  0xffffffffffff7cea
  1d:	8d b4 26 00 00 00 00 	lea    0x0(%rsi,%riz,1),%esi
  24:	8d b6 00 00 00 00    	lea    0x0(%rsi),%esi
  2a:*	55                   	push   %rbp		<-- trapping instruction
  2b:	57                   	push   %rdi
  2c:	56                   	push   %rsi
  2d:	53                   	push   %rbx
  2e:	e8 e7 61 ff ff       	callq  0xffffffffffff621a
  33:	81 c3 13 38 01 00    	add    $0x13813,%ebx
  39:	83 ec 0c             	sub    $0xc,%esp
  3c:	8b 44 24 20          	mov    0x20(%rsp),%eax

Code starting with the faulting instruction
===========================================
   0:	55                   	push   %rbp
   1:	57                   	push   %rdi
   2:	56                   	push   %rsi
   3:	53                   	push   %rbx
   4:	e8 e7 61 ff ff       	callq  0xffffffffffff61f0
   9:	81 c3 13 38 01 00    	add    $0x13813,%ebx
   f:	83 ec 0c             	sub    $0xc,%esp
  12:	8b 44 24 20          	mov    0x20(%rsp),%eax


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20240729/202407291328.761441a7-oliver.sang@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ