lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20171007020723.lj2mitq2apgsyo5s@sasha-lappy>
Date:   Sat, 7 Oct 2017 02:07:26 +0000
From:   "Levin, Alexander (Sasha Levin)" <alexander.levin@...izon.com>
To:     Peter Zijlstra <peterz@...radead.org>
CC:     Sasha Levin <levinsasha928@...il.com>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        "linux-kernel@...r.kernel.org List" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 3/4] sched: WARN when migrating to an offline CPU

On Fri, Sep 29, 2017 at 01:11:26PM +0200, Peter Zijlstra wrote:
>I can't seem to trigger :-(
>
>Can you please run with the below patch and:
>
>  # echo 1 > /proc/sys/kernel/traceoff_on_warning

The call stack trace looks like so:

[ 2073.492089] Unregister pv shared memory for cpu 2
[ 2073.495414] NOHZ: local_softirq_pending 202
[ 2073.516166] ------------[ cut here ]------------
[ 2073.519047] WARNING: CPU: 2 PID: 24 at kernel/sched/core.c:1178 set_task_cpu (kernel/sched/core.c:1157) 
[ 2073.520317] Modules linked in:
[ 2073.520816] CPU: 2 PID: 24 Comm: migration/2 Not tainted 4.14.0-rc2-next-20170927+ #256
[ 2073.522046] task: ffff8801ab7cc040 task.stack: ffff8801ab7d8000
[ 2073.522963] RIP: 0010:set_task_cpu (??:?) 
[ 2073.523655] RSP: 0018:ffff8801ab807b20 EFLAGS: 00010096
[ 2073.524662] RAX: 0000000000000002 RBX: ffff880189810040 RCX: 1ffff100356f993a
[ 2073.525747] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
[ 2073.526845] RBP: ffff8801ab807b60 R08: 0000000000000000 R09: 0000000000000000
[ 2073.527940] R10: 0000000000000000 R11: 0000000000000000 R12: 00000000001e5900
[ 2073.529027] R13: 0000000000000006 R14: 0000000000000002 R15: ffff8801898100a4
[ 2073.530118] FS:  0000000000000000(0000) GS:ffff8801ab800000(0000) knlGS:0000000000000000
[ 2073.531556] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2073.532443] CR2: 00007ffd965f1be8 CR3: 0000000187ede001 CR4: 00000000000606a0
[ 2073.533546] Call Trace:
[ 2073.533944]  <IRQ>
[ 2073.534280] detach_task.isra.80 (kernel/sched/fair.c:6816) 
[ 2073.534939] load_balance (./include/linux/list.h:78 kernel/sched/fair.c:6902 kernel/sched/fair.c:8204) 
[ 2073.535543] ? find_busiest_group (kernel/sched/fair.c:8133) 
[ 2073.536252] rebalance_domains (kernel/sched/fair.c:8838) 
[ 2073.536903] ? pick_next_task_fair (kernel/sched/fair.c:8787) 
[ 2073.537628] ? check_preemption_disabled (lib/smp_processor_id.c:52) 
[ 2073.538395] run_rebalance_domains (kernel/sched/fair.c:9062) 
[ 2073.539096] ? check_preemption_disabled (lib/smp_processor_id.c:52) 
[ 2073.539850] __do_softirq (kernel/softirq.c:284 ./include/linux/jump_label.h:141 ./include/trace/events/irq.h:141 kernel/softirq.c:285) 
[ 2073.540424] ? check_preemption_disabled (lib/smp_processor_id.c:52) 
[ 2073.541189] irq_exit (kernel/softirq.c:364 kernel/softirq.c:405) 
[ 2073.541715] smp_apic_timer_interrupt (./arch/x86/include/asm/irq_regs.h:26 arch/x86/kernel/apic/apic.c:1043) 
[ 2073.542452] apic_timer_interrupt (arch/x86/entry/entry_64.S:770) 
[ 2073.543116]  </IRQ>
[ 2073.543463] RIP: 0010:multi_cpu_stop (??:?) 
[ 2073.544173] RSP: 0018:ffff8801ab7dfd88 EFLAGS: 00000292 ORIG_RAX: ffffffffffffff11
[ 2073.545326] RAX: 0000000000000007 RBX: 0000000000000004 RCX: 000000000000000a
[ 2073.546406] RDX: 0000000000000000 RSI: ffffffff9c46eee0 RDI: 0000000000000292
[ 2073.547503] RBP: ffff8801ab7dfde0 R08: 0000000000000000 R09: 0000000000000000
[ 2073.548587] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88012aaf7890
[ 2073.549668] R13: ffff88012aaf78b4 R14: dffffc0000000000 R15: 0000000000000003
[ 2073.550765] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63) 
[ 2073.551501] ? cpu_stop_queue_work (kernel/stop_machine.c:176) 
[ 2073.552195] cpu_stopper_thread (kernel/stop_machine.c:480) 
[ 2073.552845] ? cpu_stop_create (kernel/stop_machine.c:458) 
[ 2073.553462] smpboot_thread_fn (kernel/smpboot.c:164) 
[ 2073.554104] ? sort_range (kernel/smpboot.c:107) 
[ 2073.554671] ? schedule (./arch/x86/include/asm/bitops.h:324 (discriminator 1) ./include/linux/thread_info.h:79 (discriminator 1) ./include/linux/sched.h:1605 (discriminator 1) kernel/sched/core.c:3435 (discriminator 1)) 
[ 2073.555230] ? __kthread_parkme (kernel/kthread.c:188) 
[ 2073.555892] kthread (kernel/kthread.c:242) 
[ 2073.556408] ? sort_range (kernel/smpboot.c:107) 
[ 2073.556963] ? kthread_create_on_node (kernel/kthread.c:198) 
[ 2073.557693] ret_from_fork (arch/x86/entry/entry_64.S:437) 
[ 2073.558261] Code: 09 84 d2 74 05 e8 84 3a 4b 00 f7 83 84 00 00 00 fd ff ff ff 0f 84 a0 f9 ff ff 0f ff e9 99 f9 ff ff e8 78 18 ff ff e9 c7 fd ff ff <0f> ff e9 ea f9 ff ff e8 67 18 ff ff e9 f5 f9 ff ff e8 fd 85 0f 

All code
========
   0:	09 84 d2 74 05 e8 84 	or     %eax,-0x7b17fa8c(%rdx,%rdx,8)
   7:	3a 4b 00             	cmp    0x0(%rbx),%cl
   a:	f7 83 84 00 00 00 fd 	testl  $0xfffffffd,0x84(%rbx)
  11:	ff ff ff 
  14:	0f 84 a0 f9 ff ff    	je     0xfffffffffffff9ba
  1a:	0f ff                	(bad)  
  1c:	e9 99 f9 ff ff       	jmpq   0xfffffffffffff9ba
  21:	e8 78 18 ff ff       	callq  0xffffffffffff189e
  26:	e9 c7 fd ff ff       	jmpq   0xfffffffffffffdf2
  2b:*	0f ff                	(bad)  		<-- trapping instruction
  2d:	e9 ea f9 ff ff       	jmpq   0xfffffffffffffa1c
  32:	e8 67 18 ff ff       	callq  0xffffffffffff189e
  37:	e9 f5 f9 ff ff       	jmpq   0xfffffffffffffa31
  3c:	e8 fd 85 0f 00       	callq  0xf863e

Code starting with the faulting instruction
===========================================
   0:	0f ff                	(bad)  
   2:	e9 ea f9 ff ff       	jmpq   0xfffffffffffff9f1
   7:	e8 67 18 ff ff       	callq  0xffffffffffff1873
   c:	e9 f5 f9 ff ff       	jmpq   0xfffffffffffffa06
  11:	e8 fd 85 0f 00       	callq  0xf8613
[ 2073.561289] ---[ end trace 88455b894f38cbc5 ]---

And quite a few lines of your added trace (lmk if you need more, or all):

<idle>-0     [000] ..s2   104.829161: load_balance: dst_cpu: 6 cpus: 0-1,3-4,6-7,9
<idle>-0     [000] ..s1   104.829172: rebalance_domains: rcu-read-unlock: 6
<idle>-0     [000] ..s1   104.829177: rebalance_domains: rcu-read-lock: 0
<idle>-0     [000] ..s1   104.829179: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
<idle>-0     [000] ..s1   104.829183: rebalance_domains: rcu-read-unlock: 0
<idle>-0     [001] .Ns1   104.829313: rebalance_domains: rcu-read-lock: 1
<idle>-0     [001] .Ns1   104.829317: rebalance_domains: rcu-read-unlock: 1
<idle>-0     [006] .Ns1   104.829446: rebalance_domains: rcu-read-lock: 6
<idle>-0     [006] .Ns1   104.829450: rebalance_domains: rcu-read-unlock: 6
trinity-c28-3303  [006] d..1   104.829663: pick_next_task_fair: rcu-read-lock: 6
trinity-c28-3303  [006] d..1   104.829667: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
trinity-c28-3303  [006] d..1   104.829671: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
trinity-c28-3303  [006] d..1   104.829702: pick_next_task_fair: rcu-read-unlock: 6
kdevtmpfs-72    [001] d..1   104.830079: pick_next_task_fair: rcu-read-lock: 1
kdevtmpfs-72    [001] d..1   104.830083: load_balance: dst_cpu: 1 cpus: 1,9
kdevtmpfs-72    [001] d..1   104.830087: load_balance: dst_cpu: 1 cpus: 1,9
kdevtmpfs-72    [001] d..1   104.830097: pick_next_task_fair: rcu-read-unlock: 1
trinity-c37-3313  [000] d..1   104.830218: pick_next_task_fair: rcu-read-lock: 0
trinity-c37-3313  [000] d..1   104.830221: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
trinity-c37-3313  [000] d..1   104.830231: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
trinity-c37-3313  [000] d..1   104.830237: pick_next_task_fair: rcu-read-unlock: 0
trinity-c28-3303  [006] d..1   104.830912: pick_next_task_fair: rcu-read-lock: 6
trinity-c28-3303  [006] d..1   104.830916: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
trinity-c28-3303  [006] d..1   104.830919: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
trinity-c28-3303  [006] d..1   104.830950: pick_next_task_fair: rcu-read-unlock: 6
kworker/u26:2-413   [005] ..s1   104.831066: rebalance_domains: rcu-read-lock: 5
kworker/u26:2-413   [005] ..s1   104.831071: rebalance_domains: rcu-read-unlock: 5
rcu_preempt-9     [006] d..1   104.831127: pick_next_task_fair: rcu-read-lock: 6
rcu_preempt-9     [006] d..1   104.831130: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
rcu_preempt-9     [006] d..1   104.831133: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
rcu_preempt-9     [006] d..1   104.831138: pick_next_task_fair: rcu-read-unlock: 6
ksoftirqd/6-51    [006] d..1   104.831284: pick_next_task_fair: rcu-read-lock: 6
ksoftirqd/6-51    [006] d..1   104.831287: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
ksoftirqd/6-51    [006] d..1   104.831290: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
ksoftirqd/6-51    [006] d..1   104.831307: pick_next_task_fair: rcu-read-unlock: 6
kworker/u26:2-5519  [005] d..1   104.831458: pick_next_task_fair: rcu-read-lock: 5
kworker/u26:2-5519  [005] d..1   104.831462: load_balance: dst_cpu: 5 cpus: 1,9
kworker/u26:2-5519  [005] d..1   104.831473: load_balance: dst_cpu: 5 cpus: 1,9
kworker/u26:2-5519  [005] d..1   104.831479: pick_next_task_fair: rcu-read-unlock: 5
kworker/u26:2-5519  [005] d..1   104.831796: pick_next_task_fair: rcu-read-lock: 5
kworker/u26:2-5519  [005] d..1   104.831799: load_balance: dst_cpu: 5 cpus: 1,9
kworker/u26:2-5519  [005] dN.1   104.831823: pick_next_task_fair: rcu-read-unlock: 5
ksoftirqd/1-18    [001] d..1   104.831914: pick_next_task_fair: rcu-read-lock: 1
ksoftirqd/1-18    [001] d..1   104.831918: load_balance: dst_cpu: 1 cpus: 1,9
ksoftirqd/1-18    [001] d..1   104.831928: load_balance: dst_cpu: 1 cpus: 1,9
ksoftirqd/1-18    [001] d..1   104.831934: pick_next_task_fair: rcu-read-unlock: 1
kworker/u26:2-5519  [005] ..s2   104.832087: rebalance_domains: rcu-read-lock: 5
kworker/u26:2-5519  [005] ..s2   104.832091: rebalance_domains: rcu-read-unlock: 5
trinity-c37-3313  [000] ..s2   104.832097: rebalance_domains: rcu-read-lock: 0
trinity-c37-3313  [000] ..s2   104.832101: rebalance_domains: rcu-read-unlock: 0
trinity-c37-3313  [000] d..1   104.832188: pick_next_task_fair: rcu-read-lock: 0
trinity-c37-3313  [000] d..1   104.832190: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
trinity-c37-3313  [000] d..1   104.832195: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
trinity-c37-3313  [000] d..1   104.832206: pick_next_task_fair: rcu-read-unlock: 0
kworker/u26:2-5519  [005] d..1   104.832226: pick_next_task_fair: rcu-read-lock: 5
kworker/u26:2-5519  [005] d..1   104.832230: load_balance: dst_cpu: 5 cpus: 1,9
kworker/u26:2-5519  [005] d..1   104.832239: load_balance: dst_cpu: 5 cpus: 1,9
kworker/u26:2-5519  [005] d..1   104.832244: pick_next_task_fair: rcu-read-unlock: 5
trinity-c12-5517  [008] d..1   104.832468: pick_next_task_fair: rcu-read-lock: 8
trinity-c12-5517  [008] d..1   104.832472: load_balance: dst_cpu: 8 cpus: 0-1,3-4,6-7,9
trinity-c12-5517  [008] d..1   104.832479: load_balance: dst_cpu: 8 cpus: 0-1,3-4,6-7,9
kworker/u26:2-5519  [005] d..1   104.832499: pick_next_task_fair: rcu-read-lock: 5
kworker/u26:2-5519  [005] d..1   104.832502: load_balance: dst_cpu: 5 cpus: 1,9
kworker/u26:2-5519  [005] d..1   104.832506: load_balance: dst_cpu: 5 cpus: 1,9
kworker/u26:2-5519  [005] d..1   104.832520: pick_next_task_fair: rcu-read-unlock: 5
trinity-c28-3303  [006] d..1   104.832533: pick_next_task_fair: rcu-read-lock: 6
trinity-c28-3303  [006] d..1   104.832537: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
trinity-c12-5517  [008] d..1   104.832537: pick_next_task_fair: rcu-read-unlock: 8
trinity-c28-3303  [006] d..1   104.832540: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
trinity-c28-3303  [006] d..1   104.832550: pick_next_task_fair: rcu-read-unlock: 6
ksoftirqd/8-63    [008] d..1   104.832790: pick_next_task_fair: rcu-read-lock: 8
ksoftirqd/8-63    [008] d..1   104.832793: load_balance: dst_cpu: 8 cpus: 0-1,3-4,6-7,9
ksoftirqd/8-63    [008] d..1   104.832797: load_balance: dst_cpu: 8 cpus: 0-1,3-4,6-7,9
ksoftirqd/8-63    [008] d..1   104.832809: pick_next_task_fair: rcu-read-unlock: 8
<idle>-0     [005] .Ns1   104.833083: rebalance_domains: rcu-read-lock: 5
<idle>-0     [005] .Ns1   104.833087: rebalance_domains: rcu-read-unlock: 5
trinity-c78-4927  [000] d..1   104.833225: pick_next_task_fair: rcu-read-lock: 0
trinity-c78-4927  [000] d..1   104.833228: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
trinity-c78-4927  [000] d..1   104.833239: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
trinity-c78-4927  [000] d..1   104.833260: pick_next_task_fair: rcu-read-unlock: 0
kworker/u26:2-5519  [005] d..1   104.833545: pick_next_task_fair: rcu-read-lock: 5
kworker/u26:2-5519  [005] d..1   104.833548: load_balance: dst_cpu: 5 cpus: 1,9
kworker/u26:2-5519  [005] d..1   104.833785: load_balance: dst_cpu: 5 cpus: 1,9
kworker/u26:2-5519  [005] d..1   104.833799: pick_next_task_fair: rcu-read-unlock: 5
trinity-c28-3303  [006] ..s1   104.834090: rebalance_domains: rcu-read-lock: 6
trinity-c28-3303  [006] ..s1   104.834094: rebalance_domains: rcu-read-unlock: 6
rcu_preempt-9     [006] d..1   104.834193: pick_next_task_fair: rcu-read-lock: 6
rcu_preempt-9     [006] d..1   104.834196: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
rcu_preempt-9     [006] d..1   104.834200: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
rcu_preempt-9     [006] d..1   104.834210: pick_next_task_fair: rcu-read-unlock: 6
ksoftirqd/6-51    [006] d..1   104.834309: pick_next_task_fair: rcu-read-lock: 6
ksoftirqd/6-51    [006] d..1   104.834312: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
ksoftirqd/6-51    [006] d..1   104.834315: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
ksoftirqd/6-51    [006] d..1   104.834323: pick_next_task_fair: rcu-read-unlock: 6
<idle>-0     [000] .Ns1   104.834449: rebalance_domains: rcu-read-lock: 0
<idle>-0     [000] .Ns1   104.834452: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
<idle>-0     [000] .Ns2   104.834457: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
<idle>-0     [000] .Ns1   104.834459: rebalance_domains: rcu-read-unlock: 0
trinity-c37-3313  [000] d..1   104.834761: pick_next_task_fair: rcu-read-lock: 0
trinity-c37-3313  [000] d..1   104.834764: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
trinity-c37-3313  [000] d..1   104.834769: load_balance: dst_cpu: 0 cpus: 0-1,3-4,6-7,9
trinity-c37-3313  [000] d..1   104.834780: pick_next_task_fair: rcu-read-unlock: 0
trinity-c79-3355  [005] .Ns1   104.835721: rebalance_domains: rcu-read-lock: 5
trinity-c79-3355  [005] .Ns1   104.835725: rebalance_domains: rcu-read-unlock: 5
<idle>-0     [008] .Ns1   104.835746: rebalance_domains: rcu-read-lock: 8
<idle>-0     [008] .Ns1   104.835751: load_balance: dst_cpu: 8 cpus: 0-1,3-4,6-7,9
<idle>-0     [008] .Ns2   104.835759: load_balance: dst_cpu: 8 cpus: 0-1,3-4,6-7,9
<idle>-0     [008] .Ns1   104.835761: rebalance_domains: rcu-read-unlock: 8
trinity-c28-3303  [006] d..1   104.835817: pick_next_task_fair: rcu-read-lock: 6
trinity-c28-3303  [006] d..1   104.835820: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
trinity-c28-3303  [006] d..1   104.835824: load_balance: dst_cpu: 6 cpus: 1,3-4,6-7,9
trinity-c28-3303  [006] d..1   104.835829: pick_next_task_fair: rcu-read-unlock: 6
rcu_sched-10    [008] d..1   104.835871: pick_next_task_fair: rcu-read-lock: 8
rcu_sched-10    [008] d..1   104.835874: load_balance: dst_cpu: 8 cpus: 0-1,3-4,6-7,9
rcu_sched-10    [008] d..1   104.835879: load_balance: dst_cpu: 8 cpus: 0-1,3-4,6-7,9
rcu_sched-10    [008] d..1   104.835915: pick_next_task_fair: rcu-read-unlock: 8
trinity-c63-3339  [008] d..1   104.836114: pick_next_task_fair: rcu-read-lock: 8
trinity-c63-3339  [008] d..1   104.836117: load_balance: dst_cpu: 8 cpus: 0-1,3-4,6-7,9
trinity-c63-3339  [008] d..1   104.836122: load_balance: dst_cpu: 8 cpus: 0-1,3-4,6-7,9
trinity-c63-3339  [008] d..1   104.836133: pick_next_task_fair: rcu-read-unlock: 8
ksoftirqd/1-18    [001] ..s.   104.836352: rebalance_domains: rcu-read-lock: 1
ksoftirqd/1-18    [001] ..s.   104.836356: rebalance_domains: rcu-read-unlock: 1
<idle>-0     [000] .Ns1   104.836602: rebalance_domains: rcu-read-lock: 0
<idle>-0     [000] .Ns1   104.836606: rebalance_domains: rcu-read-unlock: 0
migration/0-12    [000] ..s1   104.837059: rebalance_domains: rcu-read-lock: 0
migration/0-12    [000] ..s1   104.837063: rebalance_domains: rcu-read-unlock: 0
migration/6-50    [006] ..s2   104.837087: rebalance_domains: rcu-read-lock: 6
migration/6-50    [006] ..s2   104.837092: rebalance_domains: rcu-read-unlock: 6
<idle>-0     [001] .Ns1   104.843088: rebalance_domains: rcu-read-lock: 1
<idle>-0     [001] .Ns1   104.843093: rebalance_domains: rcu-read-unlock: 1
<idle>-0     [008] .Ns1   104.843269: rebalance_domains: rcu-read-lock: 8
<idle>-0     [008] .Ns1   104.843273: rebalance_domains: rcu-read-unlock: 8
migration/2-24    [002] ..s1   104.855994: rebalance_domains: rcu-read-lock: 2
migration/2-24    [002] ..s1   104.856000: load_balance: dst_cpu: 2 cpus: 6

-- 

Thanks,
Sasha

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ