Date:	Wed, 20 Jul 2016 21:04:03 -0500 (CDT)
From:	Christoph Lameter <cl@...ux.com>
To:	Chris Metcalf <cmetcalf@...lanox.com>
cc:	Gilad Ben Yossef <giladb@...lanox.com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ingo Molnar <mingo@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Rik van Riel <riel@...hat.com>, Tejun Heo <tj@...nel.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Viresh Kumar <viresh.kumar@...aro.org>,
	Catalin Marinas <catalin.marinas@....com>,
	Will Deacon <will.deacon@....com>,
	Andy Lutomirski <luto@...capital.net>,
	Daniel Lezcano <daniel.lezcano@...aro.org>,
	linux-doc@...r.kernel.org, linux-api@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v13 00/12] support "task_isolation" mode

We are testing the patchset on x86 and are getting strange backtraces
and aborts. It seems that the CPU just before the one we are running on
(cpu 12 in the traces below, with the test on cpu 13) queues an irq_work
event that causes a latency event on the next CPU.

This is weird. Is there a new round-robin IPI feature in the kernel that I
am not aware of?

Backtraces from dmesg:

[  956.603223] latencytest/7928: task_isolation mode lost due to irq_work
[  956.610817] cpu 12: irq_work violating task isolation for latencytest/7928 on cpu 13
[  956.619985] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 4.7.0-rc7-stream1 #1
[  956.628765] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 2.0.2 03/15/2016
[  956.637642]  0000000000000086 ce6735c7b39e7b81 ffff88103e783d00 ffffffff8134f6ff
[  956.646739]  ffff88102c50d700 000000000000000d ffff88103e783d28 ffffffff811986f4
[  956.655828]  ffff88102c50d700 ffff88203cf97f80 000000000000000d ffff88103e783d68
[  956.664924] Call Trace:
[  956.667945]  <IRQ>  [<ffffffff8134f6ff>] dump_stack+0x63/0x84
[  956.674740]  [<ffffffff811986f4>] task_isolation_debug_task+0xb4/0xd0
[  956.682229]  [<ffffffff810b4a13>] _task_isolation_debug+0x83/0xc0
[  956.689331]  [<ffffffff81179c0c>] irq_work_queue_on+0x9c/0x120
[  956.696142]  [<ffffffff811075e4>] tick_nohz_full_kick_cpu+0x44/0x50
[  956.703438]  [<ffffffff810b48d9>] wake_up_nohz_cpu+0x99/0x110
[  956.710150]  [<ffffffff810f57e1>] internal_add_timer+0x71/0xb0
[  956.716959]  [<ffffffff810f696b>] add_timer_on+0xbb/0x140
[  956.723283]  [<ffffffff81100ca0>] clocksource_watchdog+0x230/0x300
[  956.730480]  [<ffffffff81100a70>] ? __clocksource_unstable.isra.2+0x40/0x40
[  956.738555]  [<ffffffff810f5615>] call_timer_fn+0x35/0x120
[  956.744973]  [<ffffffff81100a70>] ? __clocksource_unstable.isra.2+0x40/0x40
[  956.753046]  [<ffffffff810f64cc>] run_timer_softirq+0x23c/0x2f0
[  956.759952]  [<ffffffff816d4397>] __do_softirq+0xd7/0x2c5
[  956.766272]  [<ffffffff81091245>] irq_exit+0xf5/0x100
[  956.772209]  [<ffffffff816d41d2>] smp_apic_timer_interrupt+0x42/0x50
[  956.779600]  [<ffffffff816d231c>] apic_timer_interrupt+0x8c/0xa0
[  956.786602]  <EOI>  [<ffffffff81569eb0>] ? poll_idle+0x40/0x80
[  956.793490]  [<ffffffff815697dc>] cpuidle_enter_state+0x9c/0x260
[  956.800498]  [<ffffffff815699d7>] cpuidle_enter+0x17/0x20
[  956.806810]  [<ffffffff810cf497>] cpu_startup_entry+0x2b7/0x3a0
[  956.813717]  [<ffffffff81050e6c>] start_secondary+0x15c/0x1a0
[ 1036.601758] cpu 12: irq_work violating task isolation for latencytest/8447 on cpu 13
[ 1036.610922] CPU: 12 PID: 0 Comm: swapper/12 Not tainted 4.7.0-rc7-stream1 #1
[ 1036.619692] Hardware name: Dell Inc. PowerEdge R630/0CNCJW, BIOS 2.0.2 03/15/2016
[ 1036.628551]  0000000000000086 ce6735c7b39e7b81 ffff88103e783d00 ffffffff8134f6ff
[ 1036.637648]  ffff88102dca0000 000000000000000d ffff88103e783d28 ffffffff811986f4
[ 1036.646741]  ffff88102dca0000 ffff88203cf97f80 000000000000000d ffff88103e783d68
[ 1036.655833] Call Trace:
[ 1036.658852]  <IRQ>  [<ffffffff8134f6ff>] dump_stack+0x63/0x84
[ 1036.665649]  [<ffffffff811986f4>] task_isolation_debug_task+0xb4/0xd0
[ 1036.673136]  [<ffffffff810b4a13>] _task_isolation_debug+0x83/0xc0
[ 1036.680237]  [<ffffffff81179c0c>] irq_work_queue_on+0x9c/0x120
[ 1036.687091]  [<ffffffff811075e4>] tick_nohz_full_kick_cpu+0x44/0x50
[ 1036.694388]  [<ffffffff810b48d9>] wake_up_nohz_cpu+0x99/0x110
[ 1036.701089]  [<ffffffff810f57e1>] internal_add_timer+0x71/0xb0
[ 1036.707896]  [<ffffffff810f696b>] add_timer_on+0xbb/0x140
[ 1036.714210]  [<ffffffff81100ca0>] clocksource_watchdog+0x230/0x300
[ 1036.721411]  [<ffffffff81100a70>] ? __clocksource_unstable.isra.2+0x40/0x40
[ 1036.729478]  [<ffffffff810f5615>] call_timer_fn+0x35/0x120
[ 1036.735899]  [<ffffffff81100a70>] ? __clocksource_unstable.isra.2+0x40/0x40
[ 1036.743970]  [<ffffffff810f64cc>] run_timer_softirq+0x23c/0x2f0
[ 1036.750878]  [<ffffffff816d4397>] __do_softirq+0xd7/0x2c5
[ 1036.757199]  [<ffffffff81091245>] irq_exit+0xf5/0x100
[ 1036.763132]  [<ffffffff816d41d2>] smp_apic_timer_interrupt+0x42/0x50
[ 1036.770520]  [<ffffffff816d231c>] apic_timer_interrupt+0x8c/0xa0
[ 1036.777520]  <EOI>  [<ffffffff81569eb0>] ? poll_idle+0x40/0x80
[ 1036.784410]  [<ffffffff815697dc>] cpuidle_enter_state+0x9c/0x260
[ 1036.791413]  [<ffffffff815699d7>] cpuidle_enter+0x17/0x20
[ 1036.797734]  [<ffffffff810cf497>] cpu_startup_entry+0x2b7/0x3a0
[ 1036.804641]  [<ffffffff81050e6c>] start_secondary+0x15c/0x1a0
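
To check whether the source CPU is always the one just before the isolated
CPU, the "violating task isolation" lines can be pulled out of dmesg and
compared. The following is only a minimal sketch: the regular expression is
an assumption based on the two message lines shown above, not on the
patchset's actual format string, and parse_violations and SAMPLE are
hypothetical names used for illustration.

```python
import re

# Assumed message layout, taken from the two lines in the dmesg output above:
#   cpu <src>: <cause> violating task isolation for <comm>/<pid> on cpu <dst>
VIOLATION = re.compile(
    r"cpu (?P<src>\d+): (?P<cause>\S+) violating task isolation "
    r"for (?P<comm>\S+)/(?P<pid>\d+) on cpu (?P<dst>\d+)"
)


def parse_violations(dmesg_text):
    """Return one dict per 'violating task isolation' line found in the text."""
    events = []
    for line in dmesg_text.splitlines():
        m = VIOLATION.search(line)
        if m is None:
            continue  # backtrace frames and unrelated messages are skipped
        events.append({
            "src_cpu": int(m.group("src")),
            "dst_cpu": int(m.group("dst")),
            "cause": m.group("cause"),
            "comm": m.group("comm"),
            "pid": int(m.group("pid")),
        })
    return events


# The two report lines copied from the dmesg output above.
SAMPLE = """\
[  956.610817] cpu 12: irq_work violating task isolation for latencytest/7928 on cpu 13
[ 1036.601758] cpu 12: irq_work violating task isolation for latencytest/8447 on cpu 13
"""

if __name__ == "__main__":
    for ev in parse_violations(SAMPLE):
        print(ev["src_cpu"], "->", ev["dst_cpu"], ev["cause"], ev["comm"], ev["pid"])
```

Feeding it the full output of dmesg instead of SAMPLE would show whether the
source/target pairing holds for every report or only for these two.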

