Date:	Thu, 3 Sep 2015 02:03:51 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	Tejun Heo <tj@...nel.org>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	linux-kernel@...r.kernel.org
Subject: Re: Warning in irq_work_queue_on()

On Thu, Sep 03, 2015 at 12:24:27AM +0200, Peter Zijlstra wrote:
> On Wed, Sep 02, 2015 at 11:50:22PM +0200, Frederic Weisbecker wrote:
> > > > [  875.703227]  [<ffffffff810c2d74>] tick_nohz_full_kick_cpu+0x44/0x50
> > 
> > It happens in nohz full, but I'm not sure nohz full is the culprit.
> > 
> > The problem here is that wake_up_nohz_cpu() selects a CPU that is offline.
> 
> wake_up_nohz_cpu() doesn't do any such thing. Where does the selection
> logic live?

Err, I confused it with get_nohz_timer_target(). But yeah, wake_up_nohz_cpu() is
called with a CPU that is chosen by mod_timer() -> get_nohz_timer_target().

> 
> > But this shouldn't happen. Either it selects a CPU that is in the domain tree,
> > and I suspect offline CPUs aren't supposed to be there, or it selects the current
> > CPU. And if the CPU is offlined, it shouldn't be running some kthread...
> 
> Do not assume things like that.. always check with the active mask.

Hmm, so perhaps we need something like this. (It also makes me realize that
is_housekeeping_cpu() is passed the wrong argument. That is no issue in practice,
since nohz full CPUs aren't in the domain tree, but I still need to fix it along
the way.)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0902e4d..2c10a69 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -628,7 +628,7 @@ int get_nohz_timer_target(void)
 
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
-		for_each_cpu(i, sched_domain_span(sd)) {
+		for_each_cpu_and(i, sched_domain_span(sd), cpu_online_mask) {
 			if (!idle_cpu(i) && is_housekeeping_cpu(cpu)) {
 				cpu = i;
 				goto unlock;
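
For illustration only, here is a minimal userspace sketch of the selection
logic the patch aims for: walk the domain spans from innermost to outermost,
consider only CPUs that are also online, and fall back to the current CPU.
This is not the kernel code. struct model, cpu_in() and pick_timer_target()
are hypothetical names, the bitmask fields stand in for cpu_online_mask,
idle_cpu() and is_housekeeping_cpu(), and the sketch assumes the housekeeping
test is meant to apply to the candidate CPU i rather than to cpu. It also
leaves out any early return the real function may do before the loop.

```c
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 8	/* assumption: small fixed CPU count for the model */

/* Hypothetical stand-ins for kernel state; bit n represents CPU n. */
struct model {
	uint32_t online_mask;		/* models cpu_online_mask */
	uint32_t idle_mask;		/* bit set: idle_cpu(n) would be true */
	uint32_t housekeeping_mask;	/* bit set: CPU n is a housekeeping CPU */
};

static bool cpu_in(uint32_t mask, int cpu)
{
	return (mask >> cpu) & 1u;
}

/*
 * spans[] lists the domain spans from innermost to outermost, modelling
 * for_each_domain(). Returns the first busy, online housekeeping CPU
 * found, or this_cpu if there is none.
 */
static int pick_timer_target(const struct model *m, int this_cpu,
			     const uint32_t *spans, int nr_spans)
{
	for (int d = 0; d < nr_spans; d++) {
		/* The "and" step: drop offline CPUs from the span. */
		uint32_t candidates = spans[d] & m->online_mask;

		for (int i = 0; i < NR_CPUS; i++) {
			if (!cpu_in(candidates, i))
				continue;
			if (!cpu_in(m->idle_mask, i) &&
			    cpu_in(m->housekeeping_mask, i))
				return i;
		}
	}
	return this_cpu;
}
```

With CPUs 0, 2 and 3 online, CPU 0 idle and spans {0,1} then {0..3}, the
sketch skips offline CPU 1 even though it looks busy, and picks CPU 2.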
