Message-ID: <20181012184114.w332lnkc34evd4sm@linutronix.de>
Date:   Fri, 12 Oct 2018 20:41:15 +0200
From:   Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:     "Paul E. McKenney" <paulmck@...ux.ibm.com>
Cc:     Tejun Heo <tj@...nel.org>, linux-kernel@...r.kernel.org,
        Boqun Feng <boqun.feng@...il.com>,
        Peter Zijlstra <peterz@...radead.org>,
        "Aneesh Kumar K.V" <aneesh.kumar@...ux.vnet.ibm.com>,
        tglx@...utronix.de, Steven Rostedt <rostedt@...dmis.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Lai Jiangshan <jiangshanlai@...il.com>
Subject: Re: [PATCH] rcu: Use cpus_read_lock() while looking at
 cpu_online_mask

On 2018-09-19 15:11:40 [-0700], Paul E. McKenney wrote:
> On Wed, Sep 19, 2018 at 01:55:21PM -0700, Tejun Heo wrote:
> > Unbound workqueue is NUMA-affine by default, so using it by default
> > might not harm anything.
> 
> OK, so the above workaround would function correctly on -rt, thank you!
> 
> Sebastian, is there a counterpart to CONFIG_PREEMPT_RT already in
> mainline?  If so, I would be happy to make mainline safe for -rt.

Now that I have stumbled upon this thread again, I noticed that I never
replied here. Sorry for that.

Let me summarize:
sync_rcu_exp_select_cpus() used
	queue_work_on(rnp->grplo, rcu_par_gp_wq, &rnp->rew.rew_work);

which was changed in commit fcc6354365015 ("rcu: Make expedited GPs
handle CPU 0 being offline"). The commit claims that this is needed in
case CPU 0 is offline, so it tries to find another online CPU, starting
with the possibly offline one. The search might cross into another NUMA
node, but that is not really a problem: it merely tries to remain on the
"local" NUMA node.

After that commit, the code invokes queue_work_on() within a
preempt_disable() section because it can't use cpus_read_lock() to
ensure that the CPU won't go offline in the middle of the check
(preempt_disable() does the trick, because taking a CPU down has to
wait for preemption-disabled sections to finish).
For RT reasons I would like to get rid of the queue_work_on() within
the preempt_disable() section: on -rt, queue_work_on() can take
sleeping locks, which is not allowed with preemption disabled.
Tejun said that enqueueing an item on an unbound workqueue is
NUMA-affine.

I figured out that enqueueing an item on an offline CPU is not a
problem: the item pops up on a "random" CPU, which means it is carried
out ASAP and does not wait until the CPU comes back online. Given that,
I don't understand why commit fcc6354365015 is needed.
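
To illustrate what I mean, here is a self-contained demo module I put
together for this mail (not part of the patch; the CPU number 3 and the
"demo_*" names are made up, and it assumes CPU 3 is a possible CPU):
with WQ_UNBOUND, the cpu argument of queue_work_on() merely selects the
NUMA node whose worker pool runs the item, and the item runs promptly
even if that CPU is offline.

#include <linux/module.h>
#include <linux/workqueue.h>
#include <linux/smp.h>
#include <linux/topology.h>

static struct workqueue_struct *demo_wq;
static struct work_struct demo_work;

static void demo_work_fn(struct work_struct *work)
{
	/* Any unbound worker of the node's pool may pick this up. */
	pr_info("demo: ran on CPU %d, node %d\n",
		raw_smp_processor_id(), numa_node_id());
}

static int __init demo_init(void)
{
	demo_wq = alloc_workqueue("demo_unbound",
				  WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
	if (!demo_wq)
		return -ENOMEM;

	INIT_WORK(&demo_work, demo_work_fn);
	/* CPU 3 is only a NUMA hint; the item runs even if 3 is offline. */
	queue_work_on(3, demo_wq, &demo_work);
	return 0;
}

static void __exit demo_exit(void)
{
	flush_work(&demo_work);
	destroy_workqueue(demo_wq);
}

module_init(demo_init);
module_exit(demo_exit);
MODULE_LICENSE("GPL");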

May I suggest the following change? It enqueues the work item on the
first CPU of the NUMA node, and the "unbound" nature of the workqueue
ensures that some CPU of that NUMA node performs the work.
This is mostly a revert of fcc6354365015, except that the workqueue
gained the WQ_UNBOUND flag.

------------------>8----

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 0b760c1369f76..94d6c50c4e796 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -4162,7 +4162,7 @@ void __init rcu_init(void)
 	/* Create workqueue for expedited GPs and for Tree SRCU. */
 	rcu_gp_wq = alloc_workqueue("rcu_gp", WQ_MEM_RECLAIM, 0);
 	WARN_ON(!rcu_gp_wq);
-	rcu_par_gp_wq = alloc_workqueue("rcu_par_gp", WQ_MEM_RECLAIM, 0);
+	rcu_par_gp_wq = alloc_workqueue("rcu_par_gp", WQ_MEM_RECLAIM | WQ_UNBOUND, 0);
 	WARN_ON(!rcu_par_gp_wq);
 }
 
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index 0b2c2ad69629c..a0486414edb40 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -472,7 +472,6 @@ static void sync_rcu_exp_select_node_cpus(struct work_struct *wp)
 static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
 				     smp_call_func_t func)
 {
-	int cpu;
 	struct rcu_node *rnp;
 
 	trace_rcu_exp_grace_period(rsp->name, rcu_exp_gp_seq_endval(rsp), TPS("reset"));
@@ -494,13 +493,7 @@ static void sync_rcu_exp_select_cpus(struct rcu_state *rsp,
 			continue;
 		}
 		INIT_WORK(&rnp->rew.rew_work, sync_rcu_exp_select_node_cpus);
-		preempt_disable();
-		cpu = cpumask_next(rnp->grplo - 1, cpu_online_mask);
-		/* If all offline, queue the work on an unbound CPU. */
-		if (unlikely(cpu > rnp->grphi))
-			cpu = WORK_CPU_UNBOUND;
-		queue_work_on(cpu, rcu_par_gp_wq, &rnp->rew.rew_work);
-		preempt_enable();
+		queue_work_on(rnp->grplo, rcu_par_gp_wq, &rnp->rew.rew_work);
 		rnp->exp_need_flush = true;
 	}
 

> 
> 							Thanx, Paul

Sebastian
