linux-kernel - [PATCH v2] rcu: Speed up calling of RCU tasks callbacks

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Message-Id: <20180521004324.68371-1-joel@joelfernandes.org>
Date:   Sun, 20 May 2018 17:43:24 -0700
From:   Joel Fernandes <joelaf@...gle.com>
To:     linux-kernel@...r.kernel.org
Cc:     "Joel Fernandes (Google)" <joel@...lfernandes.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Peter Zilstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Paul McKenney <paulmck@...ux.vnet.ibm.com>,
        byungchul.park@....com, kernel-team@...roid.com,
        Josh Triplett <josh@...htriplett.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Lai Jiangshan <jiangshanlai@...il.com>
Subject: [PATCH v2] rcu: Speed up calling of RCU tasks callbacks

From: "Joel Fernandes (Google)" <joel@...lfernandes.org>

RCU tasks callbacks can take atleast 1 second before the callbacks are
executed. This happens even if the hold-out tasks enter their quiescent states
quickly. I noticed this when I was testing trampoline callback execution.

To test the trampoline freeing, I wrote a simple script:
cd /sys/kernel/debug/tracing/
echo '__schedule_bug:traceon' > set_ftrace_filter;
echo '!__schedule_bug:traceon' > set_ftrace_filter;

In the background I had simple bash while loop:
while [ 1 ]; do x=1; done &

Total time of completion of above commands in seconds:

With this patch:
real    0m0.179s
user    0m0.000s
sys     0m0.054s

Without this patch:
real    0m1.098s
user    0m0.000s
sys     0m0.053s

That's a great than 6X speed up in performance. In order to accomplish
this, I am waiting for HZ/10 time before entering the hold-out checking
loop. The loop still preserves its checking of held tasks every 1 second
as before, incase this first test doesn't succeed.

Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Peter Zilstra <peterz@...radead.org>
Cc: Ingo Molnar <mingo@...hat.com>
Cc: Boqun Feng <boqun.feng@...il.com>
Cc: Paul McKenney <paulmck@...ux.vnet.ibm.com>
Cc: byungchul.park@....com
Cc: kernel-team@...roid.com
Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
---
Changes since v1->v2:
 - Changed total wait time to HZ/10 instead of 2 jiffies
 - Updated the commands to reproduce issue

 kernel/rcu/update.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 5783bdf86e5a..a28698e44b08 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -743,6 +743,12 @@ static int __noreturn rcu_tasks_kthread(void *arg)
 		 */
 		synchronize_srcu(&tasks_rcu_exit_srcu);
 
+		/*
+		 * Wait a little bit incase held tasks are released
+		 * during their next timer ticks.
+		 */
+		schedule_timeout_interruptible(HZ/10);
+
 		/*
 		 * Each pass through the following loop scans the list
 		 * of holdout tasks, removing any that are no longer
@@ -755,7 +761,6 @@ static int __noreturn rcu_tasks_kthread(void *arg)
 			int rtst;
 			struct task_struct *t1;
 
-			schedule_timeout_interruptible(HZ);
 			rtst = READ_ONCE(rcu_task_stall_timeout);
 			needreport = rtst > 0 &&
 				     time_after(jiffies, lastreport + rtst);
@@ -768,6 +773,11 @@ static int __noreturn rcu_tasks_kthread(void *arg)
 				check_holdout_task(t, needreport, &firstreport);
 				cond_resched();
 			}
+
+			if (list_empty(&rcu_tasks_holdouts))
+				break;
+
+			schedule_timeout_interruptible(HZ);
 		}
 
 		/*
-- 
2.17.0.441.gb46fe60e1d-goog