Message-Id: <20180523155734.GK3803@linux.vnet.ibm.com>
Date:   Wed, 23 May 2018 08:57:34 -0700
From:   "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:     Joel Fernandes <joelaf@...gle.com>
Cc:     linux-kernel@...r.kernel.org,
        "Joel Fernandes (Google)" <joel@...lfernandes.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Boqun Feng <boqun.feng@...il.com>, byungchul.park@....com,
        kernel-team@...roid.com, Josh Triplett <josh@...htriplett.org>,
        Lai Jiangshan <jiangshanlai@...il.com>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Subject: Re: [PATCH 1/4] rcu: Speed up calling of RCU tasks callbacks

On Tue, May 22, 2018 at 11:38:12PM -0700, Joel Fernandes wrote:
> From: "Joel Fernandes (Google)" <joel@...lfernandes.org>
> 
> RCU tasks callbacks can take at least 1 second before they are invoked.
> This happens even if the holdout tasks enter their quiescent states
> quickly. I noticed this when I was testing trampoline callback execution.
> 
> To test the trampoline freeing, I wrote a simple script:
> cd /sys/kernel/debug/tracing/
> echo '__schedule_bug:traceon' > set_ftrace_filter;
> echo '!__schedule_bug:traceon' > set_ftrace_filter;
> 
> In the background I had a simple bash while loop:
> while [ 1 ]; do x=1; done &
> 
> Total completion time of the above commands, in seconds:
> 
> With this patch:
> real    0m0.179s
> user    0m0.000s
> sys     0m0.054s
> 
> Without this patch:
> real    0m1.098s
> user    0m0.000s
> sys     0m0.053s
> 
> That's a greater than 6X speedup. To accomplish this, I wait HZ/10
> jiffies (100 ms) before entering the holdout-checking loop. The loop
> still checks the holdout tasks every 1 second as before, in case this
> first test doesn't succeed.
> 
> Cc: Steven Rostedt <rostedt@...dmis.org>

Given an ack from Steven, I would be happy to take this, give or take
some nits below.

							Thanx, Paul

> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Boqun Feng <boqun.feng@...il.com>
> Cc: Paul McKenney <paulmck@...ux.vnet.ibm.com>
> Cc: byungchul.park@....com
> Cc: kernel-team@...roid.com
> Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
> ---
>  kernel/rcu/update.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
> index 5783bdf86e5a..a28698e44b08 100644
> --- a/kernel/rcu/update.c
> +++ b/kernel/rcu/update.c
> @@ -743,6 +743,12 @@ static int __noreturn rcu_tasks_kthread(void *arg)
>  		 */
>  		synchronize_srcu(&tasks_rcu_exit_srcu);
> 
> +		/*
> +		 * Wait a little bit incase held tasks are released

				in case

> +		 * during their next timer ticks.
> +		 */
> +		schedule_timeout_interruptible(HZ/10);
> +
>  		/*
>  		 * Each pass through the following loop scans the list
>  		 * of holdout tasks, removing any that are no longer
> @@ -755,7 +761,6 @@ static int __noreturn rcu_tasks_kthread(void *arg)
>  			int rtst;
>  			struct task_struct *t1;
> 
> -			schedule_timeout_interruptible(HZ);
>  			rtst = READ_ONCE(rcu_task_stall_timeout);
>  			needreport = rtst > 0 &&
>  				     time_after(jiffies, lastreport + rtst);
> @@ -768,6 +773,11 @@ static int __noreturn rcu_tasks_kthread(void *arg)
>  				check_holdout_task(t, needreport, &firstreport);
>  				cond_resched();
>  			}
> +
> +			if (list_empty(&rcu_tasks_holdouts))
> +				break;
> +
> +			schedule_timeout_interruptible(HZ);

Is there a better way to do this?  Can this be converted into a for-loop?
Alternatively, would it make sense to have a firsttime local variable
initialized to true, to keep the schedule_timeout_interruptible() at
the beginning of the loop, but skip it on the first pass through the loop?

Don't get me wrong, what you have looks functionally correct, but
duplicating the condition might cause problems later on, for example,
should a bug fix be needed in the condition.
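
For example, a minimal sketch of the firsttime-variable alternative,
assuming the surrounding loop is the existing
while (!list_empty(&rcu_tasks_holdouts)); the variable name and comments
here are illustrative, not part of the patch:

	bool firsttime = true;

	while (!list_empty(&rcu_tasks_holdouts)) {
		bool firstreport;
		bool needreport;
		int rtst;
		struct task_struct *t1;

		/*
		 * Sleep at the top of every pass except the first, so
		 * that holdouts released during the initial HZ/10 wait
		 * are noticed without an extra one-second delay.
		 */
		if (!firsttime)
			schedule_timeout_interruptible(HZ);
		firsttime = false;
		rtst = READ_ONCE(rcu_task_stall_timeout);
		/* ... scan the holdout list as before ... */
	}

This keeps list_empty(&rcu_tasks_holdouts) as the loop's single exit
condition, so any later fix to that test lives in one place.
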

>  		}
> 
>  		/*
> -- 
> 2.17.0.441.gb46fe60e1d-goog
> 
