Message-ID: <20140719180120.GA20887@localhost.localdomain>
Date:	Sat, 19 Jul 2014 20:01:24 +0200
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Cc:	linux-kernel@...r.kernel.org, mingo@...nel.org,
	laijs@...fujitsu.com, dipankar@...ibm.com,
	akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
	josh@...htriplett.org, tglx@...utronix.de, peterz@...radead.org,
	rostedt@...dmis.org, dhowells@...hat.com, edumazet@...gle.com,
	dvhart@...ux.intel.com, oleg@...hat.com, bobby.prani@...il.com
Subject: Re: [PATCH tip/core/rcu] Do not keep timekeeping CPU tick running
 for non-nohz_full= CPUs

On Sat, Jul 19, 2014 at 09:53:50AM -0700, Paul E. McKenney wrote:
> If a non-nohz_full= CPU is non-idle, it will have a scheduling-clock
> interrupt, and therefore doesn't need the timekeeping CPU to keep
> its scheduling-clock interrupt going.  This commit therefore ignores
> the idle state of non-nohz_full CPUs when determining whether or not
> the timekeeping CPU can safely turn off its scheduling-clock interrupt.
> 
> Signed-off-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>

Unfortunately, that's not how things work. A CPU that runs its tick doesn't
necessarily perform the timekeeping duty.

Only the timekeeper can update the timekeeping. There is an exception though:
the timekeeping is also updated by dynticks idle CPUs when they wake up in an
interrupt from idle.

Here is why it doesn't work in practice:

So let's say CPU 0 is the timekeeper, CPU 1 is a non-nohz_full CPU, and all others
are nohz_full. CPU 0 is sleeping. CPU 1 wakes up from idle, so its timekeeping is
up to date, but if it then continues to execute without waking up CPU 0, it risks
reading stale timestamps.
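
To make the scenario concrete, here is a toy user-space model of it. This is
only a sketch and not kernel code: hw_clock, sys_time, tick_running[],
tick_period(), wake_from_idle() and stale_lag_after() are all made-up names,
and the model assumes that a tick updates timekeeping only on the timekeeper
and that a wakeup from idle catches timekeeping up exactly once.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name below is hypothetical, none is a kernel API. */
enum { NR_CPUS = 2, TIMEKEEPER = 0 };

static unsigned long hw_clock;      /* "real" time, in tick periods */
static unsigned long sys_time;      /* time as last published by timekeeping */
static bool tick_running[NR_CPUS];  /* does this CPU currently take ticks? */

/* One tick period elapses; every CPU with a running tick takes its interrupt. */
static void tick_period(void)
{
	hw_clock++;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!tick_running[cpu])
			continue;
		/* Only the timekeeper's tick performs the timekeeping update. */
		if (cpu == TIMEKEEPER)
			sys_time = hw_clock;
	}
}

/* The exception: a dynticks-idle CPU catches up when it wakes from idle. */
static void wake_from_idle(int cpu)
{
	sys_time = hw_clock;
	tick_running[cpu] = true;
}

/*
 * CPU 0 (timekeeper) sleeps with its tick stopped the whole time.
 * CPU 1 wakes from idle, then runs for 'periods' tick periods.
 * Returns how far the published time lags behind the real clock.
 */
static unsigned long stale_lag_after(unsigned int periods)
{
	hw_clock = 0;
	sys_time = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		tick_running[cpu] = false;

	for (int i = 0; i < 3; i++)	/* everybody idle for a while */
		tick_period();

	wake_from_idle(1);		/* CPU 1 is up to date right here */
	assert(hw_clock == sys_time);

	for (unsigned int i = 0; i < periods; i++)
		tick_period();		/* CPU 1 ticks, but nobody updates time */

	return hw_clock - sys_time;
}
```

In this model stale_lag_after(5) returns 5: CPU 1's own tick keeps firing, but
since CPU 0 never wakes up, the published time stays where the wakeup left it.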

This can be changed by allowing all non-nohz_full CPUs to take the timekeeping duty.
That's the initial direction I took, but it involved a lot of complications and
scalability issues.

> 
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index ddad959a9132..eaa32e4c228d 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -2789,8 +2789,13 @@ static void rcu_sysidle_exit(struct rcu_dynticks *rdtp, int irq)
>  	 * system-idle state.  This means that the timekeeping CPU must
>  	 * invoke rcu_sysidle_force_exit() directly if it does anything
>  	 * more than take a scheduling-clock interrupt.
> +	 *
> +	 * In addition if we are not a nohz_full= CPU, then when we are
> +	 * non-idle we have our own tick, so we don't need the timekeeping
> +	 * CPU to keep a tick on our behalf.  We assume that the timekeeping
> +	 * CPU is also a nohz_full= CPU.
>  	 */
> -	if (smp_processor_id() == tick_do_timer_cpu)
> +	if (!tick_nohz_full_cpu(smp_processor_id()))
>  		return;
>  
>  	/* Update system-idle state: We are clearly no longer fully idle! */
> @@ -2810,11 +2815,11 @@ static void rcu_sysidle_check_cpu(struct rcu_data *rdp, bool *isidle,
>  
>  	/*
>  	 * If some other CPU has already reported non-idle, if this is
> -	 * not the flavor of RCU that tracks sysidle state, or if this
> -	 * is an offline or the timekeeping CPU, nothing to do.
> +	 * not the flavor of RCU that tracks sysidle state, or if this is
> +	 * an offline or !nohz_full= or the timekeeping CPU, nothing to do.
>  	 */
>  	if (!*isidle || rdp->rsp != rcu_sysidle_state ||
> -	    cpu_is_offline(rdp->cpu) || rdp->cpu == tick_do_timer_cpu)
> +	    cpu_is_offline(rdp->cpu) || !tick_nohz_full_cpu(rdp->cpu))
>  		return;
>  	if (rcu_gp_in_progress(rdp->rsp))
>  		WARN_ON_ONCE(smp_processor_id() != tick_do_timer_cpu);
> 
