Date:	Wed, 23 May 2012 17:23:33 +0200
From:	Mike Galbraith <mgalbraith@...e.de>
To:	paulmck@...ux.vnet.ibm.com
Cc:	Thomas Gleixner <tglx@...utronix.de>,
	LKML <linux-kernel@...r.kernel.org>
Subject: [PATCH v3] clockevents: Per cpu tick skew boot option

On Thu, 2012-05-10 at 11:16 -0700, Paul E. McKenney wrote:

> > --- a/Documentation/kernel-parameters.txt
> > +++ b/Documentation/kernel-parameters.txt
> > @@ -2426,6 +2426,15 @@ bytes respectively. Such letter suffixes
> > 
> >  	sched_debug	[KNL] Enables verbose scheduler debug messages.
> > 
> > +	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
> > +			xtime_lock contention on larger systems, and/or RCU lock
> > +			contention on all systems with CONFIG_MAXSMP set.
> 
> Suggest instead:
> 
> 			contention on systems with large CONFIG_RCU_FANOUT
> 			values.
> 
> > +			Format: { "0" | "1" }
> > +			0 -- disable. (may be 1 via CONFIG_CMDLINE="skew_tick=1"
> 
> Suggest simply:
> 
> 			0 -- disable (default for typical kernel builds).
> 
> With these changes:
> 
> Acked-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>

Hunted down round-tuit.

clockevents: Per cpu tick skew boot option

Quoting removal commit af5ab277ded04bd9bc6b048c5a2f0e7d70ef0867
Historically, Linux has tried to make the regular timer tick on the
various CPUs not happen at the same time, to avoid contention on
xtime_lock.
    
Nowadays, with the tickless kernel, this contention no longer happens
since time keeping and updating are done differently. In addition,
this skew is actually hurting power consumption in a measurable way on
many-core systems.
End quote

Problems:
- Contrary to the above, all systems do encounter contention on
  xtime_lock and RCU structure locks when the tick is synchronized.

- Large systems and moderate sized RT systems suffer intolerable
  jitter with the tick synchronized.

- Fully utilized systems reap no power saving benefit, but do
  suffer from synchronized tick lock contention.

- 0209f649 rcu: limit rcu_node leaf-level fanout
  This patch was born to combat lock contention which testing showed
  to have been _induced by_ skew removal.  Skew the tick, contention
  disappeared virtually completely.  Measured latency on 48 core box
  was >330us.  Revert, and restore skew, it dropped back to ~70us.
  We absorbed a 400% latency increase to combat induced contention.

Let the user decide whether power consumption or jitter is the
more important consideration for their machines.

Signed-off-by: Mike Galbraith <mgalbraith@...e.de>
Acked-by: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>

---
 Documentation/kernel-parameters.txt |    9 +++++++++
 kernel/time/tick-sched.c            |   19 +++++++++++++++++++
 2 files changed, 28 insertions(+)

--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2426,6 +2426,15 @@ bytes respectively. Such letter suffixes
 
 	sched_debug	[KNL] Enables verbose scheduler debug messages.
 
+	skew_tick=	[KNL] Offset the periodic timer tick per cpu to mitigate
+			xtime_lock contention on larger systems, and/or RCU lock
+			contention on systems with large CONFIG_RCU_FANOUT values.
+			Format: { "0" | "1" }
+			0 -- disable (default for typical kernel builds).
+			1 -- enable.
+			Note: increases power consumption, thus should only be
+			enabled if running jitter sensitive (HPC/RT) workloads.
+
 	security=	[SECURITY] Choose a security module to enable at boot.
 			If this boot parameter is not specified, only the first
 			security module asking for security registration will be
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -814,6 +814,8 @@ static enum hrtimer_restart tick_sched_t
 	return HRTIMER_RESTART;
 }
 
+static int sched_skew_tick;
+
 /**
  * tick_setup_sched_timer - setup the tick emulation timer
  */
@@ -831,6 +833,14 @@ void tick_setup_sched_timer(void)
 	/* Get the next period (per cpu) */
 	hrtimer_set_expires(&ts->sched_timer, tick_init_jiffy_update());
 
+	/* Offset the tick to avert xtime_lock contention. */
+	if (sched_skew_tick) {
+		u64 offset = ktime_to_ns(tick_period) >> 1;
+		do_div(offset, num_possible_cpus());
+		offset *= smp_processor_id();
+		hrtimer_add_expires_ns(&ts->sched_timer, offset);
+	}
+
 	for (;;) {
 		hrtimer_forward(&ts->sched_timer, now, tick_period);
 		hrtimer_start_expires(&ts->sched_timer,
@@ -910,3 +920,12 @@ int tick_check_oneshot_change(int allow_
 	tick_nohz_switch_to_nohz();
 	return 0;
 }
+
+static int __init skew_tick(char *str)
+{
+	get_option(&str, &sched_skew_tick);
+
+	return 0;
+}
+early_param("skew_tick", skew_tick);
+


