Message-ID: <20130709012934.GA26058@linux.vnet.ibm.com>
Date:	Mon, 8 Jul 2013 18:29:34 -0700
From:	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To:	linux-kernel@...r.kernel.org
Cc:	mingo@...e.hu, laijs@...fujitsu.com, dipankar@...ibm.com,
	akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
	josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
	peterz@...radead.org, rostedt@...dmis.org, dhowells@...hat.com,
	edumazet@...gle.com, darren@...art.com, fweisbec@...il.com,
	sbw@....edu
Subject: [PATCH RFC nohz_full 0/7] v3 Provide infrastructure for full-system
 idle

Whenever there is at least one non-idle CPU, it is necessary to
periodically update timekeeping information.  Before NO_HZ_FULL, this
updating was carried out by the scheduling-clock tick, which ran on
every non-idle CPU.  With the advent of NO_HZ_FULL, it is possible
to have non-idle CPUs that are not receiving scheduling-clock ticks.
This possibility is handled by assigning a timekeeping CPU that continues
taking scheduling-clock ticks.

Unfortunately, the timekeeping CPU continues taking scheduling-clock
interrupts even when all other CPUs are completely idle, which is
not so good for energy efficiency and battery lifetime.  Clearly, it
would be good to turn off the timekeeping CPU's scheduling-clock tick
when all CPUs are completely idle.  This is conceptually simple, but
we also need good performance and scalability on large systems, which
rules out implementations based on frequently updated global counts of
non-idle CPUs as well as implementations that frequently scan all CPUs.
Nevertheless, we need a single global indicator in order to keep the
overhead of checking acceptably low.

The chosen approach is to enforce hysteresis on the non-idle to
full-system-idle transition, with the amount of hysteresis increasing
linearly with the number of CPUs, thus keeping contention acceptably low.
This approach piggybacks on RCU's existing force-quiescent-state scanning
of idle CPUs, which has the advantage of avoiding the scan entirely on
busy systems that have high levels of multiprogramming.  This scan
takes per-CPU idleness information and feeds it into a state machine
that applies the level of hysteresis required to arrive at a single
full-system-idle indicator.
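
The hysteresis described above can be illustrated with a minimal
userspace sketch.  This is not code from the patches: the state names,
the struct sysidle type, and the one-time-unit-per-CPU delay are all
assumptions chosen for illustration, and the sketch ignores the
concurrency that the real state machine has to deal with.

```c
#include <stdbool.h>

/* Hypothetical states; the names are illustrative, not taken from the patches. */
enum sysidle_state {
	SYSIDLE_NOT,	/* Some CPU was non-idle at the last scan. */
	SYSIDLE_SHORT,	/* All CPUs idle, hysteresis delay not yet served. */
	SYSIDLE_FULL,	/* All CPUs idle for at least the full delay. */
};

/* The single global indicator plus the time the idle period began. */
struct sysidle {
	enum sysidle_state state;
	unsigned long idle_since;
};

/* Assumed policy: delay grows linearly, one time unit per CPU. */
static unsigned long sysidle_delay(unsigned int nr_cpus)
{
	return nr_cpus;
}

/* Feed the result of one scan of per-CPU idleness into the state machine. */
static void sysidle_scan_done(struct sysidle *s, bool all_idle,
			      unsigned long now, unsigned int nr_cpus)
{
	if (!all_idle) {
		/* Any non-idle CPU resets the machine immediately. */
		s->state = SYSIDLE_NOT;
		return;
	}
	if (s->state == SYSIDLE_NOT) {
		/* First all-idle scan: start the hysteresis clock. */
		s->state = SYSIDLE_SHORT;
		s->idle_since = now;
		return;
	}
	if (s->state == SYSIDLE_SHORT &&
	    now - s->idle_since >= sysidle_delay(nr_cpus))
		s->state = SYSIDLE_FULL;
}

/* Cheap check of the global indicator. */
static bool sysidle_is_full(const struct sysidle *s)
{
	return s->state == SYSIDLE_FULL;
}
```

In this sketch the transition to non-idle is immediate, while the
transition to full-system idle must wait out a delay that scales with
the CPU count, so larger systems change the global indicator less often.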

The individual patches are as follows:

1.	Add a CONFIG_NO_HZ_FULL_SYSIDLE Kconfig parameter to enable
	this feature.  Kernels built with CONFIG_NO_HZ_FULL_SYSIDLE=n
	act exactly as they do today.

2.	Add new fields to the rcu_dynticks structure that track CPU-idle
	information.  These fields consider CPUs running usermode to be
	non-idle, in contrast with the existing fields in that structure.

3.	Track per-CPU idle states.

4.	Add full-system idle states and state variables.

5.	Expand force_qs_rnp(), dyntick_save_progress_counter(), and
	rcu_implicit_dynticks_qs() APIs to enable passing full-system
	idle state information.

6.	Add full-system-idle state machine.

7.	Force RCU's grace-period kthreads onto the timekeeping CPU.
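
As a rough illustration of how per-CPU idle tracking (patches 2 and 3)
could feed a scan, here is a single-threaded sketch.  The structure and
function names are hypothetical and are not the fields added to
rcu_dynticks; real kernel code would also need atomic operations and
memory barriers, which are omitted here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-CPU record; usermode execution counts as non-idle. */
struct cpu_idle_track {
	bool idle;			/* True only while in the idle loop. */
	unsigned long idle_since;	/* Time of the most recent idle entry. */
};

/* Called when a CPU enters the idle loop. */
static void cpu_idle_enter(struct cpu_idle_track *t, unsigned long now)
{
	t->idle = true;
	t->idle_since = now;
}

/* Called when a CPU leaves idle for kernel or usermode execution. */
static void cpu_idle_exit(struct cpu_idle_track *t)
{
	t->idle = false;
}

/*
 * Scan all CPUs.  Returns true if every CPU is idle, in which case
 * *latest is set to the most recent idle-entry time, that is, the
 * moment at which the whole system became idle.
 */
static bool scan_all_idle(const struct cpu_idle_track *cpus, size_t n,
			  unsigned long *latest)
{
	unsigned long max = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!cpus[i].idle)
			return false;	/* One busy CPU ends the scan early. */
		if (cpus[i].idle_since > max)
			max = cpus[i].idle_since;
	}
	*latest = max;
	return true;
}
```

The time reported through latest is the kind of input a full-system-idle
state machine could compare against its hysteresis delay.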

Changes since v2:

o	Completed removing NMI support (thanks to Frederic for spotting
	the remaining cruft).

o	Fixed a state-machine bug, again spotted by Frederic.  See
	http://lists-archives.com/linux-kernel/27865835-nohz_full-add-full-system-idle-state-machine.html
	for the full details of the bug.

o	Updated commit log and comment as suggested by Josh Triplett.

Changes since v1:

o	Removed NMI support because NMI handlers cannot safely read
	the time anyway (thanks to Thomas Gleixner and Peter Zijlstra).

						Thanx, Paul

------------------------------------------------------------------------

 b/include/linux/rcupdate.h |   18 +
 b/kernel/rcutree.c         |   49 ++++-
 b/kernel/rcutree.h         |   17 +
 b/kernel/rcutree_plugin.h  |  421 ++++++++++++++++++++++++++++++++++++++++++++-
 b/kernel/time/Kconfig      |   23 ++
 5 files changed, 513 insertions(+), 15 deletions(-)

