linux-kernel - [RFC PATCH 4/4] sched: Upload nohz full CPU load on task enqueue/dequeue

Message-Id: <1452700891-21807-5-git-send-email-fweisbec@gmail.com>
Date:	Wed, 13 Jan 2016 17:01:31 +0100
From:	Frederic Weisbecker <fweisbec@...il.com>
To:	Peter Zijlstra <peterz@...radead.org>
Cc:	LKML <linux-kernel@...r.kernel.org>,
	Frederic Weisbecker <fweisbec@...il.com>,
	Byungchul Park <byungchul.park@....com>,
	Chris Metcalf <cmetcalf@...hip.com>,
	Thomas Gleixner <tglx@...utronix.de>,
	Luiz Capitulino <lcapitulino@...hat.com>,
	Christoph Lameter <cl@...ux.com>,
	"Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
	Mike Galbraith <efault@....de>, Rik van Riel <riel@...hat.com>
Subject: [RFC PATCH 4/4] sched: Upload nohz full CPU load on task enqueue/dequeue

The full nohz CPU load is currently accounted on tick restart only,
but this model has a few issues:

_ On tick restart, if cpu_load[0] doesn't contain the load of the
  tickless period that just ran, we are going to account a wrong value.
  This is very likely, given that cpu_load[0] has no opportunity to be
  updated between tick stop and tick restart.

_ If the runqueue had updates that didn't trigger a tick restart, we
  are going to miss those CPU load changes.

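A solution is to update the CPU load every time we enqueue or dequeue
a task in the fair runqueue, provided at least one jiffy has elapsed
since the last update. Within the same jiffy, we only record the
current load in cpu_load[0] so that the next update flushes it
correctly.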

Cc: Byungchul Park <byungchul.park@....com>
Cc: Mike Galbraith <efault@....de>
Cc: Chris Metcalf <cmetcalf@...hip.com>
Cc: Christoph Lameter <cl@...ux.com>
Cc: Luiz Capitulino <lcapitulino@...hat.com>
Cc: Paul E . McKenney <paulmck@...ux.vnet.ibm.com>
Cc: Rik van Riel <riel@...hat.com>
Cc: Thomas Gleixner <tglx@...utronix.de>
Signed-off-by: Frederic Weisbecker <fweisbec@...il.com>
---
 kernel/sched/fair.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)
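
For readers who want to poke at the branch logic outside the kernel, here
is a rough userspace model of what update_cpu_load_nohz_full() decides on
each enqueue/dequeue. It is only a sketch: struct model_rq,
model_fold_load() and model_update_on_queue_change() are hypothetical names
invented for this note, and model_fold_load() is a stand-in that merely
stores the load and timestamp, whereas the real __update_cpu_load_nohz()
also decays the cpu_load[] history over the missed ticks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; this is NOT the kernel's struct rq. */
struct model_rq {
	unsigned long last_load_update_tick;	/* jiffy of the last full update */
	unsigned long cpu_load0;		/* stand-in for rq->cpu_load[0] */
	unsigned long folds;			/* number of full updates performed */
};

/*
 * Assumed stand-in for __update_cpu_load_nohz(): it only records the new
 * load and the timestamp. The decay of older cpu_load[] slots is omitted.
 */
static void model_fold_load(struct model_rq *rq, unsigned long now,
			    unsigned long load)
{
	rq->cpu_load0 = load;
	rq->last_load_update_tick = now;
	rq->folds++;
}

/* Mirrors the branch structure of update_cpu_load_nohz_full(). */
static void model_update_on_queue_change(struct model_rq *rq, bool nohz_full,
					 unsigned long now, unsigned long load)
{
	assert(rq != NULL);

	/* CPUs that are not nohz full keep relying on the tick. */
	if (!nohz_full)
		return;

	if (now == rq->last_load_update_tick) {
		/* Same jiffy: just remember the load for the next update. */
		rq->cpu_load0 = load;
	} else {
		/* At least one jiffy elapsed: do a full update. */
		model_fold_load(rq, now, load);
	}
}
```

Calling model_update_on_queue_change() twice within the same jiffy only
overwrites cpu_load0, while a call at a later jiffy performs one full
update and moves last_load_update_tick forward.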

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1e0cb6e..763dc3b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4433,6 +4433,34 @@ void update_cpu_load_active(struct rq *this_rq)
 #endif /* CONFIG_SMP */
 
 /*
+ * In NO_HZ full mode, we need to account the CPU load without relying
+ * on the tick. We do it instead on task enqueue/dequeue time as those
+ * are the main points where CPU load changes.
+ */
+static inline void update_cpu_load_nohz_full(struct rq *rq)
+{
+#ifdef CONFIG_NO_HZ_FULL
+	unsigned long curr_jiffies;
+	unsigned long load;
+
+	if (!tick_nohz_full_cpu(cpu_of(rq)))
+		return;
+
+	curr_jiffies = READ_ONCE(jiffies);
+	load = weighted_cpuload(cpu_of(rq));
+	if (curr_jiffies == rq->last_load_update_tick) {
+		/*
+		 * At least record the current load so that we flush
+		 * it correctly on the next update.
+		 */
+		rq->cpu_load[0] = load;
+	} else {
+		__update_cpu_load_nohz(rq, curr_jiffies, load, 1);
+	}
+#endif
+}
+
+/*
  * The enqueue_task method is called before nr_running is
  * increased. Here we update the fair scheduling stats and
  * then put the task into the rbtree:
@@ -4477,6 +4505,7 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 		add_nr_running(rq, 1);
 
 	hrtick_update(rq);
+	update_cpu_load_nohz_full(rq);
 }
 
 static void set_next_buddy(struct sched_entity *se);
@@ -4537,6 +4566,7 @@ static void dequeue_task_fair(struct rq *rq, struct task_struct *p, int flags)
 		sub_nr_running(rq, 1);
 
 	hrtick_update(rq);
+	update_cpu_load_nohz_full(rq);
 }
 
 #ifdef CONFIG_SMP
-- 
2.6.4