Message-Id: <1533276841-16341-6-git-send-email-srikar@linux.vnet.ibm.com>
Date:   Fri,  3 Aug 2018 11:44:00 +0530
From:   Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
To:     Ingo Molnar <mingo@...nel.org>,
        Peter Zijlstra <peterz@...radead.org>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Mel Gorman <mgorman@...hsingularity.net>,
        Rik van Riel <riel@...riel.com>,
        Srikar Dronamraju <srikar@...ux.vnet.ibm.com>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: [PATCH 5/6] sched/numa: Reset scan rate whenever task moves across nodes

Currently the task scan rate is reset only when the NUMA balancer
migrates the task to a different node. If the NUMA balancer initiates a
swap, the reset applies only to the task that initiated the swap.
Similarly, no scan rate reset is done if the task is migrated across
nodes by the traditional load balancer.

Instead, move the scan rate reset to the migrate_task_rq hook
(migrate_task_rq_fair()). This ensures that a task moved out of its
preferred node either gets back to its preferred node quickly or finds
a new preferred node. Doing so is fair to all tasks migrating across
nodes.

specjbb2005 / bops/JVM / higher bops are better
on 2 Socket/2 Node Intel
JVMS  Prev    Current  %Change
4     210118  208862   -0.597759
1     313171  307007   -1.96825


on 2 Socket/4 Node Power8 (PowerNV)
JVMS  Prev     Current  %Change
8     91027.5  89911.4  -1.22611
1     216460   216176   -0.131202


on 2 Socket/2 Node Power9 (PowerNV)
JVMS  Prev    Current  %Change
4     191918  196078   2.16759
1     207043  214664   3.68088


on 4 Socket/4 Node Power7
JVMS  Prev     Current  %Change
8     58462.1  60719.2  3.86079
1     108334   112615   3.95167


dbench / transactions / higher numbers are better
on 2 Socket/2 Node Intel
count  Min      Max      Avg      Variance  %Change
5      11851.8  11937.3  11890.9  33.5169
5      12511.7  12559.4  12539.5  15.5883   5.45459


on 2 Socket/4 Node Power8 (PowerNV)
count  Min      Max      Avg      Variance  %Change
5      4791     5016.08  4962.55  85.9625
5      4709.28  4979.28  4919.32  105.126   -0.871125


on 2 Socket/2 Node Power9 (PowerNV)
count  Min      Max      Avg     Variance  %Change
5      9353.43  9380.49  9369.6  9.04361
5      9388.38  9406.29  9395.1  5.98959   0.272157


on 4 Socket/4 Node Power7
count  Min      Max      Avg      Variance  %Change
5      149.518  215.412  179.083  21.5903
5      157.71   184.929  174.754  10.7275   -2.41731
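
The %Change columns above are consistent with the relative difference
between the patched and baseline figures, (Current - Prev) / Prev * 100,
taken on the Avg column for dbench. A minimal sketch of that arithmetic
follows; pct_change() is a hypothetical helper name used only for
illustration and is not part of the patch or of either benchmark.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the patch): relative change of
 * `current` versus `prev`, in percent. Positive means `current` is higher.
 */
static double pct_change(double prev, double current)
{
	assert(prev != 0.0);	/* a zero baseline has no relative change */
	return (current - prev) / prev * 100.0;
}
```

For example, pct_change(210118, 208862) reproduces the 4-JVM Intel
specjbb2005 row, and pct_change(11890.9, 12539.5) reproduces the Intel
dbench row, to the precision shown in the tables.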

Signed-off-by: Srikar Dronamraju <srikar@...ux.vnet.ibm.com>
---
 kernel/sched/fair.c | 19 +++++++++++++------
 1 file changed, 13 insertions(+), 6 deletions(-)
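
The decision this patch adds to migrate_task_rq_fair() can be sketched as
a small user-space model. This is only an illustration under stated
assumptions: struct model_task, model_cpu_to_node(), model_migrate() and
both MODEL_* constants are hypothetical stand-ins, the fixed
4-CPUs-per-node topology is invented, and the constant start period
replaces the value that task_scan_start() computes in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed toy topology and start period; neither comes from the kernel. */
#define MODEL_CPUS_PER_NODE	4
#define MODEL_SCAN_START_MS	1000u

/* Hypothetical stand-in for the task_struct fields the patch reads. */
struct model_task {
	bool has_mm;			/* models p->mm != NULL */
	bool exiting;			/* models p->flags & PF_EXITING */
	bool has_numa_faults;		/* models p->numa_faults != NULL */
	int cpu;			/* models task_cpu(p) */
	unsigned int scan_period_ms;	/* models p->numa_scan_period */
};

/* Stand-in for cpu_to_node() on the assumed topology. */
static int model_cpu_to_node(int cpu)
{
	return cpu / MODEL_CPUS_PER_NODE;
}

/*
 * Model of the added check: reset the scan period only for a live user
 * task that has NUMA fault statistics and is crossing a node boundary.
 * Returns true if the scan period was reset. Updating t->cpu here is a
 * convenience of the model; the patched function does not do that.
 */
static bool model_migrate(struct model_task *t, int new_cpu)
{
	bool reset = false;

	assert(t != NULL);

	if (t->has_mm && !t->exiting && t->has_numa_faults &&
	    model_cpu_to_node(t->cpu) != model_cpu_to_node(new_cpu)) {
		t->scan_period_ms = MODEL_SCAN_START_MS;
		reset = true;
	}

	t->cpu = new_cpu;
	return reset;
}
```

In this model a move from CPU 1 to CPU 2 stays on node 0 and leaves the
scan period alone, while a move from CPU 2 to CPU 5 crosses to node 1 and
resets it, regardless of which code path requested the migration.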

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index a5936ed..4ea0eff 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -1837,12 +1837,6 @@ static int task_numa_migrate(struct task_struct *p)
 	if (env.best_cpu == -1)
 		return -EAGAIN;
 
-	/*
-	 * Reset the scan period if the task is being rescheduled on an
-	 * alternative node to recheck if the tasks is now properly placed.
-	 */
-	p->numa_scan_period = task_scan_start(p);
-
 	best_rq = cpu_rq(env.best_cpu);
 	if (env.best_task == NULL) {
 		ret = migrate_task_to(p, env.best_cpu);
@@ -6361,6 +6355,19 @@ static void migrate_task_rq_fair(struct task_struct *p, int new_cpu __maybe_unus
 
 	/* We have migrated, no longer consider this task hot */
 	p->se.exec_start = 0;
+
+#ifdef CONFIG_NUMA_BALANCING
+	if (!p->mm || (p->flags & PF_EXITING))
+		return;
+
+	if (p->numa_faults) {
+		int src_nid = cpu_to_node(task_cpu(p));
+		int dst_nid = cpu_to_node(new_cpu);
+
+		if (src_nid != dst_nid)
+			p->numa_scan_period = task_scan_start(p);
+	}
+#endif
 }
 
 static void task_dead_fair(struct task_struct *p)
-- 
1.8.3.1