Date:   Fri, 19 Feb 2021 13:00:03 +0000
From:   Valentin Schneider <valentin.schneider@....com>
To:     linux-kernel@...r.kernel.org
Cc:     Qais Yousef <qais.yousef@....com>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Quentin Perret <qperret@...gle.com>,
        Pavan Kondeti <pkondeti@...eaurora.org>,
        Rik van Riel <riel@...riel.com>,
        Lingutla Chandrasekhar <clingutla@...eaurora.org>
Subject: [PATCH v2 7/7] sched/fair: Relax task_hot() for misfit tasks

Consider the following topology:

  DIE [          ]
  MC  [    ][    ]
       0  1  2  3

  capacity_orig_of(x \in {0-1}) < capacity_orig_of(x \in {2-3})

with CPUs 2-3 idle and CPUs 0-1 running CPU hogs (util_avg=1024).

When CPU2 goes through load_balance() (via periodic / NOHZ balance), it
should pull one CPU hog from either CPU0 or CPU1 (this is misfit task
upmigration). However, should e.g. a pcpu kworker wake up on CPU0 just
before this load_balance() happens and preempt the CPU hog running there,
we would have, for the [0-1] group at CPU2's DIE level:

o sgs->sum_nr_running > sgs->group_weight
o sgs->group_capacity * 100 < sgs->group_util * imbalance_pct

IOW, this group is group_overloaded.
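
As a rough illustration, the two conditions above can be modelled in
userspace C. This is only a sketch: struct model_sgs and
model_group_overloaded() are hypothetical stand-ins for the sgs fields
named above, not the kernel's own definitions, and any numbers fed to it
(e.g. an imbalance_pct of 117) are assumptions.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the sgs fields quoted above. */
struct model_sgs {
	unsigned int  sum_nr_running;  /* runnable tasks in the group */
	unsigned int  group_weight;    /* number of CPUs in the group */
	unsigned long group_capacity;  /* summed CPU capacity */
	unsigned long group_util;      /* summed CPU utilization */
};

/*
 * Models only the two conditions listed in the changelog: more runnable
 * tasks than CPUs, and utilization exceeding capacity once scaled by
 * imbalance_pct.
 */
static bool model_group_overloaded(const struct model_sgs *sgs,
				   unsigned int imbalance_pct)
{
	if (sgs->sum_nr_running <= sgs->group_weight)
		return false;

	return sgs->group_capacity * 100 < sgs->group_util * imbalance_pct;
}
```

With two CPU hogs plus the freshly woken kworker, sum_nr_running is 3
against a group_weight of 2, so the first condition holds; a fully
utilized group then satisfies the second one for any imbalance_pct above
100.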

Assuming CPU0 is picked by find_busiest_queue(), we would then visit the
preempted CPU hog in detach_tasks(). However, given it has just been
preempted by this pcpu kworker, task_hot() will prevent it from being
detached. We then leave load_balance() without having done anything.

Long story short, preempted misfit tasks are affected by task_hot(), while
currently running misfit tasks are intentionally preempted by the stopper
task to migrate them over to a higher-capacity CPU.

Align detach_tasks() with the active-balance logic and let it pick a
cache-hot misfit task when the destination CPU can provide a capacity
uplift.
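
For illustration only, the relaxed condition can be sketched as a
standalone predicate. Everything named model_* here is hypothetical, and
the margins (roughly 20% for "fits", roughly 5% for "greater") are
assumptions modelled on the helpers in fair.c rather than taken from this
patch.

```c
#include <stdbool.h>

/* Hypothetical, simplified view of the state the new check reads. */
struct model_env {
	bool sd_asym_cpucapacity;    /* sd_has_asym_cpucapacity(env->sd) */
	bool dst_idle;               /* env->idle != CPU_NOT_IDLE */
	unsigned long src_capacity;  /* capacity of env->src_cpu */
	unsigned long dst_capacity;  /* capacity of env->dst_cpu */
};

/* Assumed ~20% headroom: the task fits if util is below ~80% of capacity. */
static bool model_task_fits(unsigned long util, unsigned long capacity)
{
	return util * 1280 < capacity * 1024;
}

/* Assumed ~5% margin so near-equal capacities do not count as an uplift. */
static bool model_capacity_greater(unsigned long cap1, unsigned long cap2)
{
	return cap1 * 1024 > cap2 * 1078;
}

/* True when cache hotness should be ignored for this misfit task. */
static bool model_ignore_task_hot(const struct model_env *env,
				  unsigned long task_util)
{
	return env->sd_asym_cpucapacity &&
	       env->dst_idle &&
	       !model_task_fits(task_util, env->src_capacity) &&
	       model_capacity_greater(env->dst_capacity, env->src_capacity);
}
```

In the scenario above, a CPU hog with a utilization of 1024 does not fit
on a small source CPU, so an idle, larger destination CPU would be
allowed to pull it even though it was only just preempted.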

Reviewed-by: Qais Yousef <qais.yousef@....com>
Signed-off-by: Valentin Schneider <valentin.schneider@....com>
---
 kernel/sched/fair.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ee172b384e29..554430fd249c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7448,6 +7448,17 @@ static int task_hot(struct task_struct *p, struct lb_env *env)
 	if (env->sd->flags & SD_SHARE_CPUCAPACITY)
 		return 0;
 
+	/*
+	 * On a (sane) asymmetric CPU capacity system, the increase in compute
+	 * capacity should offset any potential performance hit caused by a
+	 * migration.
+	 */
+	if (sd_has_asym_cpucapacity(env->sd) &&
+	    env->idle != CPU_NOT_IDLE &&
+	    !task_fits_capacity(p, capacity_of(env->src_cpu)) &&
+	    cpu_capacity_greater(env->dst_cpu, env->src_cpu))
+		return 0;
+
 	/*
 	 * Buddy candidates are cache hot:
 	 */
-- 
2.27.0
