linux-kernel - [PATCH v7 16/22] sched: Defer wakeup in ttwu() for unschedulable frozen tasks

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20210525151432.16875-17-will@kernel.org>
Date:   Tue, 25 May 2021 16:14:26 +0100
From:   Will Deacon <will@...nel.org>
To:     linux-arm-kernel@...ts.infradead.org
Cc:     linux-arch@...r.kernel.org, linux-kernel@...r.kernel.org,
        Will Deacon <will@...nel.org>,
        Catalin Marinas <catalin.marinas@....com>,
        Marc Zyngier <maz@...nel.org>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Morten Rasmussen <morten.rasmussen@....com>,
        Qais Yousef <qais.yousef@....com>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Quentin Perret <qperret@...gle.com>, Tejun Heo <tj@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Ingo Molnar <mingo@...hat.com>,
        Juri Lelli <juri.lelli@...hat.com>,
        Vincent Guittot <vincent.guittot@...aro.org>,
        "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Dietmar Eggemann <dietmar.eggemann@....com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>,
        kernel-team@...roid.com
Subject: [PATCH v7 16/22] sched: Defer wakeup in ttwu() for unschedulable frozen tasks

Asymmetric systems may not offer the same level of userspace ISA support
across all CPUs, meaning that some applications cannot be executed by
some CPUs. As a concrete example, upcoming arm64 big.LITTLE designs do
not feature support for 32-bit applications on both clusters.

Although we take care to prevent explicit hot-unplug of all 32-bit
capable CPUs on such a system, this is required when suspending on some
SoCs where the firmware mandates that the suspend/resume operation is
handled by CPU 0, which may not be capable of running 32-bit tasks.

Consequently, there is a window on the resume path where no 32-bit
capable CPUs are available for scheduling and waking up a 32-bit task
will result in a scheduler BUG() due to failure of select_fallback_rq():

  | kernel BUG at kernel/sched/core.c:2858!
  | Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
  | ...
  | Call trace:
  |  select_fallback_rq+0x4b0/0x4e4
  |  try_to_wake_up.llvm.4388853297126348405+0x460/0x5b0
  |  default_wake_function+0x1c/0x30
  |  autoremove_wake_function+0x1c/0x60
  |  __wake_up_common.llvm.11763074518265335900+0x100/0x1b8
  |  __wake_up+0x78/0xc4
  |  ep_poll_callback+0x20c/0x3fc

Prevent wakeups of unschedulable frozen tasks in ttwu() and instead
defer the wakeup to __thaw_tasks(), which runs only once all the
secondary CPUs are back online.

Signed-off-by: Will Deacon <will@...nel.org>
---
 kernel/freezer.c    | 10 +++++++++-
 kernel/sched/core.c | 13 +++++++++++++
 2 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/kernel/freezer.c b/kernel/freezer.c
index dc520f01f99d..8f3d950c2a87 100644
--- a/kernel/freezer.c
+++ b/kernel/freezer.c
@@ -11,6 +11,7 @@
 #include <linux/syscalls.h>
 #include <linux/freezer.h>
 #include <linux/kthread.h>
+#include <linux/mmu_context.h>
 
 /* total number of freezing conditions in effect */
 atomic_t system_freezing_cnt = ATOMIC_INIT(0);
@@ -146,9 +147,16 @@ bool freeze_task(struct task_struct *p)
 void __thaw_task(struct task_struct *p)
 {
 	unsigned long flags;
+	const struct cpumask *mask = task_cpu_possible_mask(p);
 
 	spin_lock_irqsave(&freezer_lock, flags);
-	if (frozen(p))
+	/*
+	 * Wake up frozen tasks. On asymmetric systems where tasks cannot
+	 * run on all CPUs, ttwu() may have deferred a wakeup generated
+	 * before thaw_secondary_cpus() had completed so we generate
+	 * additional wakeups here for tasks in the PF_FREEZER_SKIP state.
+	 */
+	if (frozen(p) || (frozen_or_skipped(p) && mask != cpu_possible_mask))
 		wake_up_process(p);
 	spin_unlock_irqrestore(&freezer_lock, flags);
 }
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 42e2aecf087c..6cb9677d635a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3529,6 +3529,19 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	if (!(p->state & state))
 		goto unlock;
 
+#ifdef CONFIG_FREEZER
+	/*
+	 * If we're going to wake up a thread which may be frozen, then
+	 * we can only do so if we have an active CPU which is capable of
+	 * running it. This may not be the case when resuming from suspend,
+	 * as the secondary CPUs may not yet be back online. See __thaw_task()
+	 * for the actual wakeup.
+	 */
+	if (unlikely(frozen_or_skipped(p)) &&
+	    !cpumask_intersects(cpu_active_mask, task_cpu_possible_mask(p)))
+		goto unlock;
+#endif
+
 	trace_sched_waking(p);
 
 	/* We're going to change ->state: */
-- 
2.31.1.818.g46aad6cb9e-goog