linux-kernel - [tip: sched/urgent] sched: Fix unreliable rseq cpu

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [day] [month] [year] [list]

Message-ID: <159420157146.4006.2039144692145212660.tip-bot2@tip-bot2>
Date:   Wed, 08 Jul 2020 09:46:11 -0000
From:   "tip-bot2 for Mathieu Desnoyers" <tip-bot2@...utronix.de>
To:     linux-tip-commits@...r.kernel.org
Cc:     Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>,
        stable@...r.kernel.org, #@...-bot2.tec.linutronix.de,
        v4.18+@...-bot2.tec.linutronix.de, x86 <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: [tip: sched/urgent] sched: Fix unreliable rseq cpu_id for new tasks

The following commit has been merged into the sched/urgent branch of tip:

Commit-ID:     ce3614daabea8a2d01c1dd17ae41d1ec5e5ae7db
Gitweb:        https://git.kernel.org/tip/ce3614daabea8a2d01c1dd17ae41d1ec5e5ae7db
Author:        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
AuthorDate:    Mon, 06 Jul 2020 16:49:10 -04:00
Committer:     Peter Zijlstra <peterz@...radead.org>
CommitterDate: Wed, 08 Jul 2020 11:38:50 +02:00

sched: Fix unreliable rseq cpu_id for new tasks

While integrating rseq into glibc and replacing glibc's sched_getcpu
implementation with rseq, glibc's tests discovered an issue with
incorrect __rseq_abi.cpu_id field value right after the first time
a newly created process issues sched_setaffinity.

For the records, it triggers after building glibc and running tests, and
then issuing:

  for x in {1..2000} ; do posix/tst-affinity-static  & done

and shows up as:

error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 2, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0
error: Unexpected CPU 138, expected 0

This is caused by the scheduler invoking __set_task_cpu() directly from
sched_fork() and wake_up_new_task(), thus bypassing rseq_migrate() which
is done by set_task_cpu().

Add the missing rseq_migrate() to both functions. The only other direct
use of __set_task_cpu() is done by init_idle(), which does not involve a
user-space task.

Based on my testing with the glibc test-case, just adding rseq_migrate()
to wake_up_new_task() is sufficient to fix the observed issue. Also add
it to sched_fork() to keep things consistent.

The reason why this never triggered so far with the rseq/basic_test
selftest is unclear.

The current use of sched_getcpu(3) does not typically require it to be
always accurate. However, use of the __rseq_abi.cpu_id field within rseq
critical sections requires it to be accurate. If it is not accurate, it
can cause corruption in the per-cpu data targeted by rseq critical
sections in user-space.

Reported-By: Florian Weimer <fweimer@...hat.com>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
Tested-By: Florian Weimer <fweimer@...hat.com>
Cc: stable@...r.kernel.org # v4.18+
Link: https://lkml.kernel.org/r/20200707201505.2632-1-mathieu.desnoyers@efficios.com
---
 kernel/sched/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 950ac45..e15543c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2965,6 +2965,7 @@ int sched_fork(unsigned long clone_flags, struct task_struct *p)
 	 * Silence PROVE_RCU.
 	 */
 	raw_spin_lock_irqsave(&p->pi_lock, flags);
+	rseq_migrate(p);
 	/*
 	 * We're setting the CPU for the first time, we don't migrate,
 	 * so use __set_task_cpu().
@@ -3029,6 +3030,7 @@ void wake_up_new_task(struct task_struct *p)
 	 * as we're not fully set-up yet.
 	 */
 	p->recent_used_cpu = task_cpu(p);
+	rseq_migrate(p);
 	__set_task_cpu(p, select_task_rq(p, task_cpu(p), SD_BALANCE_FORK, 0));
 #endif
 	rq = __task_rq_lock(p, &rf);