lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 11 Dec 2020 22:22:46 -0000
From:   "tip-bot2 for Thomas Gleixner" <tip-bot2@...utronix.de>
To:     linux-tip-commits@...r.kernel.org
Cc:     Thomas Gleixner <tglx@...utronix.de>,
        Frederic Weisbecker <frederic@...nel.org>, x86@...nel.org,
        linux-kernel@...r.kernel.org
Subject: [tip: timers/core] tick/sched: Make jiffies update quick check more robust

The following commit has been merged into the timers/core branch of tip:

Commit-ID:     aa3b66f401b372598b29421bab4d17b631b92407
Gitweb:        https://git.kernel.org/tip/aa3b66f401b372598b29421bab4d17b631b92407
Author:        Thomas Gleixner <tglx@...utronix.de>
AuthorDate:    Fri, 04 Dec 2020 11:55:19 +01:00
Committer:     Thomas Gleixner <tglx@...utronix.de>
CommitterDate: Fri, 11 Dec 2020 23:19:10 +01:00

tick/sched: Make jiffies update quick check more robust

The quick check in tick_do_update_jiffies64() whether jiffies need to be
updated is not really correct under all circumstances and on all
architectures, especially not on 32bit systems.

The quick check does:

    if (now < READ_ONCE(tick_next_period))
    	return;

and the counterpart in the update is:

    WRITE_ONCE(tick_next_period, next_update_time);

This has two problems:

  1) On weakly ordered architectures there is no guarantee that the stores
     before the WRITE_ONCE() are visible which means that other CPUs can
     operate on a stale jiffies value.

  2) On 32bit the store of tick_next_period which is an u64 is split into
     two 32bit stores. If the first 32bit store advances tick_next_period
     far out and the second 32bit store is delayed (virt, NMI ...) then
     jiffies will become stale until the second 32bit store happens.

Address this by seperating the handling for 32bit and 64bit.

On 64bit problem #1 is addressed by replacing READ_ONCE() / WRITE_ONCE()
with smp_load_acquire() / smp_store_release().

On 32bit problem #2 is addressed by protecting the quick check with the
jiffies sequence counter. The load and stores can be plain because the
sequence count mechanics provides the required barriers already.

Signed-off-by: Thomas Gleixner <tglx@...utronix.de>
Reviewed-by: Frederic Weisbecker <frederic@...nel.org>
Link: https://lore.kernel.org/r/87czzpc02w.fsf@nanos.tec.linutronix.de

---
 kernel/time/tick-sched.c | 74 ++++++++++++++++++++++++---------------
 1 file changed, 47 insertions(+), 27 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index cc7cba2..a9e6893 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -57,36 +57,42 @@ static ktime_t last_jiffies_update;
 static void tick_do_update_jiffies64(ktime_t now)
 {
 	unsigned long ticks = 1;
-	ktime_t delta;
+	ktime_t delta, nextp;
 
 	/*
-	 * Do a quick check without holding jiffies_lock. The READ_ONCE()
+	 * 64bit can do a quick check without holding jiffies lock and
+	 * without looking at the sequence count. The smp_load_acquire()
 	 * pairs with the update done later in this function.
 	 *
-	 * This is also an intentional data race which is even safe on
-	 * 32bit in theory. If there is a concurrent update then the check
-	 * might give a random answer. It does not matter because if it
-	 * returns then the concurrent update is already taking care, if it
-	 * falls through then it will pointlessly contend on jiffies_lock.
-	 *
-	 * Though there is one nasty case on 32bit due to store tearing of
-	 * the 64bit value. If the first 32bit store makes the quick check
-	 * return on all other CPUs and the writing CPU context gets
-	 * delayed to complete the second store (scheduled out on virt)
-	 * then jiffies can become stale for up to ~2^32 nanoseconds
-	 * without noticing. After that point all CPUs will wait for
-	 * jiffies lock.
-	 *
-	 * OTOH, this is not any different than the situation with NOHZ=off
-	 * where one CPU is responsible for updating jiffies and
-	 * timekeeping. If that CPU goes out for lunch then all other CPUs
-	 * will operate on stale jiffies until it decides to come back.
+	 * 32bit cannot do that because the store of tick_next_period
+	 * consists of two 32bit stores and the first store could move it
+	 * to a random point in the future.
 	 */
-	if (ktime_before(now, READ_ONCE(tick_next_period)))
-		return;
+	if (IS_ENABLED(CONFIG_64BIT)) {
+		if (ktime_before(now, smp_load_acquire(&tick_next_period)))
+			return;
+	} else {
+		unsigned int seq;
 
-	/* Reevaluate with jiffies_lock held */
+		/*
+		 * Avoid contention on jiffies_lock and protect the quick
+		 * check with the sequence count.
+		 */
+		do {
+			seq = read_seqcount_begin(&jiffies_seq);
+			nextp = tick_next_period;
+		} while (read_seqcount_retry(&jiffies_seq, seq));
+
+		if (ktime_before(now, nextp))
+			return;
+	}
+
+	/* Quick check failed, i.e. update is required. */
 	raw_spin_lock(&jiffies_lock);
+	/*
+	 * Reevaluate with the lock held. Another CPU might have done the
+	 * update already.
+	 */
 	if (ktime_before(now, tick_next_period)) {
 		raw_spin_unlock(&jiffies_lock);
 		return;
@@ -112,11 +118,25 @@ static void tick_do_update_jiffies64(ktime_t now)
 	jiffies_64 += ticks;
 
 	/*
-	 * Keep the tick_next_period variable up to date.  WRITE_ONCE()
-	 * pairs with the READ_ONCE() in the lockless quick check above.
+	 * Keep the tick_next_period variable up to date.
 	 */
-	WRITE_ONCE(tick_next_period,
-		   ktime_add_ns(last_jiffies_update, TICK_NSEC));
+	nextp = ktime_add_ns(last_jiffies_update, TICK_NSEC);
+
+	if (IS_ENABLED(CONFIG_64BIT)) {
+		/*
+		 * Pairs with smp_load_acquire() in the lockless quick
+		 * check above and ensures that the update to jiffies_64 is
+		 * not reordered vs. the store to tick_next_period, neither
+		 * by the compiler nor by the CPU.
+		 */
+		smp_store_release(&tick_next_period, nextp);
+	} else {
+		/*
+		 * A plain store is good enough on 32bit as the quick check
+		 * above is protected by the sequence count.
+		 */
+		tick_next_period = nextp;
+	}
 
 	/*
 	 * Release the sequence count. calc_global_load() below is not

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ