Message-Id: <20220204232406.814-9-paulmck@kernel.org>
Date:   Fri,  4 Feb 2022 15:24:05 -0800
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     rcu@...r.kernel.org
Cc:     linux-kernel@...r.kernel.org, kernel-team@...com,
        rostedt@...dmis.org, "Paul E. McKenney" <paulmck@...nel.org>,
        Tim Murray <timmurray@...gle.com>,
        Joel Fernandes <joelaf@...gle.com>,
        Neeraj Upadhyay <quic_neeraju@...cinc.com>,
        Uladzislau Rezki <urezki@...il.com>,
        Todd Kjos <tkjos@...gle.com>,
        Sandeep Patil <sspatil@...gle.com>, stable@...r.kernel.org
Subject: [PATCH rcu 09/10] rcu: Don't deboost before reporting expedited quiescent state

Currently rcu_preempt_deferred_qs_irqrestore() releases rnp->boost_mtx
before reporting the expedited quiescent state.  Under heavy real-time
load, this can result in this function being preempted before the
quiescent state is reported, which can in turn prevent the expedited grace
period from completing.  Tim Murray reports that the resulting expedited
grace periods can take hundreds of milliseconds and even more than one
second, when they should normally complete in less than a millisecond.

This was fine given that there were no particular response-time
constraints for synchronize_rcu_expedited(), as it was designed
for throughput rather than latency.  However, some users now have
sub-100-millisecond response-time constraints.

This patch therefore follows Neeraj's suggestion (seconded by Tim and
by Uladzislau Rezki) of simply reversing the two operations.
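
For illustration only, below is a minimal userspace sketch of the reordered
sequence.  The stub functions and the two flag variables are hypothetical
stand-ins for rcu_report_exp_rnp(), the boost-mutex unlock, and the conditions
checked in rcu_preempt_deferred_qs_irqrestore(); the actual change is the diff
further down.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in: report the expedited quiescent state up the tree. */
static void report_exp_qs_stub(void)
{
	printf("expedited quiescent state reported\n");
}

/* Hypothetical stand-in: rt_mutex_futex_unlock(&rnp->boost_mtx.rtmutex). */
static void unboost_stub(void)
{
	printf("boost mutex released; task may now be preempted\n");
}

int main(void)
{
	bool need_exp_report = true;	/* models !empty_exp && empty_exp_now */
	bool drop_boost_mutex = true;	/* models CONFIG_RCU_BOOST && boosted */

	/*
	 * New ordering: report the expedited quiescent state first, while
	 * priority boosting still guards against preemption, then unboost.
	 * The old ordering (unboost first) allowed preemption between the
	 * two steps, delaying completion of the expedited grace period.
	 */
	if (need_exp_report)
		report_exp_qs_stub();
	if (drop_boost_mutex)
		unboost_stub();

	return 0;
}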

Reported-by: Tim Murray <timmurray@...gle.com>
Reported-by: Joel Fernandes <joelaf@...gle.com>
Reported-by: Neeraj Upadhyay <quic_neeraju@...cinc.com>
Reviewed-by: Neeraj Upadhyay <quic_neeraju@...cinc.com>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@...il.com>
Tested-by: Tim Murray <timmurray@...gle.com>
Cc: Todd Kjos <tkjos@...gle.com>
Cc: Sandeep Patil <sspatil@...gle.com>
Cc: <stable@...r.kernel.org> # 5.4.x
Signed-off-by: Paul E. McKenney <paulmck@...nel.org>
---
 kernel/rcu/tree_plugin.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 109429e70a642..02ac057ba3f83 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -556,16 +556,16 @@ rcu_preempt_deferred_qs_irqrestore(struct task_struct *t, unsigned long flags)
 			raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 		}
 
-		/* Unboost if we were boosted. */
-		if (IS_ENABLED(CONFIG_RCU_BOOST) && drop_boost_mutex)
-			rt_mutex_futex_unlock(&rnp->boost_mtx.rtmutex);
-
 		/*
 		 * If this was the last task on the expedited lists,
 		 * then we need to report up the rcu_node hierarchy.
 		 */
 		if (!empty_exp && empty_exp_now)
 			rcu_report_exp_rnp(rnp, true);
+
+		/* Unboost if we were boosted. */
+		if (IS_ENABLED(CONFIG_RCU_BOOST) && drop_boost_mutex)
+			rt_mutex_futex_unlock(&rnp->boost_mtx.rtmutex);
 	} else {
 		local_irq_restore(flags);
 	}
-- 
2.31.1.189.g2e36527f23
