Date:   Wed, 11 May 2022 10:57:03 +0200
From:   "Uladzislau Rezki (Sony)" <urezki@...il.com>
To:     LKML <linux-kernel@...r.kernel.org>, RCU <rcu@...r.kernel.org>,
        "Paul E . McKenney" <paulmck@...nel.org>
Cc:     Frederic Weisbecker <frederic@...nel.org>,
        Neeraj Upadhyay <neeraj.iitr10@...il.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Uladzislau Rezki <urezki@...il.com>,
        Oleksiy Avramchenko <oleksiy.avramchenko@...y.com>
Subject: [PATCH v2 1/1] rcu/nocb: Add an option to ON/OFF an offloading from RT context

Introduce an RCU_NOCB_CPU_CB_BOOST kernel option so that a user can
decide whether offloaded callbacks are invoked in a high-prio context
or not. Please note that the option depends on the RCU_NOCB_CPU and
RCU_BOOST parameters. For a CONFIG_PREEMPT_RT kernel, both RCU_BOOST
and RCU_NOCB_CPU_CB_BOOST are enabled by default.
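
For illustration only (an assumed .config fragment, not part of this
patch): on a CONFIG_PREEMPT_RT kernel with no-CB CPUs enabled, the
resulting configuration would contain something like the following,
with the last two options defaulting to "y":

<snip>
# Assumed .config fragment (illustration, not from this patch).
# RCU_NOCB_CPU still has to be selected explicitly.
CONFIG_PREEMPT_RT=y
CONFIG_RCU_NOCB_CPU=y
CONFIG_RCU_BOOST=y
CONFIG_RCU_NOCB_CPU_CB_BOOST=y
<snip>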

This patch splits the CONFIG_RCU_BOOST config into two pieces:
a) boosting preempted RCU readers and the kthreads which are
   directly responsible for driving expedited grace periods
   forward;
b) boosting offloading kthreads by changing their scheduling
   class from SCHED_NORMAL to SCHED_FIFO, as sketched below.
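
In code terms, piece b) boils down to a conditional scheduling-class
change when the rcuo kthread is spawned. A minimal sketch, mirroring
the tree_nocb.h hunk below (the helper name is hypothetical):

<snip>
/* Sketch only; mirrors the tree_nocb.h hunk below. */
static void nocb_cb_kthread_maybe_boost(struct task_struct *t)
{
	struct sched_param sp = { .sched_priority = kthread_prio };

	/*
	 * Promote the freshly spawned rcuo kthread to SCHED_FIFO only
	 * if the new Kconfig option is set and a non-zero RT priority
	 * was requested via kthread_prio.
	 */
	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST) && kthread_prio)
		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
}
<snip>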

The main reason for this split is that, for example on Android,
some workloads require expedited grace periods to complete quickly,
whereas offloading in an RT context can lead to starvation and can
hog a CPU for a long time, which is not acceptable in a
latency-sensitive environment. For instance:

<snip>
<...>-60 [006] d..1 2979.028717: rcu_batch_start: rcu_preempt CBs=34619 bl=270
<snip>

invoking 34619 callbacks takes time, so other CFS tasks waiting in
the run-queue are starved by such behaviour.
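
As a rough, assumed figure (not a measurement from this patch): at
~1 us per callback, such a batch keeps the kthread busy for tens of
milliseconds:

<snip>
34619 callbacks x ~1 us/callback ~= 34.6 ms

With RT offloading, that whole window cannot be preempted by
SCHED_OTHER (CFS) tasks on that CPU.
<snip>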

v1 -> v2:
- fix the comment about the rcuc/rcub/rcuop kthreads;
- check kthread_prio against a zero value;
- enable RCU_NOCB_CPU_CB_BOOST by default for PREEMPT_RT.
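
For context (an assumed usage example, not part of this patch):
kthread_prio is exposed as the rcutree.kthread_prio module parameter,
so a non-zero RT priority is typically requested on the kernel
command line:

<snip>
# Assumed example: offload CPUs 0-7 and request RT priority 1 for
# the RCU kthreads.
rcu_nocbs=0-7 rcutree.kthread_prio=1
<snip>

Whether the rcuop kthreads actually got SCHED_FIFO can then be
checked with chrt (illustrative output, pid is a placeholder):

<snip>
$ chrt -p $(pgrep rcuop/0)
pid 123's current scheduling policy: SCHED_FIFO
pid 123's current scheduling priority: 1
<snip>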

Signed-off-by: Uladzislau Rezki (Sony) <urezki@...il.com>
---
 kernel/rcu/Kconfig     | 14 ++++++++++++++
 kernel/rcu/tree.c      |  6 +++++-
 kernel/rcu/tree_nocb.h |  3 ++-
 3 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index 27aab870ae4c..a4ed7b5e2b75 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -275,6 +275,20 @@ config RCU_NOCB_CPU_DEFAULT_ALL
 	  Say Y here if you want offload all CPUs by default on boot.
 	  Say N here if you are unsure.
 
+config RCU_NOCB_CPU_CB_BOOST
+	bool "Offload RCU callback from real-time kthread"
+	depends on RCU_NOCB_CPU && RCU_BOOST
+	default y if PREEMPT_RT
+	help
+	  Use this option to invoke offloaded callbacks from a SCHED_FIFO
+	  context to make their processing faster. A side effect of this
+	  approach is added latency, especially for SCHED_OTHER tasks that
+	  cannot preempt an offloading kthread. That latency depends on
+	  the number of callbacks to be invoked.
+
+	  Say Y here if you want to set RT priority for offloading kthreads.
+	  Say N here if you are unsure.
+
 config TASKS_TRACE_RCU_READ_MB
 	bool "Tasks Trace RCU readers use memory barriers in user and idle"
 	depends on RCU_EXPERT && TASKS_TRACE_RCU
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9dc4c4e82db6..1c3852b1e0c8 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -154,7 +154,11 @@ static void sync_sched_exp_online_cleanup(int cpu);
 static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
 static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
 
-/* rcuc/rcub/rcuop kthread realtime priority */
+/*
+ * rcuc/rcub/rcuop kthread realtime priority. Enabling or disabling
+ * the real-time priority of the "rcuop" kthreads is controlled by
+ * the extra CONFIG_RCU_NOCB_CPU_CB_BOOST configuration.
+ */
 static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
 module_param(kthread_prio, int, 0444);
 
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 60cc92cc6655..fa8e4f82e60c 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1315,8 +1315,9 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
 		goto end;
 
-	if (kthread_prio)
+	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST) && kthread_prio)
 		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+
 	WRITE_ONCE(rdp->nocb_cb_kthread, t);
 	WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
 	return;
-- 
2.30.2
