Message-ID: <20220511211059.GF1790663@paulmck-ThinkPad-P17-Gen-1>
Date:   Wed, 11 May 2022 14:10:59 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     "Uladzislau Rezki (Sony)" <urezki@...il.com>
Cc:     LKML <linux-kernel@...r.kernel.org>, RCU <rcu@...r.kernel.org>,
        Frederic Weisbecker <frederic@...nel.org>,
        Neeraj Upadhyay <neeraj.iitr10@...il.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Oleksiy Avramchenko <oleksiy.avramchenko@...y.com>
Subject: Re: [PATCH v2 1/1] rcu/nocb: Add an option to ON/OFF an offloading
 from RT context

On Wed, May 11, 2022 at 10:57:03AM +0200, Uladzislau Rezki (Sony) wrote:
> Introduce a RCU_NOCB_CPU_CB_BOOST kernel option so that a user can
> decide whether offloading is done in a high-priority context or
> not. Please note that the option depends on the RCU_NOCB_CPU and
> RCU_BOOST parameters. For a CONFIG_PREEMPT_RT kernel, both RCU_BOOST
> and RCU_NOCB_CPU_CB_BOOST are active by default.
> 
> This patch splits the CONFIG_RCU_BOOST config into two pieces:
> a) boosting preempted RCU readers and the kthreads which are
>    directly responsible for driving expedited grace periods
>    forward;
> b) boosting offloading-kthreads in a way that their scheduling
>    class is changed from SCHED_NORMAL to SCHED_FIFO.
> 
> The main reason for this split is that, for example on Android,
> some workloads require fast expedited grace periods, whereas
> offloading in RT context can lead to starvation and hogging of
> a CPU for a long time, which is not acceptable in a
> latency-sensitive environment. For instance:
> 
> <snip>
> <...>-60 [006] d..1 2979.028717: rcu_batch_start: rcu_preempt CBs=34619 bl=270
> <snip>
> 
> Invoking 34,619 callbacks takes time, so other CFS tasks waiting
> in the run-queue are starved by this behaviour.
> 
> v1 -> v2:
> - fix the comment about the rcuc/rcub/rcuop;
> - check the kthread_prio against zero value;
> - by default the RCU_NOCB_CPU_CB_BOOST is ON for PREEMPT_RT.
> 
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@...il.com>

Very good, thank you!  I have queued this for further review and testing,
with the usual wordsmithing (please check!).

By default, this would not go into the upcoming merge window, but into
the one after that.  Please let me know if you need it in the upcoming
merge window.

							Thanx, Paul

------------------------------------------------------------------------

commit f50467bdfec9c27ae574b8c7916b51abe3c46eae
Author: Uladzislau Rezki (Sony) <urezki@...il.com>
Date:   Wed May 11 10:57:03 2022 +0200

    rcu/nocb: Add option to opt rcuo kthreads out of RT priority
    
    This commit introduces a RCU_NOCB_CPU_CB_BOOST Kconfig option that
    prevents rcuo kthreads from running at real-time priority, even in
    kernels built with RCU_BOOST.  This capability is important to devices
    needing low-latency (as in a few milliseconds) response from expedited
    RCU grace periods, but which are not running a classic real-time workload.
    On such devices, permitting the rcuo kthreads to run at real-time priority
    results in unacceptable latencies imposed on the application tasks,
    which run as SCHED_OTHER.
    
    See for example the following trace output:
    
    <snip>
    <...>-60 [006] d..1 2979.028717: rcu_batch_start: rcu_preempt CBs=34619 bl=270
    <snip>
    
    If that rcuop kthread were permitted to run at real-time SCHED_FIFO
    priority, it would monopolize its CPU for hundreds of milliseconds
    while invoking those 34619 RCU callback functions, which would cause an
    unacceptably long latency spike for many application stacks on Android
    platforms.
    
    However, some existing real-time workloads require that callback
    invocation run at SCHED_FIFO priority, for example, those running on
    systems with heavy SCHED_OTHER background loads.  (It is the real-time
    system's administrator's responsibility to make sure that important
    real-time tasks run at a higher priority than do RCU's kthreads.)
    
    Therefore, this new RCU_NOCB_CPU_CB_BOOST Kconfig option defaults to
    "y" on kernels built with PREEMPT_RT and defaults to "n" otherwise.
    The effect is to preserve current behavior for real-time systems, but for
    other systems to allow expedited RCU grace periods to run with real-time
    priority while continuing to invoke RCU callbacks as SCHED_OTHER.
    
    As you would expect, this RCU_NOCB_CPU_CB_BOOST Kconfig option has no
    effect except on CPUs with offloaded RCU callbacks.
    
    Signed-off-by: Uladzislau Rezki (Sony) <urezki@...il.com>
    Signed-off-by: Paul E. McKenney <paulmck@...nel.org>

diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index 27aab870ae4cf..c05ca52cdf64d 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -275,6 +275,22 @@ config RCU_NOCB_CPU_DEFAULT_ALL
 	  Say Y here if you want offload all CPUs by default on boot.
 	  Say N here if you are unsure.
 
+config RCU_NOCB_CPU_CB_BOOST
+	bool "Offload RCU callback from real-time kthread"
+	depends on RCU_NOCB_CPU && RCU_BOOST
+	default y if PREEMPT_RT
+	help
+	  Use this option to invoke offloaded callbacks as SCHED_FIFO
+	  to avoid starvation by heavy SCHED_OTHER background load.
+	  Of course, running as SCHED_FIFO during callback floods will
+	  cause the rcuo[ps] kthreads to monopolize the CPU for hundreds
+	  of milliseconds or more.  Therefore, when enabling this option,
+	  it is your responsibility to ensure that latency-sensitive
+	  tasks either run with higher priority or run on some other CPU.
+
+	  Say Y here if you want to set RT priority for offloading kthreads.
+	  Say N here if you are building a !PREEMPT_RT kernel and are unsure.
+
 config TASKS_TRACE_RCU_READ_MB
 	bool "Tasks Trace RCU readers use memory barriers in user and idle"
 	depends on RCU_EXPERT && TASKS_TRACE_RCU
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index bcc5876c9753b..222d59299a2af 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -154,7 +154,11 @@ static void sync_sched_exp_online_cleanup(int cpu);
 static void check_cb_ovld_locked(struct rcu_data *rdp, struct rcu_node *rnp);
 static bool rcu_rdp_is_offloaded(struct rcu_data *rdp);
 
-/* rcuc/rcub/rcuop kthread realtime priority */
+/*
+ * rcuc/rcub/rcuop kthread realtime priority. The "rcuop"
+ * real-time priority(enabling/disabling) is controlled by
+ * the extra CONFIG_RCU_NOCB_CPU_CB_BOOST configuration.
+ */
 static int kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
 module_param(kthread_prio, int, 0444);
 
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 60cc92cc66552..fa8e4f82e60c0 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1315,8 +1315,9 @@ static void rcu_spawn_cpu_nocb_kthread(int cpu)
 	if (WARN_ONCE(IS_ERR(t), "%s: Could not start rcuo CB kthread, OOM is now expected behavior\n", __func__))
 		goto end;
 
-	if (kthread_prio)
+	if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST) && kthread_prio)
 		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
+
 	WRITE_ONCE(rdp->nocb_cb_kthread, t);
 	WRITE_ONCE(rdp->nocb_gp_kthread, rdp_gp->nocb_gp_kthread);
 	return;
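
The gating introduced by the tree_nocb.h hunk above can be modelled in
user space.  The following is only a minimal sketch, not kernel code:
the enum, the function names, and the boolean parameters standing in for
IS_ENABLED(CONFIG_...) are all hypothetical, and it assumes that
kthread_prio keeps the default shown in tree.c unless overridden.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the patch's decision; not kernel code. */
enum rcuo_policy { POLICY_OTHER, POLICY_FIFO };

/*
 * Mirrors the tree.c initializer:
 *   kthread_prio = IS_ENABLED(CONFIG_RCU_BOOST) ? 1 : 0;
 * "rcu_boost" stands in for IS_ENABLED(CONFIG_RCU_BOOST).
 */
static int default_kthread_prio(bool rcu_boost)
{
	return rcu_boost ? 1 : 0;
}

/*
 * Mirrors the new condition in rcu_spawn_cpu_nocb_kthread():
 *   if (IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST) && kthread_prio)
 * "cb_boost" stands in for IS_ENABLED(CONFIG_RCU_NOCB_CPU_CB_BOOST).
 */
static enum rcuo_policy rcuo_cb_policy(bool cb_boost, int kthread_prio)
{
	return (cb_boost && kthread_prio) ? POLICY_FIFO : POLICY_OTHER;
}

/* Self-check covering the combinations discussed in the commit log. */
static void rcuo_cb_policy_selftest(void)
{
	/* PREEMPT_RT default: RCU_BOOST=y, CB_BOOST=y, so SCHED_FIFO. */
	assert(rcuo_cb_policy(true, default_kthread_prio(true)) == POLICY_FIFO);
	/* RCU_BOOST=y but CB_BOOST=n: callbacks stay SCHED_OTHER. */
	assert(rcuo_cb_policy(false, default_kthread_prio(true)) == POLICY_OTHER);
	/* A zero kthread_prio never selects SCHED_FIFO. */
	assert(rcuo_cb_policy(true, 0) == POLICY_OTHER);
}
```

With cb_boost false, the model returns POLICY_OTHER regardless of
kthread_prio, which corresponds to the !PREEMPT_RT default described in
the commit log.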
