Message-ID: <20220610173353.GI1790663@paulmck-ThinkPad-P17-Gen-1>
Date: Fri, 10 Jun 2022 10:33:53 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Frederic Weisbecker <frederic@...nel.org>
Cc: LKML <linux-kernel@...r.kernel.org>, rcu@...r.kernel.org
Subject: Re: [PATCH] rcutorture: Fix ksoftirqd boosting timing and iteration
On Fri, Jun 10, 2022 at 03:03:57PM +0200, Frederic Weisbecker wrote:
> RCU priority boosting can fail in two situations:
>
> 1) If (nr_cpus= > maxcpus=), which means the total number of CPUs
> is higher than the number brought online at boot, then torture_onoff()
> may later bring up CPUs that weren't online at boot. Since rcutorture
> initialization only boosts the ksoftirqds of the CPUs that were
> online at boot, the CPUs later brought online by torture_onoff() won't
> benefit from the boost, making RCU priority boosting fail.
>
> 2) The ksoftirqd kthreads are boosted after the creation of the
> rcu_torture_boost() kthreads, which opens a window large enough for the
> latter to stutter in low-priority FIFO mode while waiting for ksoftirqds
> that are still in SCHED_NORMAL mode.
>
> The issues can be triggered, for example, with:
>
> ./kvm.sh --configs TREE01 --kconfig "CONFIG_RCU_BOOST=y"
>
> [ 34.968561] rcu-torture: !!!
> [ 34.968627] ------------[ cut here ]------------
> [ 35.014054] WARNING: CPU: 4 PID: 114 at kernel/rcu/rcutorture.c:1979 rcu_torture_stats_print+0x5ad/0x610
> [ 35.052043] Modules linked in:
> [ 35.069138] CPU: 4 PID: 114 Comm: rcu_torture_sta Not tainted 5.18.0-rc1 #1
> [ 35.096424] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
> [ 35.154570] RIP: 0010:rcu_torture_stats_print+0x5ad/0x610
> [ 35.198527] Code: 63 1b 02 00 74 02 0f 0b 48 83 3d 35 63 1b 02 00 74 02 0f 0b 48 83 3d 21 63 1b 02 00 74 02 0f 0b 48 83 3d 0d 63 1b 02 00 74 02 <0f> 0b 83 eb 01 0f 8e ba fc ff ff 0f 0b e9 b3 fc ff f82
> [ 37.251049] RSP: 0000:ffffa92a0050bdf8 EFLAGS: 00010202
> [ 37.277320] rcu: De-offloading 8
> [ 37.290367] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000001
> [ 37.290387] RDX: 0000000000000000 RSI: 00000000ffffbfff RDI: 00000000ffffffff
> [ 37.290398] RBP: 000000000000007b R08: 0000000000000000 R09: c0000000ffffbfff
> [ 37.290407] R10: 000000000000002a R11: ffffa92a0050bc18 R12: ffffa92a0050be20
> [ 37.290417] R13: ffffa92a0050be78 R14: 0000000000000000 R15: 000000000001bea0
> [ 37.290427] FS: 0000000000000000(0000) GS:ffff96045eb00000(0000) knlGS:0000000000000000
> [ 37.290448] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 37.290460] CR2: 0000000000000000 CR3: 000000001dc0c000 CR4: 00000000000006e0
> [ 37.290470] Call Trace:
> [ 37.295049] <TASK>
> [ 37.295065] ? preempt_count_add+0x63/0x90
> [ 37.295095] ? _raw_spin_lock_irqsave+0x12/0x40
> [ 37.295125] ? rcu_torture_stats_print+0x610/0x610
> [ 37.295143] rcu_torture_stats+0x29/0x70
> [ 37.295160] kthread+0xe3/0x110
> [ 37.295176] ? kthread_complete_and_exit+0x20/0x20
> [ 37.295193] ret_from_fork+0x22/0x30
> [ 37.295218] </TASK>
>
> Fix this by boosting the ksoftirqd kthreads from the boosting
> hotplug callback itself, before the boosting kthreads are created.
>
> Fixes: ea6d962e80b6 ("rcutorture: Judge RCU priority boosting on grace periods, not callbacks")
> Signed-off-by: Frederic Weisbecker <frederic@...nel.org>

Good catch! Queued for testing and review, thank you!

Thanx, Paul

> ---
> kernel/rcu/rcutorture.c | 28 +++++++++++++---------------
> 1 file changed, 13 insertions(+), 15 deletions(-)
>
> diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> index abb3f6d720f1..21470ebb15eb 100644
> --- a/kernel/rcu/rcutorture.c
> +++ b/kernel/rcu/rcutorture.c
> @@ -2136,6 +2136,19 @@ static int rcutorture_booster_init(unsigned int cpu)
>  	if (boost_tasks[cpu] != NULL)
>  		return 0; /* Already created, nothing more to do. */
>
> +	// Testing RCU priority boosting requires rcutorture do
> +	// some serious abuse. Counter this by running ksoftirqd
> +	// at higher priority.
> +	if (IS_BUILTIN(CONFIG_RCU_TORTURE_TEST)) {
> +		struct sched_param sp;
> +		struct task_struct *t;
> +
> +		t = per_cpu(ksoftirqd, cpu);
> +		WARN_ON_ONCE(!t);
> +		sp.sched_priority = 2;
> +		sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> +	}
> +
>  	/* Don't allow time recalculation while creating a new task. */
>  	mutex_lock(&boost_mutex);
>  	rcu_torture_disable_rt_throttle();
> @@ -3384,21 +3397,6 @@ rcu_torture_init(void)
>  		rcutor_hp = firsterr;
>  		if (torture_init_error(firsterr))
>  			goto unwind;
> -
> -		// Testing RCU priority boosting requires rcutorture do
> -		// some serious abuse. Counter this by running ksoftirqd
> -		// at higher priority.
> -		if (IS_BUILTIN(CONFIG_RCU_TORTURE_TEST)) {
> -			for_each_online_cpu(cpu) {
> -				struct sched_param sp;
> -				struct task_struct *t;
> -
> -				t = per_cpu(ksoftirqd, cpu);
> -				WARN_ON_ONCE(!t);
> -				sp.sched_priority = 2;
> -				sched_setscheduler_nocheck(t, SCHED_FIFO, &sp);
> -			}
> -		}
>  	}
>  	shutdown_jiffies = jiffies + shutdown_secs * HZ;
>  	firsterr = torture_shutdown_init(shutdown_secs, rcu_torture_cleanup);
> --
> 2.25.1
>
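
For readers who want to see why moving the boost into the hotplug
callback covers both failure cases, here is a rough userspace model.
It is only a sketch and not kernel code: every sim_* name, the
NR_CPUS/BOOT_CPUS values, and the struct layout are made up for
illustration, and "boosting" is just a flag flip rather than a real
sched_setscheduler_nocheck() call. The boost_in_callback argument
selects between the old ordering (boost in a loop at init time, after
the boosters exist) and the patched ordering (boost inside the per-CPU
callback, before the booster is created).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_CPUS   8	/* hypothetical nr_cpus= value */
#define BOOT_CPUS 4	/* hypothetical maxcpus= value */

enum sim_policy { SIM_NORMAL, SIM_FIFO };

struct sim_cpu {
	bool online;
	enum sim_policy ksoftirqd_policy;
	int ksoftirqd_prio;
	bool booster_created;
	bool booster_saw_fifo;	/* was ksoftirqd FIFO when the booster started? */
};

static struct sim_cpu sim_cpus[NR_CPUS];

/* Start over with only the first BOOT_CPUS CPUs online. */
static void sim_reset(void)
{
	memset(sim_cpus, 0, sizeof(sim_cpus));
	for (int cpu = 0; cpu < BOOT_CPUS; cpu++)
		sim_cpus[cpu].online = true;
}

/* Stand-in for the SCHED_FIFO priority 2 boost of ksoftirqd. */
static void sim_boost_ksoftirqd(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	sim_cpus[cpu].ksoftirqd_policy = SIM_FIFO;
	sim_cpus[cpu].ksoftirqd_prio = 2;
}

/* Stand-in for the per-CPU hotplug callback that creates the booster. */
static void sim_booster_init(int cpu, bool boost_in_callback)
{
	struct sim_cpu *c = &sim_cpus[cpu];

	if (c->booster_created)
		return;	/* Already created, nothing more to do. */
	if (boost_in_callback)
		sim_boost_ksoftirqd(cpu);	/* patched ordering */
	c->booster_created = true;
	c->booster_saw_fifo = (c->ksoftirqd_policy == SIM_FIFO);
}

/* Stand-in for init: run the callback on every CPU online at boot. */
static void sim_torture_init(bool boost_in_callback)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (sim_cpus[cpu].online)
			sim_booster_init(cpu, boost_in_callback);

	/* Old ordering: boost only the boot-time CPUs, after the fact. */
	if (!boost_in_callback)
		for (int cpu = 0; cpu < NR_CPUS; cpu++)
			if (sim_cpus[cpu].online)
				sim_boost_ksoftirqd(cpu);
}

/* Stand-in for torture_onoff() bringing up a CPU after init. */
static void sim_bring_online(int cpu, bool boost_in_callback)
{
	sim_cpus[cpu].online = true;
	sim_booster_init(cpu, boost_in_callback);
}

/* Case 1: online CPUs whose ksoftirqd never got boosted. */
static int sim_count_unboosted_online(void)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (sim_cpus[cpu].online &&
		    sim_cpus[cpu].ksoftirqd_policy != SIM_FIFO)
			n++;
	return n;
}

/* Case 2: boosters that started while ksoftirqd was still unboosted. */
static int sim_count_early_boosters(void)
{
	int n = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (sim_cpus[cpu].booster_created &&
		    !sim_cpus[cpu].booster_saw_fifo)
			n++;
	return n;
}
```

Calling sim_reset(), sim_torture_init(false) and then
sim_bring_online(5, false) leaves CPU 5 unboosted and every booster
started ahead of its ksoftirqd boost; repeating the sequence with true
leaves both counters at zero.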