Message-ID: <5a887ca6-10e9-4026-b792-164deb80d0a8@paulmck-laptop>
Date: Mon, 20 Mar 2023 16:35:34 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: "Zhang, Qiang1" <qiang1.zhang@...el.com>
Cc: "frederic@...nel.org" <frederic@...nel.org>,
"joel@...lfernandes.org" <joel@...lfernandes.org>,
"rcu@...r.kernel.org" <rcu@...r.kernel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2] rcutorture: Convert
schedule_timeout_uninterruptible() to mdelay() in rcu_torture_stall()
On Mon, Mar 20, 2023 at 11:05:17PM +0000, Zhang, Qiang1 wrote:
> > For kernels built with PREEMPT_NONE and CONFIG_DEBUG_ATOMIC_SLEEP enabled,
> > running the RCU stall test as follows triggers a scheduling-while-atomic splat:
> >
> > runqemu kvm slirp nographic qemuparams="-m 1024 -smp 4"
> > bootparams="nokaslr console=ttyS0 rcutorture.stall_cpu=30
> > rcutorture.stall_no_softlockup=1 rcutorture.stall_cpu_irqsoff=1
> > rcutorture.stall_cpu_block=1" -d
> >
> > [ 10.841071] rcu-torture: rcu_torture_stall begin CPU stall
> > [ 10.841073] rcu_torture_stall start on CPU 3.
> > [ 10.841077] BUG: scheduling while atomic: rcu_torture_sta/66/0x0000000
> > ....
> > [ 10.841108] Call Trace:
> > [ 10.841110] <TASK>
> > [ 10.841112] dump_stack_lvl+0x64/0xb0
> > [ 10.841118] dump_stack+0x10/0x20
> > [ 10.841121] __schedule_bug+0x8b/0xb0
> > [ 10.841126] __schedule+0x2172/0x2940
> > [ 10.841157] schedule+0x9b/0x150
> > [ 10.841160] schedule_timeout+0x2e8/0x4f0
> > [ 10.841192] schedule_timeout_uninterruptible+0x47/0x50
> > [ 10.841195] rcu_torture_stall+0x2e8/0x300
> > [ 10.841199] kthread+0x175/0x1a0
> > [ 10.841206] ret_from_fork+0x2c/0x50
> >
> > The above splat occurs because schedule_timeout() is called within the
> > local_irq_disable()/local_irq_enable() critical section. Invoking
> > schedule_timeout() also implies a quiescent state, so it fails to
> > trigger an RCU stall. This commit therefore uses mdelay() instead of
> > schedule_timeout() to trigger the RCU stall.
> >
> > Suggested-by: Joel Fernandes <joel@...lfernandes.org>
> > Signed-off-by: Zqiang <qiang1.zhang@...el.com>
> > ---
> > kernel/rcu/rcutorture.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
> > index d06c2da04c34..a08a72bef5f1 100644
> > --- a/kernel/rcu/rcutorture.c
> > +++ b/kernel/rcu/rcutorture.c
> > @@ -2472,7 +2472,7 @@ static int rcu_torture_stall(void *args)
> >
> >Right here there is:
> >
> > if (stall_cpu_block) {
> >
> >In other words, the rcutorture.stall_cpu_block module parameter says to
> >block, even if it is a bad thing to do. The point of this is to verify
> >the error messages that are supposed to be printed on the console when
> >this happens.
> >
> > #ifdef CONFIG_PREEMPTION
> > preempt_schedule();
> > #else
> > - schedule_timeout_uninterruptible(HZ);
> > + mdelay(jiffies_to_msecs(HZ));
> >
> >So this really needs to stay schedule_timeout_uninterruptible(HZ).
>
> But invoking schedule_timeout_uninterruptible(HZ) implies a quiescent state,
> so it will not cause an RCU stall, even though the task is still in the RCU read-side critical section (PREEMPT_COUNT=y).
>
> No RCU stall occurred when I tested with the following parameters:
> rcutorture.stall_cpu=30
> rcutorture.stall_no_softlockup=1
> rcutorture.stall_cpu_irqsoff=1
> rcutorture.stall_cpu_block=1
Understood. If you want that RCU CPU stall in a CONFIG_PREEMPTION=n
kernel, you should not use rcutorture.stall_cpu_block=1.

In a CONFIG_PREEMPTION=y kernel, rcutorture.stall_cpu_block=1 forces
the grace period to be stalled on a task rather than a CPU, exercising
a different part of the RCU CPU stall warning code.

In a CONFIG_PREEMPTION=n kernel, using rcutorture.stall_cpu_block=1
forces the CPU to go through a quiescent state, as you say. It can
also cause lockdep and scheduling-while-atomic complaints, depending on
exactly what type of RCU reader is in effect.

So these are test-the-diagnostics parameters. Using mdelay() instead
would make rcutorture.stall_cpu_block=1 do the same thing as
rcutorture.stall_cpu_block=0 for CONFIG_PREEMPTION=n kernels, right?
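
To make the cases concrete, here is a minimal user-space sketch of the
behavior described above. It is a model, not kernel code: the enum,
model_stall(), and the patched_with_mdelay flag are hypothetical names
made up for illustration. Only CONFIG_PREEMPTION and
rcutorture.stall_cpu_block come from this thread, and the
stall_cpu_block=0 rows are my reading of "stalled on a task rather than
a CPU".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical labels for the outcomes discussed in this thread. */
enum stall_outcome {
	STALL_ON_CPU,	/* reader spins: grace period stalled on a CPU */
	STALL_ON_TASK,	/* reader blocked: grace period stalled on a task */
	NO_STALL_QS,	/* CPU passes through a quiescent state, no stall */
};

/*
 * Model of which outcome each configuration produces.  The
 * patched_with_mdelay flag (an assumption for illustration) stands for
 * replacing schedule_timeout_uninterruptible(HZ) with mdelay() in the
 * CONFIG_PREEMPTION=n branch.
 */
static enum stall_outcome model_stall(bool config_preemption,
				      bool stall_cpu_block,
				      bool patched_with_mdelay)
{
	if (!stall_cpu_block)
		return STALL_ON_CPU;
	if (config_preemption)
		return STALL_ON_TASK;
	/* CONFIG_PREEMPTION=n with stall_cpu_block=1. */
	return patched_with_mdelay ? STALL_ON_CPU : NO_STALL_QS;
}

static const char *outcome_name(enum stall_outcome o)
{
	switch (o) {
	case STALL_ON_CPU:
		return "stall on CPU";
	case STALL_ON_TASK:
		return "stall on task";
	case NO_STALL_QS:
		return "quiescent state, no stall";
	}
	return "unknown";
}

/* With the patch, stall_cpu_block stops mattering for non-preemptible kernels. */
static void model_self_check(void)
{
	assert(model_stall(false, true, true) == model_stall(false, false, true));
	assert(model_stall(false, true, false) != model_stall(false, false, false));
}

/* Print all four combinations for one variant of the code. */
static void print_stall_table(bool patched_with_mdelay)
{
	for (int p = 0; p <= 1; p++)
		for (int b = 0; b <= 1; b++)
			printf("CONFIG_PREEMPTION=%c stall_cpu_block=%d: %s\n",
			       p ? 'y' : 'n', b,
			       outcome_name(model_stall(p, b, patched_with_mdelay)));
}
```

Calling print_stall_table() with patched_with_mdelay set shows the
CONFIG_PREEMPTION=n, stall_cpu_block=1 case collapsing into the
stall_cpu_block=0 case, which is the concern raised above.
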
Thanx, Paul
> Thanks
> Zqiang
>
> >
> >So should there be a change to kernel-parameters.txt to make it
> >more clear that this is intended behavior?
> >
> > Thanx, Paul
> >
> > #endif
> > } else if (stall_no_softlockup) {
> > touch_softlockup_watchdog();
> > --
> > 2.25.1
> >