Message-ID: <174511682031.107.5797546463429118570@patchwork.local>
Date: Sun, 20 Apr 2025 02:40:20 -0000
From: Joel Fernandes <joelagnelf@...dia.com>
To: "Paul E. McKenney" <paulmck@...nel.org>, Joel Fernandes <joelagnelf@...dia.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>, Frederic Weisbecker <frederic@...nel.org>, Neeraj Upadhyay <neeraj.upadhyay@...nel.org>, Joel Fernandes <joel@...lfernandes.org>, Josh Triplett <josh@...htriplett.org>, Boqun Feng <boqun.feng@...il.com>, Uladzislau Rezki <urezki@...il.com>, Steven Rostedt <rostedt@...dmis.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Lai Jiangshan <jiangshanlai@...il.com>, Zqiang <qiang.zhang1211@...il.com>, Davidlohr Bueso <dave@...olabs.net>, "rcu@...r.kernel.org" <rcu@...r.kernel.org>
Subject: Re: [v3,1/2] rcutorture: Perform more frequent testing of ->gpwrap

Hello, Paul,

On April 20, 2025, 12:21 a.m. UTC Paul E. McKenney wrote:
> On Wed, Apr 16, 2025 at 11:19:22AM +0000, Joel Fernandes wrote:
> >
> >
> > > On Apr 15, 2025, at 8:19 PM, Paul E. McKenney <paulmck@...nel.org> wrote:
> > >
> > > On Mon, Apr 14, 2025 at 11:05:45AM -0400, Joel Fernandes wrote:
> > >> On 4/10/2025 2:29 PM, Paul E. McKenney wrote:
> > >>>> +static int rcu_gpwrap_lag_init(void)
> > >>>> +{
> > >>>> +	if (gpwrap_lag_cycle_mins <= 0 || gpwrap_lag_active_mins <= 0) {
> > >>>> +		pr_alert("rcu-torture: lag timing parameters must be positive\n");
> > >>>> +		return -EINVAL;
> > >>> When rcutorture is initiated by modprobe, this makes perfect sense.
> > >>>
> > >>> But if rcutorture is built in, we have other choices: (1) Disable gpwrap
> > >>> testing and do other testing but splat so that the bogus scripting can
> > >>> be fixed, (2) Force default values and splat as before, (3) Splat and
> > >>> halt the system.
> > >>>
> > >>> The usual approach has been #1, but what makes sense in this case?
> > >>
> > >> If the user deliberately tries to prevent the test, I am Ok with #3 which I
> > >> believe is the current behavior. But otherwise #1 is also Ok with me but I don't
> > >> feel strongly about doing that.
> > >>
> > >> If we want to do #3, it will just involve changing the "return -EINVAL" to
> > >> "return 0" but also may need to be doing so only if RCU torture is a built-in.
> > >>
> > >> IMO the current behavior is more reasonable than adding more complexity for
> > >> an unusual built-in case?
> > >
> > > The danger is that someone adjusts a scenario, accidentally disables
> > > *all* ->gpwrap testing during built-in tests (kvm.sh, kvm-remote.sh,
> > > and torture.sh), and nobody notices. This has tripped me up in the
> > > past, hence the existing splats in rcutorture, but only for runs with
> > > built-in rcutorture.
> >
> > But in the code we are discussing, we will splat with an error if either
> > parameter is set to 0? Sorry if I missed something.
>
> The idea would be to instead splat if the user requested a given type of
> testing, but that request conflicted with some other setting so that the
> user's request had to be refused. If the user did not request a given
> type of testing (so that the corresponding parameter was zero), no splats.
>
> Also, no splats of this type for modprobe; rather, modprobe gets an error
> code in this case.
>
> Or am I missing the point of your question?

No, you are not missing anything. I just felt I had already made the change you
are talking about, because if the user misconfigures the timing parameters, the
code prints an error. But if you feel something is missing, I'd appreciate a
prototype patch!
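For reference, the policy described in the quoted text above could be modeled in plain userspace C roughly as follows. This is only a sketch, not the rcutorture patch: check_gpwrap_params(), the enum, and the "builtin" flag are hypothetical names, and treating a negative value as the "bad request" case is an assumption made purely for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcomes of validating the two lag-timing parameters. */
enum gpwrap_action {
	GPWRAP_RUN,       /* both values positive: run ->gpwrap lag testing */
	GPWRAP_OFF,       /* a value is zero: testing not requested, stay quiet */
	GPWRAP_OFF_SPLAT, /* built-in with a bad request: skip this test but complain */
	GPWRAP_FAIL,      /* modprobe with a bad request: hand back an error code */
};

/*
 * Userspace model only. "builtin" stands in for however the real code
 * distinguishes a built-in rcutorture from one loaded by modprobe.
 * Assumption: a negative value represents a request that must be refused.
 */
static enum gpwrap_action check_gpwrap_params(int cycle_mins, int active_mins,
					      bool builtin, int *err)
{
	*err = 0;

	if (cycle_mins < 0 || active_mins < 0) {
		if (builtin)
			return GPWRAP_OFF_SPLAT; /* keep doing the other testing */
		*err = -EINVAL;                  /* modprobe sees the failure */
		return GPWRAP_FAIL;
	}

	if (cycle_mins == 0 || active_mins == 0)
		return GPWRAP_OFF;               /* not requested, so no splat */

	return GPWRAP_RUN;
}
```

In this model a built-in run with a refused request still continues with the other tests (option #1 in the quoted list), while a modprobe run gets -EINVAL and no splat.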
> > >> On the other hand, if the issue is with providing the user with a way to
> > >> disable gpwrap testing, that should IMO be a separate parameter rather than
> > >> setting the _mins parameters to 0. But I think we may not want this testing
> > >> disabled since it is already "self-disabled" for the first 25 minutes.
> > >
> > > We do need a way of disabling the testing on long runs for fault-isolation
> > > purposes.
> >
> > Thanks, I will add an option for this.

I still have to fix this, and will add it to the other fix we needed to make
because of the issue you found (kthread_should_stop() splat).

thanks,
- Joel