Message-ID: <04b48040-7fe3-4732-98ae-2ea830832bb7@paulmck-laptop>
Date: Sun, 20 Apr 2025 11:04:11 -0700
From: "Paul E. McKenney" <paulmck@...nel.org>
To: Joel Fernandes <joelagnelf@...dia.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Frederic Weisbecker <frederic@...nel.org>,
Neeraj Upadhyay <neeraj.upadhyay@...nel.org>,
Joel Fernandes <joel@...lfernandes.org>,
Josh Triplett <josh@...htriplett.org>,
Boqun Feng <boqun.feng@...il.com>,
Uladzislau Rezki <urezki@...il.com>,
Steven Rostedt <rostedt@...dmis.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Lai Jiangshan <jiangshanlai@...il.com>,
Zqiang <qiang.zhang1211@...il.com>,
Davidlohr Bueso <dave@...olabs.net>,
"rcu@...r.kernel.org" <rcu@...r.kernel.org>
Subject: Re: [v3,1/2] rcutorture: Perform more frequent testing of ->gpwrap
On Sun, Apr 20, 2025 at 02:40:20AM -0000, Joel Fernandes wrote:
> Hello, Paul,
>
> On April 20, 2025, 12:21 a.m. UTC Paul E. McKenney wrote:
> > On Wed, Apr 16, 2025 at 11:19:22AM +0000, Joel Fernandes wrote:
> > >
> > >
> > > > On Apr 15, 2025, at 8:19 PM, Paul E. McKenney <paulmck@...nel.org> wrote:
> > > >
> > > > On Mon, Apr 14, 2025 at 11:05:45AM -0400, Joel Fernandes wrote:
> > > >> On 4/10/2025 2:29 PM, Paul E. McKenney wrote:
> > > >>>> +static int rcu_gpwrap_lag_init(void)
> > > >>>> +{
> > > >>>> + if (gpwrap_lag_cycle_mins <= 0 || gpwrap_lag_active_mins <= 0) {
> > > >>>> + pr_alert("rcu-torture: lag timing parameters must be positive\n");
> > > >>>> + return -EINVAL;
> > > >>> When rcutorture is initiated by modprobe, this makes perfect sense.
> > > >>>
> > > >>> But if rcutorture is built in, we have other choices: (1) Disable gpwrap
> > > >>> testing and do other testing but splat so that the bogus scripting can
> > > >>> be fixed, (2) Force default values and splat as before, (3) Splat and
> > > >>> halt the system.
> > > >>>
> > > >>> The usual approach has been #1, but what makes sense in this case?
> > > >>
> > > >> If the user deliberately tries to prevent the test, I am Ok with #3 which I
> > > >> believe is the current behavior. But otherwise #1 is also Ok with me but I don't
> > > >> feel strongly about doing that.
> > > >>
> > > >> If we want to do #1, it will just involve changing the "return -EINVAL" to
> > > >> "return 0" but also may need to be doing so only if RCU torture is a built-in.
> > > >>
> > > >> IMO the current behavior is more reasonable than adding more complexity for an
> > > >> unusual case for a built-in?
> > > >
> > > > The danger is that someone adjusts a scenario, accidentally disables
> > > > *all* ->gpwrap testing during built-in tests (kvm.sh, kvm-remote.sh,
> > > > and torture.sh), and nobody notices. This has tripped me up in the
> > > > past, hence the existing splats in rcutorture, but only for runs with
> > > > built-in rcutorture.
> > >
> > > But in the code we are discussing, we will splat with an error if either
> > > parameter is set to 0? Sorry if I missed something.
> >
> > The idea would be to instead splat if the user requested a given type of
> > testing, but that request conflicted with some other setting so that the
> > user's request had to be refused. If the user did not request a given
> > type of testing (so that the corresponding parameter was zero), no splats.
> >
> > Also, no splats of this type for modprobe; rather, modprobe gets an
> > error code in this case.
> >
> > Or am I missing the point of your question?
>
> No you are not missing anything. I just felt I already made the change you are
> talking about because if user misconfigures the timing params, it will print an
> error. But if you feel something is missing, I'd appreciate a prototype patch!
OK, I see that you are relying on the splat after the "unwind" label
in rcu_torture_init(), which is perfectly legitimate. Apologies for
my confusion!
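
For anyone following along, here is a rough userspace sketch of the policy
discussed above: no complaint when the test was not requested, an error
return for modprobe, and a splat plus continued testing for built-in runs.
This is only an illustration under stated assumptions, not the patch itself.
The names gpwrap_check_params() and gpwrap_lag_init_sketch() and the enum are
hypothetical, "both parameters zero" is assumed to mean "not requested", and
fprintf() stands in for a kernel splat.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical outcomes of checking the lag-timing parameters. */
enum gpwrap_check {
	GPWRAP_ENABLED,		/* Valid request: run the ->gpwrap lag test. */
	GPWRAP_DISABLED,	/* Not requested: stay silent. */
	GPWRAP_SPLAT,		/* Built-in, bad request: warn and carry on. */
	GPWRAP_EINVAL,		/* modprobe, bad request: fail the load. */
};

/*
 * Assumption: both parameters being zero means that the user did not
 * ask for this test.  Any other non-positive value is a bad request.
 */
static enum gpwrap_check gpwrap_check_params(int cycle_mins, int active_mins,
					     bool builtin)
{
	if (cycle_mins == 0 && active_mins == 0)
		return GPWRAP_DISABLED;
	if (cycle_mins > 0 && active_mins > 0)
		return GPWRAP_ENABLED;
	return builtin ? GPWRAP_SPLAT : GPWRAP_EINVAL;
}

/* Returns 0 or -EINVAL and reports via *enabled whether the test runs. */
static int gpwrap_lag_init_sketch(int cycle_mins, int active_mins,
				  bool builtin, bool *enabled)
{
	*enabled = false;
	switch (gpwrap_check_params(cycle_mins, active_mins, builtin)) {
	case GPWRAP_ENABLED:
		*enabled = true;
		return 0;
	case GPWRAP_DISABLED:
		return 0;
	case GPWRAP_SPLAT:
		/* Stand-in for a splat: loud, but other testing continues. */
		fprintf(stderr, "WARNING: bad gpwrap lag parameters, test disabled\n");
		return 0;
	default:
		return -EINVAL;
	}
}
```

With this sketch, a built-in run given (-1, 5) warns and returns 0 with the
test disabled, while the same parameters from modprobe return -EINVAL.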
> > > >> On the other hand if the issue is with providing the user with a way to disable
> > > >> gpwrap testing, that should IMO be another parameter than setting the _mins
> > > >> parameters to be 0. But I think we may not want this testing disabled since it
> > > >> is already "self-disabled" for the first 25 minutes.
> > > >
> > > > We do need a way of disabling the testing on long runs for fault-isolation
> > > > purposes.
> > >
> > > Thanks, I will add an option for this.
>
> I still have to fix this, and will add it to the other fix we needed to make
> because of the issue you found (kthread_should_stop() splat).
Very good, and thank you!
Thanx, Paul