[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <174429293068.108.9256249525421383845@patchwork.local>
Date: Thu, 10 Apr 2025 13:48:50 -0000
From: Joel Fernandes <joelagnelf@...dia.com>
To: "Paul E. McKenney" <paulmck@...nel.org>
Cc: Joel Fernandes <joelagnelf@...dia.com>, linux-kernel <linux-kernel@...r.kernel.org>, Frederic Weisbecker <frederic@...nel.org>, Neeraj Upadhyay <neeraj.upadhyay@...nel.org>, Josh Triplett <josh@...htriplett.org>, Boqun Feng <boqun.feng@...il.com>, Uladzislau Rezki <urezki@...il.com>, Steven Rostedt <rostedt@...dmis.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>, Lai Jiangshan <jiangshanlai@...il.com>, Zqiang <qiang.zhang1211@...il.com>, Davidlohr Bueso <dave@...olabs.net>, rcu <rcu@...r.kernel.org>
Subject: Re: rcutorture: Perform more frequent testing of ->gpwrap
Hello, Paul,
On Thu, 10 Apr 2025 13:35:35 GMT, "Paul E. McKenney" wrote:
[...]
> > > > kernel/rcu/rcu.h | 4 +++
> > > > kernel/rcu/rcutorture.c | 64 ++++++++++++++++++++++++++++++++++++++++
> > > > kernel/rcu/tree.c | 34 ++++++++++++++++++++--
> > > > kernel/rcu/tree.h | 1 +
> > > > 4 files changed, 101 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > > > index eed2951a4962..9a15e9701e02 100644
> > > > --- a/kernel/rcu/rcu.h
> > > > +++ b/kernel/rcu/rcu.h
> > > > @@ -572,6 +572,8 @@ void do_trace_rcu_torture_read(const char *rcutort
> > > > unsigned long c_old,
> > > > unsigned long c);
> > > > void rcu_gp_set_torture_wait(int duration);
> > > > +void rcu_set_torture_ovf_lag(unsigned long lag);
> > > > +int rcu_get_gpwrap_count(int cpu);
> > > > #else
> > > > static inline void rcutorture_get_gp_data(int *flags, unsigned long *
> > > > {
> > > > @@ -589,6 +591,8 @@ void do_trace_rcu_torture_read(const char *rcutort
> > > > do { } while (0)
> > > > #endif
> > > > static inline void rcu_gp_set_torture_wait(int duration) { }
> > > > +static inline void rcu_set_torture_ovf_lag(unsigned long lag) { }
> > > > +static inline int rcu_get_gpwrap_count(int cpu) { return 0; }
> > >
> > > Very good, you did remember CONFIG_SMP=n. And yes, I did try it. ;-)
> > >
> > > But shouldn't these be function pointers in rcu_torture_ops? That way if
> > > some other flavor of RCU starts doing wrap protection for its grace-period
> > > sequence numbers, this testing can apply directly to that flavor as well.
> >
> > These are here because 'rdp' is not accessible AFAIK from rcutorture.c.
> > I could add wrappers to these and include them as pointers the a struct as well.
> > But I think these will still stay to access rdp.
>
> Why not just pass in the CPU number and let the pointed-to function find
> the rdp?
Great, I did this and it looks good now, will post v2 shortly.
> > > I could argue with the defaults, but I run long tests often enough that
> > > I am not worried about coverage. As long as we remember to either run
> > > long tests or specify appropriate rcutorture.ovf_cycle_mins when messing
> > > with ->gpwrap code.
> > >
> > > > torture_param(int, nocbs_nthreads, 0, "Number of NOCB toggle threads,
> > > > torture_param(int, nocbs_toggle, 1000, "Time between toggling nocb st
> ate (ms)");
> > > > torture_param(int, preempt_duration, 0, "Preemption duration (ms), ze
> > > > @@ -2629,6 +2632,7 @@ rcu_torture_stats_print(void)
> > > > int i;
> > > > long pipesummary[RCU_TORTURE_PIPE_LEN + 1] = { 0 };
> > > > long batchsummary[RCU_TORTURE_PIPE_LEN + 1] = { 0 };
> > > > + long n_gpwraps = 0;
> > > > struct rcu_torture *rtcp;
> > > > static unsigned long rtcv_snap = ULONG_MAX;
> > > > static bool splatted;
> > > > @@ -2639,6 +2643,7 @@ rcu_torture_stats_print(void)
> > > > pipesummary[i] += READ_ONCE(per_cpu(rcu_torture_count, cpu)[i]);
> > > > batchsummary[i] += READ_ONCE(per_cpu(rcu_torture_batch, cpu)[i]);
> > > > }
> > > > + n_gpwraps += rcu_get_gpwrap_count(cpu);
> > > > }
> > > > for (i = RCU_TORTURE_PIPE_LEN; i >= 0; i--) {
> > > > if (pipesummary[i] != 0)
> > > > @@ -2672,6 +2677,7 @@ rcu_torture_stats_print(void)
> > > > pr_cont("read-exits: %ld ", data_race(n_read_exits)); // Statistic.
> > > > pr_cont("nocb-toggles: %ld:%ld\n",
> > >
> > > The "\n" on the above line needs to be deleted.
> >
> > Ok.
Done.
> >
> > > > atomic_long_read(&n_nocb_offload), atomic_long_read(&n_nocb_deofflo
> > > > + pr_cont("gpwraps: %ld\n", n_gpwraps);
> > > >
> > > > pr_alert("%s%s ", torture_type, TORTURE_FLAG);
> > > > if (atomic_read(&n_rcu_torture_mberror) ||
> > > > @@ -3842,6 +3848,58 @@ static int rcu_torture_preempt(void *unused)
> > > >
> > > > static enum cpuhp_state rcutor_hp;
> > > >
> > > > +static struct hrtimer ovf_lag_timer;
> > > > +static bool ovf_lag_active;
> > >
> > > Same "ovf" naming complaint as before.
> >
> > Ok.
Done.
> >
> > > > +}
> > > > +
> > > > +static int rcu_torture_ovf_lag_init(void)
> > > > +{
> > > > + if (ovf_cycle_mins <= 0 || ovf_active_mins <= 0) {
> > > > + pr_alert("rcu-torture: lag timing parameters must be positive\n");
> > > > + return -EINVAL;
> > > > + }
> > >
> > > Why not refuse to start this portion of the test when CONFIG_SMP =n
> > > or something other than vanilla RCU? No need to fail the test, just
> > > print something saying that this testing won't be happening.
> >
> > Got it, will do.
>
> Again, thank you!
I changed this to, something like:
if (cur_ops->set_gpwrap_lag && rcu_gpwrap_lag_init())
goto unwind;
So it will only test RCU flavor.
However, to your point - for TINY, we would still start the timer which will be
a NOOP. But considering it is 5 minutes every 30 minutes, maybe that's Ok? ;-)
We could also have a ops pointer ->can_test_gpwrap_lag() which returns FALSE
for Tiny, but then it is adding more ops pointers.
thanks,
- Joel
Powered by blists - more mailing lists