Message-ID: <CAO3-Pbp6fCayWeJ11U6JtqHn-Rs3OFXoZ9uMohUefSYUvSGUKA@mail.gmail.com>
Date: Mon, 18 Mar 2024 21:32:47 -0500
From: Yan Zhai <yan@...udflare.com>
To: Mark Rutland <mark.rutland@....com>
Cc: "Paul E. McKenney" <paulmck@...nel.org>, netdev@...r.kernel.org, 
	"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>, 
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>, Jiri Pirko <jiri@...nulli.us>, 
	Simon Horman <horms@...nel.org>, Daniel Borkmann <daniel@...earbox.net>, 
	Lorenzo Bianconi <lorenzo@...nel.org>, Coco Li <lixiaoyan@...gle.com>, Wei Wang <weiwan@...gle.com>, 
	Alexander Duyck <alexanderduyck@...com>, Hannes Frederic Sowa <hannes@...essinduktion.org>, 
	linux-kernel@...r.kernel.org, rcu@...r.kernel.org, bpf@...r.kernel.org, 
	kernel-team@...udflare.com, Joel Fernandes <joel@...lfernandes.org>, 
	Toke Hoiland-Jorgensen <toke@...hat.com>, Alexei Starovoitov <alexei.starovoitov@...il.com>, 
	Steven Rostedt <rostedt@...dmis.org>, Jesper Dangaard Brouer <hawk@...nel.org>
Subject: Re: [PATCH v4 net 1/3] rcu: add a helper to report consolidated
 flavor QS

On Mon, Mar 18, 2024 at 5:59 AM Mark Rutland <mark.rutland@....com> wrote:
>
> On Fri, Mar 15, 2024 at 10:40:56PM -0700, Paul E. McKenney wrote:
> > On Fri, Mar 15, 2024 at 12:55:03PM -0700, Yan Zhai wrote:
> > > > There are several scenarios in network processing that can run
> > > > extensively under heavy traffic. In such situations, RCU synchronization
> > > > might not observe the desired quiescent states for an indefinitely long
> > > > period. Create a helper to safely raise the desired RCU quiescent states
> > > > for such scenarios.
> > >
> > > Currently the frequency is locked at HZ/10, i.e. 100ms, which is
> > > sufficient to address existing problems around RCU tasks. It's unclear
> > > yet if there is any future scenario for it to be further tuned down.
> >
> > I suggest something like the following for the commit log:
> >
> > ------------------------------------------------------------------------
> >
> > When under heavy load, network processing can run CPU-bound for many tens
> > of seconds.  Even in preemptible kernels, this can block RCU Tasks grace
> > periods, which can cause trace-event removal to take more than a minute,
> > which is unacceptably long.
> >
> > This commit therefore creates a new helper function that passes
> > through both RCU and RCU-Tasks quiescent states every 100 milliseconds.
> > This hard-coded value suffices for current workloads.
>
> FWIW, this sounds good to me.
>
> >
> > ------------------------------------------------------------------------
> >
> > > Suggested-by: Paul E. McKenney <paulmck@...nel.org>
> > > Reviewed-by: Jesper Dangaard Brouer <hawk@...nel.org>
> > > Signed-off-by: Yan Zhai <yan@...udflare.com>
> > > ---
> > > v3->v4: comment fixup
> > >
> > > ---
> > >  include/linux/rcupdate.h | 24 ++++++++++++++++++++++++
> > >  1 file changed, 24 insertions(+)
> > >
> > > diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> > > index 0746b1b0b663..da224706323e 100644
> > > --- a/include/linux/rcupdate.h
> > > +++ b/include/linux/rcupdate.h
> > > @@ -247,6 +247,30 @@ do { \
> > >     cond_resched(); \
> > >  } while (0)
> > >
> > > +/**
> > > + * rcu_softirq_qs_periodic - Periodically report consolidated quiescent states
> > > + * @old_ts: last jiffies when QS was reported. Might be modified in the macro.
> > > + *
> > > + * This helper is for network processing in non-RT kernels, where there could
> > > + * be busy polling threads that block RCU synchronization indefinitely.  In
> > > + * such context, simply calling cond_resched is insufficient, so give it a
> > > + * stronger push to eliminate all potential blockage of all RCU types.
> > > + *
> > > + * NOTE: unless absolutely sure, this helper should in general be called
> > > + * outside of bh lock section to avoid reporting a surprising QS to updaters,
> > > + * who could be expecting RCU read critical section to end at local_bh_enable().
> > > + */
> >
> > How about something like this for the kernel-doc comment?
> >
> > /**
> >  * rcu_softirq_qs_periodic - Report RCU and RCU-Tasks quiescent states
> >  * @old_ts: jiffies at start of processing.
> >  *
> >  * This helper is for long-running softirq handlers, such as those
> >  * in networking.  The caller should initialize the variable passed in
> >  * as @old_ts at the beginning of the softirq handler.  When invoked
> >  * frequently, this macro will invoke rcu_softirq_qs() every 100
> >  * milliseconds thereafter, which will provide both RCU and RCU-Tasks
> >  * quiescent states.  Note that this macro modifies its old_ts argument.
> >  *
> >  * Note that although cond_resched() provides RCU quiescent states,
> >  * it does not provide RCU-Tasks quiescent states.
> >  *
> >  * Because regions of code that have disabled softirq act as RCU
> >  * read-side critical sections, this macro should be invoked with softirq
> >  * (and preemption) enabled.
> >  *
> >  * This macro has no effect in CONFIG_PREEMPT_RT kernels.
> >  */
>
> Considering the note about cond_resched(), does cond_resched() actually
> provide an RCU quiescent state for fully-preemptible kernels? IIUC for those
> cond_resched() expands to:
>
>         __might_resched();
>         klp_sched_try_switch()
>
> ... and AFAICT neither reports an RCU quiescent state.
>
> So maybe it's worth dropping the note?
>
> Separately, what's the rationale for not doing this on PREEMPT_RT? Does that
> avoid the problem through other means, or are people just not running affected
> workloads on that?
>
It's a bit counterintuitive, but yes, the RT kernel avoids the problem.
This is because schedule() actually reports an RCU-Tasks QS, and on an
RT kernel cond_resched() won't call __cond_resched() or
__schedule(PREEMPT), as you already pointed out, either of which would
clear the need-resched flag. Because the flag is not cleared there,
schedule() can then be called from time to time on hard IRQ exit.

Yan

> Mark.
>
> >
> >                                                       Thanx, Paul
> >
> > > +#define rcu_softirq_qs_periodic(old_ts) \
> > > +do { \
> > > +   if (!IS_ENABLED(CONFIG_PREEMPT_RT) && \
> > > +       time_after(jiffies, (old_ts) + HZ / 10)) { \
> > > +           preempt_disable(); \
> > > +           rcu_softirq_qs(); \
> > > +           preempt_enable(); \
> > > +           (old_ts) = jiffies; \
> > > +   } \
> > > +} while (0)
> > > +
> > >  /*
> > >   * Infrastructure to implement the synchronize_() primitives in
> > >   * TREE_RCU and rcu_barrier_() primitives in TINY_RCU.
> > > --
> > > 2.30.2
> > >
> > >
