lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 5 Apr 2024 15:49:46 +0200
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Yan Zhai <yan@...udflare.com>
Cc: paulmck@...nel.org, netdev@...r.kernel.org,
	"David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>,
	Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
	Jiri Pirko <jiri@...nulli.us>, Simon Horman <horms@...nel.org>,
	Daniel Borkmann <daniel@...earbox.net>,
	Lorenzo Bianconi <lorenzo@...nel.org>,
	Coco Li <lixiaoyan@...gle.com>, Wei Wang <weiwan@...gle.com>,
	Alexander Duyck <alexanderduyck@...com>,
	linux-kernel@...r.kernel.org, rcu@...r.kernel.org,
	bpf@...r.kernel.org, kernel-team@...udflare.com,
	Joel Fernandes <joel@...lfernandes.org>,
	Toke Høiland-Jørgensen <toke@...hat.com>,
	Alexei Starovoitov <alexei.starovoitov@...il.com>,
	Steven Rostedt <rostedt@...dmis.org>, mark.rutland@....com,
	Jesper Dangaard Brouer <hawk@...nel.org>,
	Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [PATCH v5 net 1/3] rcu: add a helper to report consolidated
 flavor QS

On 2024-03-22 21:02:02 [-0500], Yan Zhai wrote:
> On Fri, Mar 22, 2024 at 4:31 PM Paul E. McKenney <paulmck@...nel.org> wrote:
> >
> > On Fri, Mar 22, 2024 at 12:24:13PM +0100, Sebastian Andrzej Siewior wrote:
> > > On 2024-03-19 13:44:34 [-0700], Yan Zhai wrote:
> > > > + * The macro is not needed when CONFIG_PREEMPT_RT is defined. RT kernels would
> > > > + * have more chance to invoke schedule() calls and provide necessary quiescent
> > > > + * states. As a contrast, calling cond_resched() only won't achieve the same
> > > > + * effect because cond_resched() does not provide RCU-Tasks quiescent states.
> > > > + */
> > >
> > > Paul, so CONFIG_PREEMPTION is affected but CONFIG_PREEMPT_RT is not.
> > > Why does RT have more scheduling points?
> >
> > In RT, isn't BH-disabled code preemptible?  But yes, this would not help
> > RCU Tasks.
Yes, it is but why does it matter? This is used in the NAPI thread which
fully preemptible and does cond_resched(). This should be enough.

> By "more chance to invoke schedule()", my thought was that
> cond_resched becomes no op on RT or PREEMPT kernel. So it will not
> call __schedule(SM_PEREEMPT), which clears the NEED_RESCHED flag. On a
It will nop cond_resched(), correct. However once something sends
NEED_RESCHED then the receiver of this flag will __schedule(SM_PEREEMPT)
as soon as possible. That is either because the scheduler sends an IPI
and the CPU will do it in the irq-exit path _or_ the thread does
preempt_enable() (which includes local_bh_enable()) and the counter hits
zero at which point the same context switch happens.

Therefore I don't see a difference between CONFIG_PREEMPT and
CONFIG_PREEMPT_RT.

> normal irq exit like timer, when NEED_RESCHED is on,
> schedule()/__schedule(0) can be called time by time then.

This I can not parse. __schedule(0) means the task gives up on its own
and goes to sleep. This does not happen for the NAPI-thread loop,
kworker loop or any other loop that consumes one work item after the
other and relies on cond_resched() in between.

> __schedule(0) is good for RCU tasks, __schedule(SM_PREEMPT) is not.
Okay and that is why? This means you expect that every thread gives up
on its own which may take some time depending on the workload. This
should not matter.

If I see this right, the only difference is rcu_tasks_classic_qs() and I
didn't figure out yet what it does.

> But I think this code comment does not take into account frequent
> preempt_schedule and irqentry_exit_cond_resched on a PREEMPT kernel.
> When returning to these busy kthreads, irqentry_exit_cond_resched is
> in fact called now, not schedule(). So likely __schedule(PREEMPT) is
> still called frequently, or even more frequently. So the code comment
> looks incorrect on the RT argument part. We probably should remove the
> "IS_ENABLED" condition really. Paul and Sebastian, does this sound
> reasonable to you?

Can you walk me through it? Why is it so important for a task to give up
voluntary? There is something wrong here with how RCU tasks works.
We want to get rid of the sprinkled cond_resched(). This looks like a
another variant of it that might be required in places with no
explanation except it takes too long. 

> Yan

Sebastian

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ