Message-ID: <2a0d52a5-5c28-498a-8df7-789f020e36ed@paulmck-laptop>
Date:   Fri, 27 Oct 2023 14:23:56 -0700
From:   "Paul E. McKenney" <paulmck@...nel.org>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Frederic Weisbecker <frederic@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Boqun Feng <boqun.feng@...il.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Josh Triplett <josh@...htriplett.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        Neeraj Upadhyay <neeraj.upadhyay@....com>,
        Steven Rostedt <rostedt@...dmis.org>,
        Uladzislau Rezki <urezki@...il.com>, rcu <rcu@...r.kernel.org>,
        Zqiang <qiang.zhang1211@...il.com>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>
Subject: Re: [PATCH 2/4] rcu/tasks: Handle new PF_IDLE semantics

On Fri, Oct 27, 2023 at 09:20:26PM +0200, Peter Zijlstra wrote:
> On Fri, Oct 27, 2023 at 04:40:48PM +0200, Frederic Weisbecker wrote:
> 
> > +	/* Has the task been seen voluntarily sleeping? */
> > +	if (!READ_ONCE(t->on_rq))
> > +		return false;
> 
> > -	if (t != current && READ_ONCE(t->on_rq) && !is_idle_task(t)) {
> 
> AFAICT this ->on_rq usage is outside of scheduler locks and that
> READ_ONCE isn't going to help much.
> 
> Obviously a pre-existing issue, and I suppose all it cares about is
> seeing a 0 or not, irrespective of the races, but urgh..

The trick is that RCU Tasks only needs to spot a task voluntarily blocked
once at any point in the grace period.  The beginning and end of the
grace-period process have full barriers, so if this code sees t->on_rq
equal to zero, we know that the task was voluntarily blocked at some
point during the grace period, as required.
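
For concreteness, here is a minimal userspace sketch of that argument. It is
not the kernel code: struct task, gp_begin() and gp_scan() are hypothetical
names, C11 atomics stand in for READ_ONCE() and the kernel's full barriers,
and it assumes that a zero ->on_rq means the task is voluntarily blocked.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for task_struct, reduced to what the scan needs. */
struct task {
	atomic_int on_rq;	/* assumed: 0 means voluntarily blocked */
	bool holdout;		/* still holding up the grace period? */
};

/* Start of the grace period: full barrier, then everyone is a holdout. */
static void gp_begin(struct task *tasks, size_t n)
{
	assert(tasks != NULL);
	atomic_thread_fence(memory_order_seq_cst);
	for (size_t i = 0; i < n; i++)
		tasks[i].holdout = true;
}

/*
 * One lockless pass over the holdouts.  A single racy observation of
 * on_rq == 0 is enough to retire a task, because the observation falls
 * between the barrier in gp_begin() and the one below.
 * Returns the number of tasks still being waited on.
 */
static size_t gp_scan(struct task *tasks, size_t n)
{
	size_t remaining = 0;

	for (size_t i = 0; i < n; i++) {
		if (!tasks[i].holdout)
			continue;
		/* Analogous to READ_ONCE(t->on_rq): no scheduler lock held. */
		if (atomic_load_explicit(&tasks[i].on_rq,
					 memory_order_relaxed) == 0)
			tasks[i].holdout = false;
		else
			remaining++;
	}
	atomic_thread_fence(memory_order_seq_cst);
	return remaining;
}
```

Note that a task retired by one pass stays retired even if it is runnable
again by the next pass, which is why the race on ->on_rq is harmless here.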

In theory, we could acquire a scheduler lock, but in practice this would
cause CPU-latency problems at a certain set of large datacenters, and
for once, not the datacenters operated by my employer.

In theory, we could make separate lists of tasks that we need to wait on,
thus avoiding the need to scan the full task list, but in practice this
would require a synchronized linked-list operation on every voluntary
context switch, both in and out.

In theory, the task list could be sharded, so that it could be scanned
incrementally, but in practice, this is a bit non-trivial.  Though this
particular use case doesn't care about new tasks, so it could live with
something simpler than would be required for certain types of signal
delivery.
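
A rough illustration of the sharding idea, purely as a single-threaded
userspace sketch.  The names (struct sharded_tasks, shard_add(),
scan_one_shard(), NR_SHARDS) and the hash-by-pid placement are my
assumptions, not anything in the kernel, and the per-shard locking that
a real version would need is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SHARDS 4	/* hypothetical shard count */

/* Hypothetical task stand-in; "next" links tasks within one shard. */
struct task {
	int pid;
	bool on_rq;
	bool holdout;
	struct task *next;
};

struct sharded_tasks {
	struct task *shard[NR_SHARDS];
	size_t cursor;		/* next shard the scanner will visit */
};

/* Place a task on the shard selected by its pid. */
static void shard_add(struct sharded_tasks *st, struct task *t)
{
	size_t i = (size_t)t->pid % NR_SHARDS;

	t->next = st->shard[i];
	st->shard[i] = t;
}

/*
 * Scan a single shard, then advance the cursor, so that each call does
 * a bounded amount of work instead of walking the full task list.
 * Returns the number of holdouts left in the shard just scanned.
 */
static size_t scan_one_shard(struct sharded_tasks *st)
{
	size_t left = 0;

	assert(st != NULL);
	for (struct task *t = st->shard[st->cursor]; t; t = t->next) {
		if (t->holdout && !t->on_rq)
			t->holdout = false;	/* seen blocked once: done */
		if (t->holdout)
			left++;
	}
	st->cursor = (st->cursor + 1) % NR_SHARDS;
	return left;
}
```

Tasks added to a shard after the scanner has passed it are simply not
seen until the next sweep, which is tolerable only because this use case
does not care about new tasks.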

In theory, we could place rcu_segcblist-like mid pointers into the
task list, so that scans could restart from any mid pointer.  Care is
required because the mid pointers would likely need to be recycled as
new tasks are added.  Plus care is needed because it has been a good
long time since I have looked at the code managing the tasks list,
and I am probably woefully out of date on how it all works.
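
To make the mid-pointer idea a bit more concrete, here is a toy userspace
sketch.  Everything in it is an assumption for illustration: struct midlist,
NMID, midlist_init() and scan_segment() are made-up names, the list is
static, and the hard part (recycling mid pointers as tasks come and go) is
not attempted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NMID 4	/* hypothetical number of mid pointers */

struct node {
	int id;
	bool blocked;
	struct node *next;
};

/* One singly linked list plus pointers to the first node of each segment. */
struct midlist {
	struct node *head;
	struct node *mid[NMID];	/* NULL if the segment does not exist */
};

/* Link nodes[0..n-1] in order and drop a mid pointer every "stride" nodes. */
static void midlist_init(struct midlist *l, struct node *nodes,
			 size_t n, size_t stride)
{
	assert(l != NULL && nodes != NULL && stride > 0);
	l->head = n ? &nodes[0] : NULL;
	for (size_t s = 0; s < NMID; s++)
		l->mid[s] = NULL;
	for (size_t i = 0; i < n; i++) {
		nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
		if (i % stride == 0 && i / stride < NMID)
			l->mid[i / stride] = &nodes[i];
	}
}

/*
 * Scan only segment "seg": start at its mid pointer and stop at the
 * next mid pointer (or at the end of the list).  Returns the number of
 * blocked nodes seen in that segment.
 */
static size_t scan_segment(const struct midlist *l, size_t seg)
{
	const struct node *end;
	size_t seen = 0;

	assert(l != NULL && seg < NMID);
	end = (seg + 1 < NMID) ? l->mid[seg + 1] : NULL;
	for (const struct node *p = l->mid[seg]; p && p != end; p = p->next)
		if (p->blocked)
			seen++;
	return seen;
}
```

Each scan_segment() call can be issued independently, so a scan that has
to back off can later resume from the mid pointer of the segment it was in
rather than from the head of the list.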

So, is there a better way?

							Thanx, Paul
