Message-ID: <2062731308.28584.1584294305768.JavaMail.zimbra@efficios.com>
Date: Sun, 15 Mar 2020 13:45:05 -0400 (EDT)
From: Mathieu Desnoyers <mathieu.desnoyers@...icios.com>
To: paulmck <paulmck@...nel.org>
Cc: Frederic Weisbecker <frederic@...nel.org>,
rcu <rcu@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
kernel-team <kernel-team@...com>, Ingo Molnar <mingo@...nel.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
dipankar <dipankar@...ibm.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Josh Triplett <josh@...htriplett.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
rostedt <rostedt@...dmis.org>,
David Howells <dhowells@...hat.com>,
Eric Dumazet <edumazet@...gle.com>,
fweisbec <fweisbec@...il.com>, Oleg Nesterov <oleg@...hat.com>,
"Joel Fernandes, Google" <joel@...lfernandes.org>
Subject: Re: [PATCH RFC tip/core/rcu 0/16] Prototype RCU usable from idle,
exception, offline
----- On Mar 13, 2020, at 11:42 AM, paulmck paulmck@...nel.org wrote:
> On Fri, Mar 13, 2020 at 03:41:46PM +0100, Frederic Weisbecker wrote:
>> On Thu, Mar 12, 2020 at 11:16:18AM -0700, Paul E. McKenney wrote:
>> > Hello!
>> >
>> > This series provides two variants of Tasks RCU, a rude variant inspired
>> > by Steven Rostedt's use of schedule_on_each_cpu(), and a tracing variant
>> > requested by the BPF folks and perhaps also of use for other tracing
>> > use cases.
>> >
>> > The tracing variant has explicit read-side markers to permit finite grace
>> > periods even given in-kernel loops in PREEMPT=n builds. It also protects
>> > code in the idle loop, on exception entry/exit paths, and on the various
>> > CPU-hotplug online/offline code paths, thus having protection properties
>> > similar to SRCU. However, unlike SRCU, this variant avoids expensive
>> > instructions in the read-side primitives, thus having read-side overhead
>> > similar to that of preemptible RCU.
>> >
>> > There are of course downsides. The grace-period code can send IPIs to
>> > CPUs, even when those CPUs are in the idle loop or in nohz_full userspace.
>> > It is necessary to scan the full tasklist, much as for Tasks RCU. There
>> > is a single callback queue guarded by a single lock, again, much as for
>> > Tasks RCU. If needed, these downsides can be at least partially remedied
>>
>> So what we trade to fix the issues we are having with tracing against extended
>> grace periods, we lose in CPU isolation. That worries me a bit as tracing can
>> be thoroughly used with nohz_full and CPU isolation.
>
> First, disturbing nohz_full CPUs can be avoided by the sysadm simply
> refusing to remove tracepoints while sensitive applications are running
> on nohz_full CPUs.

I doubt this approach will survive real life.

>
> Second, for non-CPU-bound real-time programs with mostly-idle CPUs,
> I should be able to decrease the likelihood of sending IPIs pretty much
> to zero.
>
> Or am I missing something here?

I would recommend considering the following alternative for this tracing-rcu
flavor:

- For all CPUs which are not nohz_full:
  - Implement a fast RCU read-side which only requires compiler barriers,
  - Send an IPI to each of those CPUs when doing a grace period.
- For all nohz_full CPUs:
  - Dynamically detect CPUs which are nohz_full,
  - Implement a slower RCU read-side with memory barriers,
  - No need to issue any IPI to those CPUs when doing the grace period.

This should cover all use-cases: staying fast for the common case, without
disturbing RT workloads.
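
To make the split above concrete, here is a rough user-space model in C11.
It is only a sketch under stated assumptions, not kernel code and not the
patch series' implementation: every name in it (model_cpu, model_read_lock,
model_read_unlock, model_gp_scan, MODEL_NR_CPUS) is hypothetical, the
"IPI" is merely counted rather than sent, and the nohz_full flag is set by
hand instead of being detected dynamically. It shows the reader choosing
between a compiler-only barrier and a full memory barrier, and the
grace-period scan interrupting only the CPUs that took the cheap path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical size of the toy model */

/* Hypothetical per-CPU state; none of these names exist in the kernel. */
struct model_cpu {
	bool nohz_full;		/* set by hand here; detected dynamically in the proposal */
	atomic_int nesting;	/* depth of read-side critical sections */
};

static struct model_cpu model_cpus[MODEL_NR_CPUS];

/* Order the nesting update against the critical section. */
static void model_reader_barrier(const struct model_cpu *c)
{
	if (c->nohz_full)
		atomic_thread_fence(memory_order_seq_cst);	/* real memory barrier */
	else
		atomic_signal_fence(memory_order_seq_cst);	/* compiler barrier only */
}

static void model_read_lock(int cpu)
{
	struct model_cpu *c;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	c = &model_cpus[cpu];
	atomic_fetch_add_explicit(&c->nesting, 1, memory_order_relaxed);
	model_reader_barrier(c);
}

static void model_read_unlock(int cpu)
{
	struct model_cpu *c;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	c = &model_cpus[cpu];
	model_reader_barrier(c);
	atomic_fetch_sub_explicit(&c->nesting, 1, memory_order_relaxed);
}

/*
 * One grace-period scan. Returns how many CPUs would need an IPI (all
 * CPUs that are not nohz_full, since their readers used no memory
 * barrier). CPUs that are nohz_full are never interrupted: their
 * nesting count is read remotely, and *nohz_readers reports how many of
 * them are still inside a reader and must be polled again later.
 */
static int model_gp_scan(int *nohz_readers)
{
	int cpu, ipis = 0;

	*nohz_readers = 0;
	/* Pairs with the full fences executed by nohz_full readers. */
	atomic_thread_fence(memory_order_seq_cst);
	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		struct model_cpu *c = &model_cpus[cpu];

		if (!c->nohz_full) {
			ipis++;		/* would send an IPI here */
			continue;
		}
		if (atomic_load_explicit(&c->nesting, memory_order_relaxed) > 0)
			(*nohz_readers)++;
	}
	return ipis;
}
```

With two of the four model CPUs marked nohz_full, a scan counts two IPIs
regardless of reader activity, and the nohz_full CPUs are only observed,
never interrupted. The cost moves to the nohz_full read side, which now
pays for a full fence on every lock and unlock.
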
Thoughts?
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com