[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20191108173553.lxsdic6wa4y3ifsr@linutronix.de>
Date: Fri, 8 Nov 2019 18:35:53 +0100
From: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To: Dennis Zhou <dennis@...nel.org>
Cc: linux-kernel@...r.kernel.org, Tejun Heo <tj@...nel.org>,
Christoph Lameter <cl@...ux.com>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <peterz@...radead.org>,
"Paul E. McKenney" <paulmck@...nel.org>
Subject: [PATCH v2] percpu-refcount: Use normal instead of RCU-sched"
This is a revert of commit
a4244454df129 ("percpu-refcount: use RCU-sched insted of normal RCU")
which claims the only reason for using RCU-sched is
"rcu_read_[un]lock() … are slightly more expensive than preempt_disable/enable()"
and
"As the RCU critical sections are extremely short, using sched-RCU
shouldn't have any latency implications."
The problem with using RCU-sched here is that it disables preemption and
the release callback (called from percpu_ref_put_many()) must not
acquire any sleeping locks like spinlock_t. This breaks PREEMPT_RT
because some of the users acquire spinlock_t locks in their callbacks.
Using rcu_read_lock() on PREEMPTION=n kernels is not any different
compared to rcu_read_lock_sched(). On PREEMPTION=y kernels there are
already performance issues due to additional preemption points.
Looking at the code, the rcu_read_lock() is just an increment and unlock
is almost just a decrement unless there is something special to do. Both
are functions while disabling preemption is inlined.
Doing a small benchmark, the minimal amount of time required was mostly
the same. The average time required was higher due to the higher MAX
value (which could be preemption). With DEBUG_PREEMPT=y it is
rcu_read_lock_sched() that takes a little longer due to the additional
debug code.
Convert back to normal RCU.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
---
On 2019-11-07 12:36:53 [-0500], Dennis Zhou wrote:
> > some RCU section here invoke callbacks which acquire spinlock_t locks.
> > This does not work on RT with disabled preemption.
> >
>
> Yeah, so adding a bit in the commit message about why it's an issue for
> RT kernels with disabled preemption as I don't believe this is an issue
> for non-RT kernels.
I realized that I had partly in the commit message so I rewrote the
second chapter hopefully covering it all now more explicit.
v1…v2: Slightly rewriting the second paragraph regarding RT
implications.
include/linux/percpu-refcount.h | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 7aef0abc194a2..390031e816dcd 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -186,14 +186,14 @@ static inline void percpu_ref_get_many(struct percpu_ref *ref, unsigned long nr)
{
unsigned long __percpu *percpu_count;
- rcu_read_lock_sched();
+ rcu_read_lock();
if (__ref_is_percpu(ref, &percpu_count))
this_cpu_add(*percpu_count, nr);
else
atomic_long_add(nr, &ref->count);
- rcu_read_unlock_sched();
+ rcu_read_unlock();
}
/**
@@ -223,7 +223,7 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
unsigned long __percpu *percpu_count;
bool ret;
- rcu_read_lock_sched();
+ rcu_read_lock();
if (__ref_is_percpu(ref, &percpu_count)) {
this_cpu_inc(*percpu_count);
@@ -232,7 +232,7 @@ static inline bool percpu_ref_tryget(struct percpu_ref *ref)
ret = atomic_long_inc_not_zero(&ref->count);
}
- rcu_read_unlock_sched();
+ rcu_read_unlock();
return ret;
}
@@ -257,7 +257,7 @@ static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
unsigned long __percpu *percpu_count;
bool ret = false;
- rcu_read_lock_sched();
+ rcu_read_lock();
if (__ref_is_percpu(ref, &percpu_count)) {
this_cpu_inc(*percpu_count);
@@ -266,7 +266,7 @@ static inline bool percpu_ref_tryget_live(struct percpu_ref *ref)
ret = atomic_long_inc_not_zero(&ref->count);
}
- rcu_read_unlock_sched();
+ rcu_read_unlock();
return ret;
}
@@ -285,14 +285,14 @@ static inline void percpu_ref_put_many(struct percpu_ref *ref, unsigned long nr)
{
unsigned long __percpu *percpu_count;
- rcu_read_lock_sched();
+ rcu_read_lock();
if (__ref_is_percpu(ref, &percpu_count))
this_cpu_sub(*percpu_count, nr);
else if (unlikely(atomic_long_sub_and_test(nr, &ref->count)))
ref->release(ref);
- rcu_read_unlock_sched();
+ rcu_read_unlock();
}
/**
--
2.24.0
Powered by blists - more mailing lists