[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20120414161953.GA18140@linux.vnet.ibm.com>
Date: Sat, 14 Apr 2012 09:19:53 -0700
From: "Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
To: linux-kernel@...r.kernel.org
Cc: mingo@...e.hu, laijs@...fujitsu.com, dipankar@...ibm.com,
akpm@...ux-foundation.org, mathieu.desnoyers@...icios.com,
josh@...htriplett.org, niv@...ibm.com, tglx@...utronix.de,
peterz@...radead.org, rostedt@...dmis.org, Valdis.Kletnieks@...edu,
dhowells@...hat.com, eric.dumazet@...il.com, darren@...art.com,
fweisbec@...il.com, patches@...aro.org,
torvalds@...ux-foundation.org
Subject: [PATCH RFC 0/7] rcu: v2 Inlinable preemptible rcu_read_lock() and
rcu_read_unlock()
Hello!
This series is version two of the inlinable versions of preemptible
RCU's __rcu_read_lock() and __rcu_read_unlock(). The first version may
be found at https://lkml.org/lkml/2012/3/25/94. The individual commits
in this new series are as follows:
1. Move preemptible RCU's hook in the scheduler from the common
RCU scheduler-entry hook to just before the scheduler's call
to switch_to. This reduces overhead in the case where the
scheduler is called but does not switch and also sets the
stage for saving and restoring the per-CPU variables needed
for inlining.
2. Create the per-CPU variables and rename rcu_read_unlock_special()
to avoid name conflict.
3. Make exit_rcu() use a more precise method of checking the need
for exit-time RCU-related cleanup, and consolidate the two
identical versions of exit_rcu() into one place.
4. Make __rcu_read_lock() and __rcu_read_unlock() use the per-CPU
variables, but leave them out of line for the moment. This
requires adding a second preemptible-RCU hook in the scheduler
to restore the values of the per-CPU variables.
5. Silence bogus copy_to_user() build errors that seem to be triggered
by differences in gcc's inlining decisions when __rcu_read_lock()
becomes inlinable. Apparently, copy_to_user() needs to be inlined
in order to function correctly? Hmmm, sort of like kfree_rcu().
6. Inline __rcu_read_lock().
7. Inline __rcu_read_unlock().
With these changes, the 32-bit x86 gcc compiler compiles this:
void rcu_read_lock_code(void)
{
rcu_read_lock();
}
to this:
000000d0 <rcu_read_lock_code>:
d0: 64 ff 05 00 00 00 00 incl %fs:0x0
d7: c3 ret
d8: 90 nop
d9: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi
It also compiles this:
void rcu_read_unlock_code(void)
{
rcu_read_unlock();
}
to this:
000000e0 <rcu_read_unlock_code>:
e0: 64 a1 00 00 00 00 mov %fs:0x0,%eax
e6: 83 f8 01 cmp $0x1,%eax
e9: 74 0d je f8 <rcu_read_unlock_code+0x18>
eb: 64 ff 0d 00 00 00 00 decl %fs:0x0
f2: c3 ret
f3: 90 nop
f4: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
f8: 64 c7 05 00 00 00 00 movl $0x80000000,%fs:0x0
ff: 00 00 00 80
103: 64 a1 00 00 00 00 mov %fs:0x0,%eax
109: 85 c0 test %eax,%eax
10b: 75 0c jne 119 <rcu_read_unlock_code+0x39>
10d: 64 c7 05 00 00 00 00 movl $0x0,%fs:0x0
114: 00 00 00 00
118: c3 ret
119: 8d b4 26 00 00 00 00 lea 0x0(%esi,%eiz,1),%esi
120: e8 fc ff ff ff call 121 <rcu_read_unlock_code+0x41>
125: eb e6 jmp 10d <rcu_read_unlock_code+0x2d>
It is therefore not at all clear to me that the final patch in this
series is worthwhile. Unless someone comes up with a good reason to
keep it, I will drop it. The only possible justification I can see is
that gcc could (in theory, anyway) drop dead code in the case of nested
RCU read-side critical sections (everything from address f3 onwards),
but this just doesn't cut it for me at the moment. I could also imagine
having the inlined portion contain only the nesting check and decrement,
along with a call to an out-of-line function that does the rest, but
this looks to me to bloat the code for no good reason.
Thoughts?
Thanx, Paul
arch/um/drivers/mconsole_kern.c | 3
b/arch/um/drivers/mconsole_kern.c | 1
b/fs/binfmt_misc.c | 4 -
b/include/linux/init_task.h | 4 -
b/include/linux/rcupdate.h | 1
b/include/linux/rcutiny.h | 6 -
b/include/linux/rcutree.h | 12 ---
b/include/linux/sched.h | 10 +++
b/kernel/rcu.h | 4 +
b/kernel/rcupdate.c | 5 +
b/kernel/rcutiny_plugin.h | 10 +--
b/kernel/rcutree.c | 1
b/kernel/rcutree.h | 1
b/kernel/rcutree_plugin.h | 14 ----
b/kernel/sched/core.c | 1
include/linux/rcupdate.h | 72 ++++++++++++++++++++-
include/linux/rcutiny.h | 5 -
include/linux/sched.h | 92 +++++++++++++++++++++++++--
kernel/rcu.h | 4 -
kernel/rcupdate.c | 126 ++++++++++++++++++++++----------------
kernel/rcutiny_plugin.h | 123 +++++++------------------------------
kernel/rcutree_plugin.h | 114 ++++++++--------------------------
kernel/sched/core.c | 3
23 files changed, 321 insertions(+), 295 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists