Message-Id: <200703152013.17178.dada1@cosmosbay.com>
Date: Thu, 15 Mar 2007 20:13:16 +0100
From: Eric Dumazet <dada1@...mosbay.com>
To: Nick Piggin <nickpiggin@...oo.com.au>,
Ulrich Drepper <drepper@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Ingo Molnar <mingo@...e.hu>
Cc: Andi Kleen <ak@...e.de>,
Ravikiran G Thirumalai <kiran@...lex86.org>,
"Shai Fultheim (Shai@...lex86.org)" <shai@...lex86.org>,
pravin b shelar <pravin.shelar@...softinc.com>,
linux-kernel@...r.kernel.org
Subject: [PATCH 1/3] FUTEX : introduce PROCESS_PRIVATE semantic
[PATCH 1/3] FUTEX : introduce PROCESS_PRIVATE semantic
This first patch introduces XXX_PRIVATE futex operations.
When a process uses a XXX_PRIVATE futex primitive, the kernel can avoid
taking a read lock on mmap_sem to find the vma that contains the futex
and learn whether it is associated with an inode (shared) or with the mm
(private to the process).
We also avoid taking a reference on the inode or the mm that was found.
Even though mmap_sem is a rw_semaphore, down_read()/up_read() do atomic
ops on mmap_sem and dirty its cache line. Two costs are avoided :
- Taking mmap_sem : lots of cache line ping pongs on SMP configurations.
  mmap_sem is also extensively used by mm code (page faults, mmap()/munmap()),
  so highly threaded processes might suffer from mmap_sem contention.
  mmap_sem is also used by oprofile code. Enabling oprofile hurts threaded
  programs because of contention on the mmap_sem cache line.
- Using an atomic_inc()/atomic_dec() on the inode ref counter or mm ref counter :
  It's also a cache line ping pong on SMP. It also increases mmap_sem hold time
  because of cache misses.
This first patch is possible because, for one process using
PTHREAD_PROCESS_PRIVATE futexes, we only need to distinguish futexes by their
virtual address, whatever the underlying mm storage is. The case of multiple
virtual addresses mapped to the same physical address is just insane : "Don't
do it on PROCESS_PRIVATE futexes, please ?"
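
To illustrate why aliased mappings do not work with private futexes, here is a
small userspace model. It is only a sketch: struct model_futex_key and both
helpers are hypothetical names made up for this example, and the layout is
much simpler than whatever kernel/futex.c really uses. It assumes a private
futex is identified by (mm, virtual address) and nothing else.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of a futex key; not the kernel's layout. */
struct model_futex_key {
    const void *owner;  /* private case: the mm of the calling process */
    uintptr_t   where;  /* private case: the futex virtual address */
};

/*
 * Private key: built from the mm and the user address only.
 * No vma lookup is needed, hence no mmap_sem and no inode/mm refcount.
 */
static struct model_futex_key model_private_key(const void *mm, uintptr_t uaddr)
{
    struct model_futex_key key = { .owner = mm, .where = uaddr };
    return key;
}

/* Two operations reach the same wait queue only if their keys match. */
static bool model_key_equal(struct model_futex_key a, struct model_futex_key b)
{
    return a.owner == b.owner && a.where == b.where;
}
```

In this model, two virtual addresses that alias the same physical page give
two different keys, so a waiter sleeping on one alias would never be woken
through the other one.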
If glibc wants to exploit this new infrastructure, it should use the new
_PRIVATE futex subcommands for PTHREAD_PROCESS_PRIVATE futexes, and
be prepared to fall back to the old subcommands on old kernels. Using one
global variable holding either FUTEX_PRIVATE_FLAG or 0 should be OK, so that
only one syscall might fail.
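
The fallback described above could look roughly like the sketch below. It is
not glibc code: futex_private_flag, raw_futex() and futex_private() are
hypothetical names, and it assumes that an old kernel rejects the unknown
subcommand with ENOSYS and that FUTEX_PRIVATE_FLAG comes from <linux/futex.h>.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical global: FUTEX_PRIVATE_FLAG, or 0 once the kernel refused it. */
static int futex_private_flag = FUTEX_PRIVATE_FLAG;

/* Thin wrapper around the raw syscall (no timeout, no second address). */
static long raw_futex(int *uaddr, int op, int val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/*
 * Issue a futex operation on a PTHREAD_PROCESS_PRIVATE futex.
 * Try the _PRIVATE variant first; if the kernel does not know it
 * (assumed to show up as ENOSYS), clear the flag and retry with the
 * old subcommand. Later calls then go straight to the old subcommand.
 */
static long futex_private(int *uaddr, int op, int val)
{
    long ret = raw_futex(uaddr, op | futex_private_flag, val);

    if (ret == -1 && errno == ENOSYS && futex_private_flag != 0) {
        futex_private_flag = 0;          /* old kernel: stop trying */
        ret = raw_futex(uaddr, op, val); /* fall back to shared semantic */
    }
    return ret;
}
```

A caller would use futex_private(&word, FUTEX_WAKE, 1) where it used
FUTEX_WAKE directly before. The plain int global is a simplification: with
several threads racing on the very first call, more than one syscall can fail
before the flag is cleared, which is harmless but worth knowing.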
Compatibility with old applications is preserved : they still hit the
scalability problems, but new applications can fly :)
Note : SHARED futexes can be used by old binaries *and* new binaries,
because both binaries will use the old subcommands.
Note : The vast majority of futexes should be using the PROCESS_PRIVATE semantic,
as this is the default semantic. Almost all applications should benefit
from these changes (new kernel and updated libc).
Signed-off-by: Eric Dumazet <dada1@...mosbay.com>
---
include/linux/futex.h | 12 +
kernel/futex.c | 273 +++++++++++++++++++++++++---------------
2 files changed, 188 insertions(+), 97 deletions(-)
View attachment "futex_p1.patch" of type "text/plain" (20079 bytes)