Message-Id: <1430428337-16802-1-git-send-email-Waiman.Long@hp.com>
Date: Thu, 30 Apr 2015 17:12:15 -0400
From: Waiman Long <Waiman.Long@...com>
To: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>
Cc: linux-kernel@...r.kernel.org, Jason Low <jason.low2@...com>,
Davidlohr Bueso <dave@...olabs.net>,
Scott J Norton <scott.norton@...com>,
Douglas Hatch <doug.hatch@...com>,
Waiman Long <Waiman.Long@...com>
Subject: [PATCH v4 0/2] locking/rwsem: optimize rwsem_wakeup()
v3->v4:
 - Break out the active writer check into a separate patch and move
   it from __rwsem_do_wake() to rwsem_wake().
 - Use smp_rmb() instead of the incorrect smp_mb__after_atomic() as
   suggested by PeterZ.

v2->v3:
 - Fix errors in commit log.

v1->v2:
 - Add a memory barrier before calling spin_trylock for proper memory
   ordering.
This patch set aims to reduce spinlock contention on the wait_lock
caused by excessive activity in the rwsem_wake code path. This, in
turn, reduces up_write/up_read latency and improves performance when
the rwsem is heavily contended.
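
For readers who want a concrete picture of the idea, the following is a
minimal userspace sketch, not the patch itself. It assumes the wakeup
path can skip its work when an optimistic spinner is present and the
wait_lock is busy; that condition is my reading of the approach, not
something spelled out in this cover letter. All names (toy_rwsem,
toy_wake, has_spinner, ...) are hypothetical, and the C11 acquire fence
only roughly stands in for the kernel's smp_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for an rwsem; NOT the kernel struct. */
struct toy_rwsem {
	atomic_bool wait_lock;   /* stand-in for sem->wait_lock */
	atomic_bool has_spinner; /* assumed "a spinner is present" flag */
	int wakeups_done;        /* protected by wait_lock */
};

static void toy_init(struct toy_rwsem *sem)
{
	atomic_init(&sem->wait_lock, false);
	atomic_init(&sem->has_spinner, false);
	sem->wakeups_done = 0;
}

/* Try once to take the lock; true on success. */
static bool toy_trylock(struct toy_rwsem *sem)
{
	return !atomic_exchange_explicit(&sem->wait_lock, true,
					 memory_order_acquire);
}

/* Spin until the lock is taken (the contended path being avoided). */
static void toy_lock(struct toy_rwsem *sem)
{
	while (!toy_trylock(sem))
		;
}

static void toy_unlock(struct toy_rwsem *sem)
{
	atomic_store_explicit(&sem->wait_lock, false, memory_order_release);
}

/*
 * Wakeup path sketch. Returns true if the wakeup work was done, false
 * if it was skipped because a spinner was seen and the lock was busy.
 */
static bool toy_wake(struct toy_rwsem *sem)
{
	assert(sem != NULL);

	if (atomic_load_explicit(&sem->has_spinner, memory_order_relaxed)) {
		/* Read barrier before the trylock (rough smp_rmb() analogue). */
		atomic_thread_fence(memory_order_acquire);
		if (!toy_trylock(sem))
			return false; /* do not queue up on the lock */
	} else {
		toy_lock(sem);
	}

	sem->wakeups_done++; /* placeholder for waking the first waiter */
	toy_unlock(sem);
	return true;
}
```

The point of the sketch is only the control flow: the unconditional
lock acquisition becomes a trylock on the path where giving up is
assumed to be safe, so callers stop piling up on wait_lock.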
On an 8-socket Westmere-EX server (80 cores, HT off), running AIM7's
high_systime workload (1000 users) on a vanilla 4.0 kernel produced
the following perf profile for spinlock contention:
   9.23%  reaim  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
              |--97.39%-- rwsem_wake
              |--0.69%-- try_to_wake_up
              |--0.52%-- release_pages
               --1.40%-- [...]

   1.70%  reaim  [kernel.kallsyms]  [k] _raw_spin_lock_irq
              |--96.61%-- rwsem_down_write_failed
              |--2.03%-- __schedule
              |--0.50%-- run_timer_softirq
               --0.86%-- [...]
Here the contended rwsems are the mmap_sem (mm_struct) and the
i_mmap_rwsem (address_space) with mostly write locking. With a
patched 4.0 kernel, the perf profile became:
   1.87%  reaim  [kernel.kallsyms]  [k] _raw_spin_lock_irqsave
              |--87.64%-- rwsem_wake
              |--2.80%-- release_pages
              |--2.56%-- try_to_wake_up
              |--1.10%-- __wake_up
              |--1.06%-- pagevec_lru_move_fn
              |--0.93%-- prepare_to_wait_exclusive
              |--0.71%-- free_pid
              |--0.58%-- get_page_from_freelist
              |--0.57%-- add_device_randomness
               --2.04%-- [...]

   0.80%  reaim  [kernel.kallsyms]  [k] _raw_spin_lock_irq
              |--92.49%-- rwsem_down_write_failed
              |--4.24%-- __schedule
              |--1.37%-- run_timer_softirq
               --1.91%-- [...]
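
To compare the two profiles directly, it helps to turn each caller line
into an absolute share of total samples. The small helper below is an
illustrative sketch; it assumes the caller percentages are relative to
the parent symbol's overhead (the children of each entry sum to about
100%, which supports that reading). The function names are
hypothetical. On that assumption, rwsem_wake's share of
_raw_spin_lock_irqsave time falls from roughly 8.99% to roughly 1.64%
of all samples.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Absolute share of total samples for one caller, given the parent
 * symbol's overhead and the caller's percentage relative to the parent.
 */
static double absolute_share(double parent_pct, double child_pct)
{
	assert(parent_pct >= 0.0 && parent_pct <= 100.0);
	assert(child_pct >= 0.0 && child_pct <= 100.0);
	return parent_pct * child_pct / 100.0;
}

/* Print the before/after shares using the numbers quoted above. */
static void print_profile_shares(void)
{
	printf("rwsem_wake via _raw_spin_lock_irqsave: %.2f%% -> %.2f%%\n",
	       absolute_share(9.23, 97.39), absolute_share(1.87, 87.64));
	printf("rwsem_down_write_failed via _raw_spin_lock_irq: "
	       "%.2f%% -> %.2f%%\n",
	       absolute_share(1.70, 96.61), absolute_share(0.80, 92.49));
}
```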
The table below shows the % improvement in throughput (1100-2000 users)
across the various AIM7 workloads:
  Workload        % increase in throughput
  --------        ------------------------
  custom                   3.8%
  five-sec                 3.5%
  fserver                  4.1%
  high_systime            22.2%
  shared                   2.1%
  short                   10.1%
Waiman Long (2):
locking/rwsem: reduce spinlock contention in wakeup after
up_read/up_write
locking/rwsem: check for active writer before wakeup
include/linux/osq_lock.h | 5 +++
kernel/locking/rwsem-xadd.c | 65 +++++++++++++++++++++++++++++++++++++++++-
2 files changed, 68 insertions(+), 2 deletions(-)