Message-ID: <20110615003600.GA9602@tassilo.jf.intel.com>
Date: Tue, 14 Jun 2011 17:36:00 -0700
From: Andi Kleen <ak@...ux.intel.com>
To: Tim Chen <tim.c.chen@...ux.intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Hugh Dickins <hughd@...gle.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Benjamin Herrenschmidt <benh@...nel.crashing.org>,
David Miller <davem@...emloft.net>,
Martin Schwidefsky <schwidefsky@...ibm.com>,
Russell King <rmk@....linux.org.uk>,
Paul Mundt <lethal@...ux-sh.org>,
Jeff Dike <jdike@...toit.com>,
Richard Weinberger <richard@....at>,
Tony Luck <tony.luck@...el.com>,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Mel Gorman <mel@....ul.ie>, Nick Piggin <npiggin@...nel.dk>,
Namhyung Kim <namhyung@...il.com>, shaohua.li@...el.com,
alex.shi@...el.com, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, "Rafael J. Wysocki" <rjw@...k.pl>
Subject: Re: REGRESSION: Performance regressions from switching
anon_vma->lock to mutex
> On 2.6.39, the contention of anon_vma->lock occupies 3.25% of cpu.
> However, after the switch of the lock to mutex on 3.0-rc2, the mutex
> acquisition jumps to 18.6% of cpu. This seems to be the main cause of
> the 52% throughput regression.
>
This patch makes the mutex in Tim's workload take a bit less CPU time
(about 4% lower), but it doesn't really fix the regression. When spinning
on a value it's always better to read it first before attempting to write
it: a cmpxchg pulls the cache line in exclusive state even when it fails,
while a plain read leaves the line shared, so checking first saves
expensive operations on the interconnect.
So it's not really a fix for this, but it may be a slight improvement for
other workloads.
-Andi
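
As an aside, the general read-before-cmpxchg pattern looks like this in
plain C11 atomics (only a sketch for illustration, not kernel code; the
try_take() helper and the 1 = unlocked / 0 = locked convention are made
up here just to mirror the mutex count field the patch changes):

#include <stdatomic.h>
#include <stdbool.h>

/* 1 == unlocked, 0 == locked, mirroring mutex->count in the patch below */
static atomic_int count = 1;

static bool try_take(void)
{
	int one = 1;

	/*
	 * Cheap read first: while someone else holds the lock this keeps
	 * the cache line in shared state instead of bouncing it around
	 * in exclusive state on every failed cmpxchg.
	 */
	if (atomic_load_explicit(&count, memory_order_relaxed) != 1)
		return false;

	/* Only now attempt the atomic read-modify-write. */
	return atomic_compare_exchange_strong(&count, &one, 0);
}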
From 34d4c1e579b3dfbc9a01967185835f5829bd52f0 Mon Sep 17 00:00:00 2001
From: Andi Kleen <ak@...ux.intel.com>
Date: Tue, 14 Jun 2011 16:27:54 -0700
Subject: [PATCH] mutex: while spinning read count before attempting cmpxchg
Under heavy contention it's better to read the lock count first before
attempting an atomic operation that goes out on the interconnect.
This gives a few percent improvement in mutex CPU time under heavy
contention and likely saves some power too.
Signed-off-by: Andi Kleen <ak@...ux.intel.com>
---
kernel/mutex.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/kernel/mutex.c b/kernel/mutex.c
index d607ed5..1abffa9 100644
--- a/kernel/mutex.c
+++ b/kernel/mutex.c
@@ -170,7 +170,8 @@ __mutex_lock_common(struct mutex *lock, long state, unsigned int subclass,
 		if (owner && !mutex_spin_on_owner(lock, owner))
 			break;
 
-		if (atomic_cmpxchg(&lock->count, 1, 0) == 1) {
+		if (atomic_read(&lock->count) == 1 &&
+		    atomic_cmpxchg(&lock->count, 1, 0) == 1) {
 			lock_acquired(&lock->dep_map, ip);
 			mutex_set_owner(lock);
 			preempt_enable();
--
1.7.4.4