Message-ID: <1385453551.12603.16.camel@buesod1.americas.hpqcorp.net>
Date: Tue, 26 Nov 2013 00:12:31 -0800
From: Davidlohr Bueso <davidlohr@...com>
To: Thomas Gleixner <tglx@...utronix.de>
Cc: LKML <linux-kernel@...r.kernel.org>, Jason Low <jason.low2@...com>,
Ingo Molnar <mingo@...nel.org>,
Darren Hart <dvhart@...ux.intel.com>,
Peter Zijlstra <peterz@...radead.org>,
Mike Galbraith <efault@....de>, Jeff Mahoney <jeffm@...e.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Scott Norton <scott.norton@...com>,
Tom Vaden <tom.vaden@...com>,
Aswin Chandramouleeswaran <aswin@...com>,
Waiman Long <Waiman.Long@...com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [RFC patch 0/5] futex: Allow lockless empty check of hashbucket
plist in futex_wake()
On Mon, 2013-11-25 at 20:58 +0000, Thomas Gleixner wrote:
> The patch set from Davidlohr [1] tried to attempt the same via an
> atomic counter of waiters in a hash bucket. The atomic counter access
> provided enough serialization for x86 so that a failure is not
> observable in testing, but does not provide any guarantees.
>
> The same can be achieved with a smp_mb() pair including proper
> guarantees for all architectures.
I am becoming hesitant about this approach. The following are some
results, from my quad-core laptop, measuring the latency of nthread
wakeups (1 at a time). In addition, failed wait calls never occur -- so
we don't end up including the (otherwise minimal) overhead of the list
queue+dequeue, only measuring the smp_mb() usage when !empty list never
occurs.
+---------+--------------------+--------+-------------------+--------+----------+
| threads | baseline time (ms) | stddev | patched time (ms) | stddev | overhead |
+---------+--------------------+--------+-------------------+--------+----------+
|     512 |             4.2410 | 0.9762 |           12.3660 | 5.1020 | +191.58% |
|     256 |             2.7750 | 0.3997 |            7.0220 | 2.9436 | +153.04% |
|     128 |             1.4910 | 0.4188 |            3.7430 | 0.8223 | +151.03% |
|      64 |             0.8970 | 0.3455 |            2.5570 | 0.3710 | +185.06% |
|      32 |             0.3620 | 0.2242 |            1.1300 | 0.4716 | +212.15% |
+---------+--------------------+--------+-------------------+--------+----------+
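For reference, the overhead column appears to be the relative increase of
the patched time over the baseline. A minimal sketch of that arithmetic,
where overhead_pct() is a hypothetical helper name and not part of any
posted code:

```c
#include <assert.h>

/* Relative slowdown of the patched run over the baseline, in percent.
 * Assumption: this is how the "overhead" column was derived. */
double overhead_pct(double baseline_ms, double patched_ms)
{
    assert(baseline_ms > 0.0);
    return (patched_ms - baseline_ms) / baseline_ms * 100.0;
}
```

With the 512-thread row, overhead_pct(4.2410, 12.3660) comes out at roughly
191.58. Some other rows can differ from the table in the last digit,
presumably because the table was computed from unrounded times.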
While the variation in the patched version is considerably larger at
higher nthreads, the overhead is significant in all cases. Now, this is a
very specific program and far from what occurs in the real world, but I
believe it's good data to have when making a future decision about this
kind of approach.
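For anyone wanting to try something similar, here is a minimal sketch of
this kind of measurement. It is not the program used for the numbers
above. It assumes Linux, all waiters blocking on one shared futex word,
and a crude sleep to let them block before timing starts. The names
sys_futex(), waiter() and time_wakeups_ms() are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Shared futex word. It stays 0, so FUTEX_WAIT never fails with EAGAIN. */
static uint32_t futex_word;

/* Thin wrapper around the raw futex syscall (no timeout, no second address). */
static long sys_futex(uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void *waiter(void *arg)
{
    (void)arg;
    /* Block until woken; retry only if a signal interrupted the wait. */
    while (sys_futex(&futex_word, FUTEX_WAIT_PRIVATE, 0) == -1 &&
           errno == EINTR)
        ;
    return NULL;
}

/* Time, in milliseconds, to wake nthreads blocked waiters one at a time. */
double time_wakeups_ms(int nthreads)
{
    pthread_t *tids = malloc(sizeof(*tids) * nthreads);
    struct timespec t0, t1;
    int i, rc;

    assert(tids != NULL);
    for (i = 0; i < nthreads; i++) {
        rc = pthread_create(&tids[i], NULL, waiter, NULL);
        assert(rc == 0);
    }
    usleep(100 * 1000); /* crude: give every waiter time to block */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < nthreads; i++)
        while (sys_futex(&futex_word, FUTEX_WAKE_PRIVATE, 1) != 1)
            ; /* no waiter was queued yet; retry */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    for (i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);
    free(tids);
    return (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6;
}
```

Calling time_wakeups_ms(512) on a baseline and on a patched kernel, and
averaging over several runs, would give numbers comparable in spirit to
the table. Older glibc versions need -pthread at link time.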
Thanks,
Davidlohr