[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.02.1312021200100.30673@ionos.tec.linutronix.de>
Date: Mon, 2 Dec 2013 12:01:21 +0100 (CET)
From: Thomas Gleixner <tglx@...utronix.de>
To: Davidlohr Bueso <davidlohr@...com>
cc: Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Jason Low <jason.low2@...com>, Ingo Molnar <mingo@...nel.org>,
Darren Hart <dvhart@...ux.intel.com>,
Mike Galbraith <efault@....de>, Jeff Mahoney <jeffm@...e.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Scott Norton <scott.norton@...com>,
Tom Vaden <tom.vaden@...com>,
Aswin Chandramouleeswaran <aswin@...com>,
Waiman Long <Waiman.Long@...com>,
"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>
Subject: Re: [RFC patch 0/5] futex: Allow lockless empty check of hashbucket
plist in futex_wake()
On Sat, 30 Nov 2013, Davidlohr Bueso wrote:
> On Thu, 2013-11-28 at 12:59 +0100, Peter Zijlstra wrote:
> > On Wed, Nov 27, 2013 at 11:44:38PM -0800, Davidlohr Bueso wrote:
> > > How about both enlarging the table _and_ aligning the buckets? As you
> > > know, increasing the size of the table also benefits (particularly in
> > > larger systems) in having more spinlocks. So we reduce the amount of
> > > collisions and alleviate contention on the hb->lock. Btw, do you have
> > > any particular concerns about the larger hash table patch?
> >
> > My only concern was the amount of #ifdef.
> >
> > Wouldn't something like the below also work?
>
> Below are the results for a workload that stresses the uaddr hashing for
> large amounts of futexes (just make waits fail the uval check, so no
> list handing overhead) on an 80 core, 1Tb NUMA system.
>
> +---------+--------------------+------------------------+-----------------------+-------------------------------+
> | threads | baseline (ops/sec) | aligned-only (ops/sec) | large table (ops/sec) | large table+aligned (ops/sec) |
> +---------+--------------------+------------------------+-----------------------+-------------------------------+
> | 512 | 32426 | 50531 (+55.8%) | 255274 (+687.2%) | 292553 (+802.2%) |
> | 256 | 65360 | 99588 (+52.3%) | 443563 (+578.6%) | 508088 (+677.3%) |
> | 128 | 125635 | 200075 (+59.2%) | 742613 (+491.1%) | 835452 (+564.9%) |
> | 80 | 193559 | 323425 (+67.1%) | 1028147 (+431.1%) | 1130304 (+483.9%) |
> | 64 | 247667 | 443740 (+79.1%) | 997300 (+302.6%) | 1145494 (+362.5%) |
> | 32 | 628412 | 721401 (+14.7%) | 965996 (+53.7%) | 1122115 (+78.5%) |
> +---------+--------------------+------------------------+-----------------------+-------------------------------+
>
> Baseline of course sucks compared to any other performance boost, and we
> get the best throughput when applying both optimizations, no surprise.
> We do particularly well for more than 32 threads, and the 'aligned-only'
> column nicely exemplifies the benefits of SMP aligning the buckets
> without considering the reduction in collisions.
Right. So yes, we want both. And I fully agree with Peters dynamic
allocation approach.
Thanks,
tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists