linux-kernel - Re: [RFC][PATCH] make global bitlock waitqueues per-node


Message-ID: <20161220122615.1f4b494d@roar.ozlabs.ibm.com>
Date:   Tue, 20 Dec 2016 12:26:15 +1000
From:   Nicholas Piggin <npiggin@...il.com>
To:     Dave Hansen <dave.hansen@...ux.intel.com>
Cc:     linux-kernel@...r.kernel.org, linux-mm@...ck.org,
        agruenba@...hat.com, rpeterso@...hat.com,
        mgorman@...hsingularity.net, peterz@...radead.org, luto@...nel.org,
        swhiteho@...hat.com, torvalds@...ux-foundation.org
Subject: Re: [RFC][PATCH] make global bitlock waitqueues per-node

On Mon, 19 Dec 2016 14:58:26 -0800
Dave Hansen <dave.hansen@...ux.intel.com> wrote:

> I saw a 4.8->4.9 regression (details below) that I attributed to:
> 
> 	9dcb8b685f mm: remove per-zone hashtable of bitlock waitqueues
> 
> That commit took the bitlock waitqueues from being dynamically-allocated
> per-zone to being statically allocated and global.  As suggested by
> Linus, this makes them per-node, but keeps them statically-allocated.
> 
> It leaves us with more waitqueues than the global approach, inherently
> scales it up as we gain nodes, and avoids generating code for
> page_zone() which was evidently quite ugly.  The patch is pretty darn
> tiny too.
> 
> This turns what was a ~40% 4.8->4.9 regression into a 17% gain over
> what 4.8 did.  That gain is a _bit_ surprising, but not entirely
> unexpected since we now get much simpler code from no page_zone() and a
> fixed-size array for which we don't have to follow a pointer (and get to
> do power-of-2 math).

I'll have to respin the PageWaiters patch and resend it. There were
just a couple of small issues picked up in review. I just got
sidetracked with getting a few other things done and haven't had time
to benchmark it properly.

I'd still like to see what per-node waitqueues do on top of that. If
the effect is significant for realistic workloads then it could be done for the
page waitqueues as Linus said.
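
For anyone following along, here is a rough userspace sketch of the
per-node idea described in the quoted text: one statically allocated,
power-of-2 sized table per node, indexed by a hash of the address being
waited on. This is not the code from Dave's patch. MAX_NODES,
WAIT_TABLE_BITS, struct wait_queue_head as written here,
hash_ptr_bits() and node_bit_waitqueue() are all made-up names and
values for illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define MAX_NODES        4
#define WAIT_TABLE_BITS  8
#define WAIT_TABLE_SIZE  (1u << WAIT_TABLE_BITS)

/* Stand-in for a real wait queue head. */
struct wait_queue_head {
	int nr_waiters;
};

/*
 * Statically allocated, one fixed-size table per node. No pointer has
 * to be followed to find the table, and the size is a power of 2.
 */
static struct wait_queue_head bit_wait_table[MAX_NODES][WAIT_TABLE_SIZE];

/* Multiplicative hash that keeps the top 'bits' bits of the product. */
static inline uint32_t hash_ptr_bits(const void *p, unsigned int bits)
{
	uint64_t v = (uint64_t)(uintptr_t)p;

	return (uint32_t)((v * 0x61C8864680B583EBull) >> (64 - bits));
}

/*
 * Pick the waitqueue for 'word' on the given node. The caller is
 * assumed to already know which node the word lives on.
 */
struct wait_queue_head *node_bit_waitqueue(const void *word, int node)
{
	return &bit_wait_table[node][hash_ptr_bits(word, WAIT_TABLE_BITS)];
}
```

The same address always hashes to the same slot index, so waiters and
wakers agree on the queue as long as they agree on the node, and the
total number of queues grows with the number of nodes.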

Thanks,
Nick