Date:   Thu, 27 Oct 2016 10:59:50 +0100
From:   Mel Gorman <mgorman@...hsingularity.net>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Andy Lutomirski <luto@...capital.net>,
        Andreas Gruenbacher <agruenba@...hat.com>,
        Andy Lutomirski <luto@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Bob Peterson <rpeterso@...hat.com>,
        Steven Whitehouse <swhiteho@...hat.com>,
        linux-mm <linux-mm@...ck.org>
Subject: Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

On Thu, Oct 27, 2016 at 11:44:49AM +0200, Peter Zijlstra wrote:
> On Thu, Oct 27, 2016 at 10:07:42AM +0100, Mel Gorman wrote:
> > > Something like so could work I suppose, but then there's a slight
> > > regression in the unlock_page() path, where we now take the
> > > spinlock unconditionally; IOW, we lose the unlocked
> > > waitqueue_active() test.
> > > 
> > 
> > I can't convince myself it's worthwhile. At the least, I can't see a
> > penalty in potentially moving one of the two bits to the high word:
> > it's the same cache line and the same op when it matters.
> 
> I'm having trouble connecting these two paragraphs. Or were you
> replying to something else?
> 
> So the current unlock code does:
> 
>   wake_up_page()
>     if (waitqueue_active())
>       __wake_up() /* takes waitqueue spinlocks here */
> 
> While the new one does:
> 
>   spin_lock(&q->lock);
>   if (waitqueue_active()) {
>     __wake_up_common()
>   }
>   spin_unlock(&q->lock);
> 
> Which is an unconditional atomic op (those go for roughly 20 cycles
> each when uncontended).
> 

OK, we were thinking about different things, but I'm not sure I get your
concern. With your patch, in the uncontended case we check the waiters
bit and, if there is no contention, carry on. In the contended case, the
lock is taken. Given that contention is likely to be due to IO being
completed, I don't think the extra atomic op is going to make much of a
difference.
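
A rough sketch of the shape I mean (PageWaiters() and
wake_up_page_bit() are illustrative names here, not necessarily what
your patch calls them):

  void unlock_page(struct page *page)
  {
          /* Drop PG_locked; pairs with lock_page()'s acquire. */
          clear_bit_unlock(PG_locked, &page->flags);
          smp_mb__after_atomic();

          /*
           * Uncontended case: no waiter ever set the waiters bit, so
           * we never look up or lock the hashed waitqueue at all.
           */
          if (PageWaiters(page))
                  wake_up_page_bit(page, PG_locked);
  }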

About the only hazard I can think of is unrelated pages hashing to the
same queue, which costs an extra atomic op in that "fake contended"
case. I don't think it's worth worrying about: false contention plus an
atomic op might hurt some workload, but the common case avoids the
waitqueue lookup entirely.
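
For reference, the sharing comes from the hashed wait table; roughly
(going from memory of the current code, so take the details loosely):

  static wait_queue_head_t *page_waitqueue(struct page *page)
  {
          const struct zone *zone = page_zone(page);

          /*
           * Unrelated pages can hash to the same head, so a waiter
           * on page A makes page B's unlock path look contended.
           */
          return &zone->wait_table[wait_table_hashfn(page,
                                          zone->wait_table_bits)];
  }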

> > I don't see why it should be NUMA-specific, even though NUMA is a
> > concern with Linus' patch. Even then, you still need a 64BIT check,
> > because 32BIT && NUMA is allowed on a number of architectures.
> 
> Oh, I thought we killed 32bit NUMA and didn't check. I can make it
> CONFIG_64BIT and be done with it. s/CONFIG_NUMA/CONFIG_64BIT/ on the
> patch should do :-)
> 

Sounds good.
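
(For the archives: the change being agreed on is literally just the
guard, something like

  -#if defined(CONFIG_NUMA)
  +#if defined(CONFIG_64BIT)

on the patch in question, since CONFIG_64BIT is what guarantees the
spare high word in page->flags, while 32BIT && NUMA is a valid
combination.)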

-- 
Mel Gorman
SUSE Labs
