linux-kernel - Re: post-3.18 performance regression in TLB flushing code

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Date:	Wed, 17 Dec 2014 10:08:11 +0000
From:	Will Deacon <will.deacon@....com>
To:	Dave Hansen <dave@...1.net>
Cc:	Benjamin Herrenschmidt <benh@...nel.crashing.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Russell King - ARM Linux <linux@....linux.org.uk>,
	Michal Simek <monstr@...str.eu>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Linux-MM <linux-mm@...ck.org>
Subject: Re: post-3.18 performance regression in TLB flushing code

Hi Dave,

Thanks for reporting this.

On Tue, Dec 16, 2014 at 09:36:56PM +0000, Dave Hansen wrote:
> I'm running the 'brk1' test from will-it-scale:
> 
> > https://github.com/antonblanchard/will-it-scale/blob/master/tests/brk1.c
> 
> on a 8-socket/160-thread system.  It's seeing about a 6% drop in
> performance (263M -> 247M ops/sec at 80-threads) from this commit:

This is x86, right?

> 	commit fb7332a9fedfd62b1ba6530c86f39f0fa38afd49
> 	Author: Will Deacon <will.deacon@....com>
> 	Date:   Wed Oct 29 10:03:09 2014 +0000
> 
> 	 mmu_gather: move minimal range calculations into generic code
> 
> tlb_finish_mmu() goes up about 9x in the profiles (~0.4%->3.6%) and
> tlb_flush_mmu_free() takes about 3.1% of CPU time with the patch
> applied, but does not show up at all on the commit before.

Ouch...

> This isn't a major regression, but it is rather unfortunate for a patch
> that is apparently a code cleanup.  It also _looks_ to show up even when
> things are single-threaded, although I haven't looked at it in detail.

Ok, so there are two parts to this patch:

  (1) Fixing an issue where the arch code (arm64 in my case) wanted to
      adjust the range behind the back of the core code, which resulted
      in negative ranges being flushed

  (2) A cleanup removing the need_flush field, since we can now rely
      on tlb->end != 0 to indicate that a flush is needed

> I suspect the tlb->need_flush logic was serving some role that the
> modified code isn't capturing like in this hunk:
> 
> >  void tlb_flush_mmu(struct mmu_gather *tlb)
> >  {
> > -       if (!tlb->need_flush)
> > -               return;
> >         tlb_flush_mmu_tlbonly(tlb);
> >         tlb_flush_mmu_free(tlb);
> >  }
> 
> tlb_flush_mmu_tlbonly() has tlb->end check (which replaces the
> ->need_flush logic), but tlb_flush_mmu_free() does not.

Yes, I thought tlb_flush_mmu_free wouldn't do anything if we hadn't batched,
but actually free_pages_and_swap_cache does have work outside of the loop.

> If we add a !tlb->end (patch attached) to tlb_flush_mmu(), that gets us
> back up to ~258M ops/sec, but that's still ~2% down from where we started.

I think there are a couple of things you could try to see if that 2% comes
back:

  * Revert the patch and try the one here [1] instead (which only does part
    (1) of the above).

-- or --

  * Instead of adding the tlb->end check to tlb_flush_mmu, add it to
    tlb_flush_mmu_free

Could you give that a go, please?

Cheers,

Will

[1] http://www.spinics.net/lists/kernel/msg1855260.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/