Message-ID: <20131213134304.GB11176@gmail.com>
Date:	Fri, 13 Dec 2013 14:43:04 +0100
From:	Ingo Molnar <mingo@...nel.org>
To:	Alex Shi <alex.shi@...aro.org>
Cc:	Mel Gorman <mgorman@...e.de>, H Peter Anvin <hpa@...or.com>,
	Linux-X86 <x86@...nel.org>, Linux-MM <linux-mm@...ck.org>,
	LKML <linux-kernel@...r.kernel.org>,
	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <a.p.zijlstra@...llo.nl>,
	Fengguang Wu <fengguang.wu@...el.com>
Subject: Re: [PATCH 2/3] x86: mm: Change tlb_flushall_shift for IvyBridge

* Alex Shi <alex.shi@...aro.org> wrote:

> On 12/13/2013 09:02 AM, Alex Shi wrote:
> >> > You have not replied to this concern of mine: if my concern is valid 
> >> > then that invalidates much of the current tunings.
> > The benefit from pretend flush range is not unconditional, since invlpg
> > also costs time. And different CPUs have different invlpg/flush_all
> > execution times.
> 
> TLB refill time also differs between different kinds of CPU.
> 
> BTW,
> a bewitching idea is still attracting me:
> https://lkml.org/lkml/2012/5/23/148
> even though it was sentenced to death by HPA:
> https://lkml.org/lkml/2012/5/24/143

I don't think it was sentenced to death by HPA. What do the hardware 
guys say, is this safe on current CPUs?

If yes then as long as we only activate this optimization for known 
models (and turn it off for unknown models) we should be pretty safe, 
even if the hw guys (obviously) don't want to promise this 
indefinitely for all Intel HT implementations in the future, right?
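
A minimal sketch of the "known models only" gating described above. The
struct, the table and the function name are hypothetical, and the table
entries are placeholders: which models actually behave this way would
have to be confirmed by the hardware vendor. Anything not listed is
treated as unknown and the optimization stays off.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical (family, model) pair; not a kernel structure. */
struct cpu_id {
	unsigned int family;
	unsigned int model;
};

/*
 * Placeholder allowlist. The entries are illustrative only and do not
 * claim that these models are safe for a shared sibling flush.
 */
static const struct cpu_id smt_flush_known_ok[] = {
	{ 6, 0x2a },	/* placeholder entry */
	{ 6, 0x3a },	/* placeholder entry */
};

/* True only for explicitly listed models; unknown models get false. */
bool smt_shared_flush_allowed(unsigned int family, unsigned int model)
{
	size_t n = sizeof(smt_flush_known_ok) / sizeof(smt_flush_known_ok[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		if (smt_flush_known_ok[i].family == family &&
		    smt_flush_known_ok[i].model == model)
			return true;
	}
	return false;	/* default: optimization off */
}
```

The default-off return is the point of the design: a future model has
to be added to the table deliberately before it gets the optimization.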

> The idea is that flushing the TLB of just one thread is enough for 
> SMT/HT, since the TLB still seems to be shared within a core on 
> Intel CPUs. This benefit is unconditional and, if my memory is 
> right, kbuild testing improves by about 1~2% on average.

Oh, a 1-2% kbuild speedup is absolutely _massive_. Don't even think 
about dropping this idea ... it needs to be explored.

Alas, that for_each_cpu() loop is obviously disgusting; these values 
should be precalculated into percpu variables and such.
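
A rough user-space sketch of what such precalculation could look like,
not the patch under discussion. It assumes at most 64 CPUs held in a
uint64_t bitmask, uses a plain array as a stand-in for a percpu
variable, and all names are hypothetical. It also takes the premise of
this thread (one flush per core is sufficient) as given, which is
exactly the part that needs confirmation from the hardware side.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CPUS 64

/* Stand-in for a percpu variable: lowest-numbered CPU of each core. */
static int core_leader[MAX_CPUS];

/*
 * One-time setup. sibling_mask[cpu] has a bit set for every hardware
 * thread that shares a core with cpu, including cpu itself.
 */
void precompute_core_leaders(const uint64_t *sibling_mask, int ncpus)
{
	assert(ncpus > 0 && ncpus <= MAX_CPUS);

	for (int cpu = 0; cpu < ncpus; cpu++) {
		int leader = 0;

		assert(sibling_mask[cpu] & (1ULL << cpu));
		/* Find the lowest set bit; it is at most cpu itself. */
		while (!(sibling_mask[cpu] & (1ULL << leader)))
			leader++;
		core_leader[cpu] = leader;
	}
}

/*
 * Reduce a target mask to one CPU per core. The sibling lookup is a
 * single array read instead of a nested scan over sibling masks.
 */
uint64_t one_flush_per_core(uint64_t targets, int ncpus)
{
	uint64_t seen_leaders = 0, reduced = 0;

	for (int cpu = 0; cpu < ncpus; cpu++) {
		uint64_t leader_bit;

		if (!(targets & (1ULL << cpu)))
			continue;
		leader_bit = 1ULL << core_leader[cpu];
		if (seen_leaders & leader_bit)
			continue;	/* a sibling is already selected */
		seen_leaders |= leader_bit;
		reduced |= 1ULL << cpu;
	}
	return reduced;
}
```

The reduced mask keeps the first targeted CPU of each core rather than
the core's leader, so the flush is only ever sent to a CPU that was in
the original target set.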

> So would you accept some ugly quirks to do this lazy TLB flush on 
> known working CPUs?

It's not really 'lazy TLB flush' AFAICS but a genuine optimization: 
only flush the TLB on the logical CPUs that need it, right? I.e. do 
only one flush per pair of siblings.

> Forgive me if it's stupid.

I'd say measurable speedups that are safe are never ever stupid.

And even the range-flush TLB optimization we are talking about here 
could still be used IMO, just tone it down a bit and make it less 
model dependent.
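
For illustration, the range-flush trade-off can be modelled as a single
threshold check. This is a simplified sketch and not the kernel's
implementation: the helper name and its parameters are hypothetical,
and it only assumes that the cutoff is the TLB size scaled down by a
shift, with a negative shift meaning "always flush everything".

```c
#include <stdbool.h>

/*
 * Returns true when flushing the whole TLB is preferred over
 * invalidating nr_pages entries one by one.
 *
 * shift < 0 : range flushing disabled, always flush everything.
 * shift >= 0: per-page invalidation is used for ranges of up to
 *             (tlb_entries >> shift) pages.
 */
bool prefer_full_flush(unsigned long nr_pages, unsigned long tlb_entries,
		       int shift)
{
	unsigned long threshold;

	if (shift < 0)
		return true;

	/* Shifting by the type width or more is undefined in C. */
	if (shift >= (int)(sizeof(unsigned long) * 8))
		threshold = 0;
	else
		threshold = tlb_entries >> shift;

	return nr_pages > threshold;
}
```

With 512 TLB entries and a shift of 2, for example, ranges of up to 128
pages would be invalidated page by page and anything larger would fall
back to a full flush. Toning the optimization down would then amount to
picking one conservative shift instead of a per-model table.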

Thanks,

	Ingo