Message-ID: <20131213134304.GB11176@gmail.com>
Date: Fri, 13 Dec 2013 14:43:04 +0100
From: Ingo Molnar <mingo@...nel.org>
To: Alex Shi <alex.shi@...aro.org>
Cc: Mel Gorman <mgorman@...e.de>, H Peter Anvin <hpa@...or.com>,
Linux-X86 <x86@...nel.org>, Linux-MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Fengguang Wu <fengguang.wu@...el.com>
Subject: Re: [PATCH 2/3] x86: mm: Change tlb_flushall_shift for IvyBridge
* Alex Shi <alex.shi@...aro.org> wrote:
> On 12/13/2013 09:02 AM, Alex Shi wrote:
> >> > You have not replied to this concern of mine: if my concern is valid
> >> > then that invalidates much of the current tunings.
> > The benefit from the pretend flush range is not unconditional, since
> > invlpg also costs time. And different CPUs have different
> > invlpg/flush_all execution times.
>
> TLB refill time is also different on different kinds of CPU.
>
> BTW,
> A bewitching idea is still attracting me.
> https://lkml.org/lkml/2012/5/23/148
> Even though it was sentenced to death by HPA.
> https://lkml.org/lkml/2012/5/24/143
I don't think it was sentenced to death by HPA. What do the hardware
guys say, is this safe on current CPUs?
If yes then as long as we only activate this optimization for known
models (and turn it off for unknown models) we should be pretty safe,
even if the hw guys (obviously) don't want to promise this
indefinitely for all Intel HT implementations in the future, right?
> That is, flushing just one thread's TLB is enough for SMT/HT; it
> seems the TLB is still shared within a core on Intel CPUs. This
> benefit is unconditional, and if my memory is right, kbuild testing
> improves by about 1~2% on average.
Oh, a 1-2% kbuild speedup is absolutely _massive_. Don't even think
about dropping this idea ... it needs to be explored.
Alas, that for_each_cpu() loop is obviously disgusting; these values
should be precalculated into percpu variables and such.
> So would you accept some ugly quirks to do this lazy TLB
> flush on known working CPUs?
It's not really 'lazy TLB flush' AFAICS but a genuine optimization:
only flush the TLB on the logical CPUs that need it, right? I.e. do
only one flush per pair of siblings.
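
To make the "one flush per pair of siblings" idea concrete, here is a
rough user-space sketch, not kernel code. It assumes (and this is exactly
the open hardware question) that flushing one hardware thread is enough
for its whole core. NR_CPUS, core_id[], sibling_mask[] and both function
names are hypothetical and only illustrate precalculating the sibling
information once instead of rediscovering it at every flush:

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8	/* hypothetical: 4 cores x 2 HT siblings */

/* Hypothetical topology table: CPUs n and n+4 share a core. */
static const int core_id[NR_CPUS] = { 0, 1, 2, 3, 0, 1, 2, 3 };

/* Precalculated once: for each CPU, the bitmask of all CPUs in its core. */
static uint64_t sibling_mask[NR_CPUS];

static void precompute_sibling_masks(void)
{
	assert(NR_CPUS <= 64);	/* masks are kept in a single uint64_t */

	for (int a = 0; a < NR_CPUS; a++) {
		sibling_mask[a] = 0;
		for (int b = 0; b < NR_CPUS; b++)
			if (core_id[a] == core_id[b])
				sibling_mask[a] |= 1ULL << b;
	}
}

/*
 * Reduce a mask of CPUs that need a flush to one CPU per core.
 * ASSUMPTION: flushing one thread covers its siblings; this must
 * only be enabled on models where that is known to hold.
 */
static uint64_t reduce_flush_mask(uint64_t targets)
{
	uint64_t out = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		uint64_t bit = 1ULL << cpu;

		if (!(targets & bit))
			continue;
		out |= bit;			/* flush this thread ...      */
		targets &= ~sibling_mask[cpu];	/* ... and skip its siblings  */
	}
	return out;
}
```

With the table above, a request to flush CPUs 0, 1, 4 and 5 (mask 0x33)
is reduced to CPUs 0 and 1 (mask 0x03). A real implementation would use
the kernel's cpumask and percpu infrastructure and would need a
per-model opt-in.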
> Forgive me if it's stupid.
I'd say measurable speedups that are safe are never ever stupid.
And even the range-flush TLB optimization we are talking about here
could still be used IMO, just tone it down a bit and make it less
model dependent.
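
As a sketch of the trade-off being discussed (invlpg cost versus
flush_all cost plus TLB refill cost), a toy cost model could look like
the following. This is not the kernel's actual heuristic: the struct,
its fields and the function name are hypothetical, and any cycle counts
fed into it are placeholders rather than measurements:

```c
#include <stdbool.h>

/* Hypothetical per-model cost figures, in cycles. */
struct tlb_costs {
	unsigned long invlpg_cycles;	/* one single-page invalidation      */
	unsigned long flush_all_cycles;	/* the full TLB flush itself         */
	unsigned long refill_cycles;	/* refilling one TLB entry afterward */
	unsigned long live_entries;	/* entries expected to be reused     */
};

/*
 * Prefer per-page invalidation only if it is cheaper than a full
 * flush plus refilling the entries that would have stayed useful.
 */
static bool prefer_range_flush(const struct tlb_costs *c,
			       unsigned long nr_pages)
{
	unsigned long range_cost = nr_pages * c->invlpg_cycles;
	unsigned long full_cost = c->flush_all_cycles +
				  c->live_entries * c->refill_cycles;

	return range_cost < full_cost;
}
```

The point of writing it this way is that all model dependence sits in a
handful of numbers, so a toned-down version could use conservative
defaults for unknown models.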
Thanks,
Ingo