Message-ID: <20241209133309.794439ca@mordecai.tesarici.cz>
Date: Mon, 9 Dec 2024 13:33:09 +0100
From: Petr Tesarik <ptesarik@...e.com>
To: Valentin Schneider <vschneid@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Dave Hansen
<dave.hansen@...el.com>, linux-kernel@...r.kernel.org,
linux-doc@...r.kernel.org, kvm@...r.kernel.org, linux-mm@...ck.org,
bpf@...r.kernel.org, x86@...nel.org, rcu@...r.kernel.org,
linux-kselftest@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>,
Masami Hiramatsu <mhiramat@...nel.org>, Jonathan Corbet <corbet@....net>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
"H. Peter Anvin" <hpa@...or.com>, Paolo Bonzini <pbonzini@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>, Vitaly Kuznetsov <vkuznets@...hat.com>,
Andy Lutomirski <luto@...nel.org>, Frederic Weisbecker
<frederic@...nel.org>, "Paul E. McKenney" <paulmck@...nel.org>, Neeraj
Upadhyay <quic_neeraju@...cinc.com>, Joel Fernandes
<joel@...lfernandes.org>, Josh Triplett <josh@...htriplett.org>, Boqun Feng
<boqun.feng@...il.com>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Lai Jiangshan <jiangshanlai@...il.com>, Zqiang <qiang.zhang1211@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>, Uladzislau Rezki
<urezki@...il.com>, Christoph Hellwig <hch@...radead.org>, Lorenzo Stoakes
<lstoakes@...il.com>, Josh Poimboeuf <jpoimboe@...nel.org>, Jason Baron
<jbaron@...mai.com>, Kees Cook <keescook@...omium.org>, Sami Tolvanen
<samitolvanen@...gle.com>, Ard Biesheuvel <ardb@...nel.org>, Nicholas
Piggin <npiggin@...il.com>, Juerg Haefliger
<juerg.haefliger@...onical.com>, Nicolas Saenz Julienne
<nsaenz@...nel.org>, "Kirill A. Shutemov"
<kirill.shutemov@...ux.intel.com>, Nadav Amit <namit@...are.com>, Dan
Carpenter <error27@...il.com>, Chuang Wang <nashuiliang@...il.com>, Yang
Jihong <yangjihong1@...wei.com>, Petr Mladek <pmladek@...e.com>, "Jason A.
Donenfeld" <Jason@...c4.com>, Song Liu <song@...nel.org>, Julian Pidancet
<julian.pidancet@...cle.com>, Tom Lendacky <thomas.lendacky@....com>,
Dionna Glaze <dionnaglaze@...gle.com>, Thomas Weißschuh
<linux@...ssschuh.net>, Juri Lelli <juri.lelli@...hat.com>, Marcelo Tosatti
<mtosatti@...hat.com>, Yair Podemsky <ypodemsk@...hat.com>, Daniel Wagner
<dwagner@...e.de>
Subject: Re: [RFC PATCH v3 13/15] context_tracking,x86: Add infrastructure
to defer kernel TLBI

On Mon, 09 Dec 2024 13:04:43 +0100
Valentin Schneider <vschneid@...hat.com> wrote:

> On 05/12/24 18:31, Petr Tesarik wrote:
> > On Thu, 21 Nov 2024 16:30:16 +0100
> > Peter Zijlstra <peterz@...radead.org> wrote:
> >
> >> On Thu, Nov 21, 2024 at 07:07:44AM -0800, Dave Hansen wrote:
> >> > On 11/21/24 03:12, Peter Zijlstra wrote:
> >> > >> I see e.g. ds_clear_cea() clears PTEs that can have the _PAGE_GLOBAL flag,
> >> > >> and it correctly uses the non-deferrable flush_tlb_kernel_range().
> >> > >
> >> > > I always forget what we use global pages for, dhansen might know, but
> >> > > let me try and have a look.
> >> > >
> >> > > I *think* we only have GLOBAL on kernel text, and that only sometimes.
> >> >
> >> > I think you're remembering how _PAGE_GLOBAL gets used when KPTI is in play.
> >>
> >> Yah, I suppose I am. That was the last time I had a good look at this
> >> stuff :-)
> >>
> >> > Ignoring KPTI for a sec... We use _PAGE_GLOBAL for all kernel mappings.
> >> > Before PCIDs, global mappings let the kernel TLB entries live across CR3
> >> > writes. When PCIDs are in play, global mappings let two different ASIDs
> >> > share TLB entries.
> >>
> >> Hurmph.. bah. That means we do need that horrible CR4 dance :/
> >
> > In general, yes.
> >
> > But I wonder what exactly was the original scenario encountered by
> > Valentin. I mean, if TLB entry invalidations were necessary to sync
> > changes to kernel text after flipping a static branch, then it might be
> > less overhead to make a list of affected pages and call INVLPG on them.
> >
> > AFAIK there is currently no such IPI function for doing that, but we
> > could add one, provided the list of invalidated global pages is
> > reasonably short.
> >
> > Valentin, do you happen to know?
> >
>
> So from my experimentation (hackbench + kernel compilation on housekeeping
> CPUs, dummy while(1) userspace loop on isolated CPUs), the TLB flushes only
> occurred from vunmap() - mainly from all the hackbench threads coming and
> going.
>
> Static branch updates only seem to trigger the sync_core() IPI, at least on
> x86.

Thank you, this is helpful.

So, these allocations span more than tlb_single_page_flush_ceiling
pages (default 33). Is THP enabled? If yes, we could possibly get below
that threshold by improving flushing of huge pages (cf. footnote [1] in
Documentation/arch/x86/tlb.rst).
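
To make the threshold concrete, here is a rough user-space model of the
size check as I understand it from flush_tlb_kernel_range() on x86. It
is a sketch, not the kernel code: the helper names and the MODEL_*
constants are made up for illustration, 4 KiB pages are assumed, and the
ceiling is hard-coded to the default of 33 mentioned above.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumptions for this model: 4 KiB pages, default ceiling of 33 pages. */
#define MODEL_PAGE_SHIFT	12
#define MODEL_FLUSH_CEILING	33UL

/*
 * Hypothetical helper: returns true if a kernel range [start, end) is
 * larger than the ceiling, i.e. it would get a full TLB flush instead of
 * a page-by-page invalidation.
 */
bool model_needs_full_flush(unsigned long start, unsigned long end)
{
	assert(end >= start);
	return (end - start) > (MODEL_FLUSH_CEILING << MODEL_PAGE_SHIFT);
}

/*
 * Hypothetical helper: number of single-page invalidations needed to
 * cover [start, end), assuming start is page-aligned and the length is
 * rounded up to whole pages.
 */
unsigned long model_invlpg_count(unsigned long start, unsigned long end)
{
	assert(end >= start);
	return (end - start + (1UL << MODEL_PAGE_SHIFT) - 1) >> MODEL_PAGE_SHIFT;
}
```

In this model a 33-page range is still invalidated page by page (33
invalidations), while a 34-page range tips over to a full flush.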

OTOH even though a series of INVLPG may reduce subsequent TLB misses,
it will not exactly improve latency, so it would go against the main
goal of this whole patch series.

Hmmm... I see, the CR4 dance is the best solution after all. :-|

Petr T