Message-ID: <20241209133309.794439ca@mordecai.tesarici.cz>
Date: Mon, 9 Dec 2024 13:33:09 +0100
From: Petr Tesarik <ptesarik@...e.com>
To: Valentin Schneider <vschneid@...hat.com>
Cc: Peter Zijlstra <peterz@...radead.org>, Dave Hansen
 <dave.hansen@...el.com>, linux-kernel@...r.kernel.org,
 linux-doc@...r.kernel.org, kvm@...r.kernel.org, linux-mm@...ck.org,
 bpf@...r.kernel.org, x86@...nel.org, rcu@...r.kernel.org,
 linux-kselftest@...r.kernel.org, Steven Rostedt <rostedt@...dmis.org>,
 Masami Hiramatsu <mhiramat@...nel.org>, Jonathan Corbet <corbet@....net>,
 Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
 Borislav Petkov <bp@...en8.de>, Dave Hansen <dave.hansen@...ux.intel.com>,
 "H. Peter Anvin" <hpa@...or.com>, Paolo Bonzini <pbonzini@...hat.com>,
 Wanpeng Li <wanpengli@...cent.com>, Vitaly Kuznetsov <vkuznets@...hat.com>,
 Andy Lutomirski <luto@...nel.org>, Frederic Weisbecker
 <frederic@...nel.org>, "Paul E. McKenney" <paulmck@...nel.org>, Neeraj
 Upadhyay <quic_neeraju@...cinc.com>, Joel Fernandes
 <joel@...lfernandes.org>, Josh Triplett <josh@...htriplett.org>, Boqun Feng
 <boqun.feng@...il.com>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
 Lai Jiangshan <jiangshanlai@...il.com>, Zqiang <qiang.zhang1211@...il.com>,
 Andrew Morton <akpm@...ux-foundation.org>, Uladzislau Rezki
 <urezki@...il.com>, Christoph Hellwig <hch@...radead.org>, Lorenzo Stoakes
 <lstoakes@...il.com>, Josh Poimboeuf <jpoimboe@...nel.org>, Jason Baron
 <jbaron@...mai.com>, Kees Cook <keescook@...omium.org>, Sami Tolvanen
 <samitolvanen@...gle.com>, Ard Biesheuvel <ardb@...nel.org>, Nicholas
 Piggin <npiggin@...il.com>, Juerg Haefliger
 <juerg.haefliger@...onical.com>, Nicolas Saenz Julienne
 <nsaenz@...nel.org>, "Kirill A. Shutemov"
 <kirill.shutemov@...ux.intel.com>, Nadav Amit <namit@...are.com>, Dan
 Carpenter <error27@...il.com>, Chuang Wang <nashuiliang@...il.com>, Yang
 Jihong <yangjihong1@...wei.com>, Petr Mladek <pmladek@...e.com>, "Jason A.
 Donenfeld" <Jason@...c4.com>, Song Liu <song@...nel.org>, Julian Pidancet
 <julian.pidancet@...cle.com>, Tom Lendacky <thomas.lendacky@....com>,
 Dionna Glaze <dionnaglaze@...gle.com>, Thomas Weißschuh
 <linux@...ssschuh.net>, Juri Lelli <juri.lelli@...hat.com>, Marcelo Tosatti
 <mtosatti@...hat.com>, Yair Podemsky <ypodemsk@...hat.com>, Daniel Wagner
 <dwagner@...e.de>
Subject: Re: [RFC PATCH v3 13/15] context_tracking,x86: Add infrastructure
 to defer kernel TLBI

On Mon, 09 Dec 2024 13:04:43 +0100
Valentin Schneider <vschneid@...hat.com> wrote:

> On 05/12/24 18:31, Petr Tesarik wrote:
> > On Thu, 21 Nov 2024 16:30:16 +0100
> > Peter Zijlstra <peterz@...radead.org> wrote:
> >  
> >> On Thu, Nov 21, 2024 at 07:07:44AM -0800, Dave Hansen wrote:  
> >> > On 11/21/24 03:12, Peter Zijlstra wrote:  
> >> > >> I see e.g. ds_clear_cea() clears PTEs that can have the _PAGE_GLOBAL flag,
> >> > >> and it correctly uses the non-deferrable flush_tlb_kernel_range().  
> >> > >
> >> > > I always forget what we use global pages for, dhansen might know, but
> >> > > let me try and have a look.
> >> > >
> >> > > I *think* we only have GLOBAL on kernel text, and that only sometimes.  
> >> >
> >> > I think you're remembering how _PAGE_GLOBAL gets used when KPTI is in play.  
> >>
> >> Yah, I suppose I am. That was the last time I had a good look at this
> >> stuff :-)
> >>  
> >> > Ignoring KPTI for a sec... We use _PAGE_GLOBAL for all kernel mappings.
> >> > Before PCIDs, global mappings let the kernel TLB entries live across CR3
> >> > writes. When PCIDs are in play, global mappings let two different ASIDs
> >> > share TLB entries.  
> >>
> >> Hurmph.. bah. That means we do need that horrible CR4 dance :/  
> >
> > In general, yes.
> >
> > But I wonder what exactly was the original scenario encountered by
> > Valentin. I mean, if TLB entry invalidations were necessary to sync
> > changes to kernel text after flipping a static branch, then it might be
> > less overhead to make a list of affected pages and call INVLPG on them.
> >
> > AFAIK there is currently no such IPI function for doing that, but we
> > could add one. If the list of invalidated global pages is reasonably
> > short, of course.
> >
> > Valentin, do you happen to know?
> >  
> 
> So from my experimentation (hackbench + kernel compilation on housekeeping
> CPUs, dummy while(1) userspace loop on isolated CPUs), the TLB flushes only
> occurred from vunmap() - mainly from all the hackbench threads coming and
> going.
> 
> Static branch updates only seem to trigger the sync_core() IPI, at least on
> x86.

Thank you, this is helpful.

So, these allocations span more than tlb_single_page_flush_ceiling
pages (default 33). Is THP enabled? If yes, we could possibly get below
that threshold by improving flushing of huge pages (cf. footnote [1] in
Documentation/arch/x86/tlb.rst).
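
To make the threshold concrete, here is a rough userspace model of the
decision, as far as I understand flush_tlb_kernel_range() in
arch/x86/mm/tlb.c. The helper name and the constants are made up for
illustration (the real ceiling is a runtime tunable, not a macro), and
4 KiB pages are assumed:

```c
#include <stdbool.h>

#define PAGE_SHIFT 12                       /* assumption: 4 KiB pages */
#define TLB_SINGLE_PAGE_FLUSH_CEILING 33UL  /* default value quoted above */

/*
 * Hypothetical helper, not a kernel function: returns true if a kernel
 * range [start, end) is long enough that a full TLB flush would be
 * chosen over invalidating it page by page.
 */
bool range_needs_full_flush(unsigned long start, unsigned long end)
{
	return (end - start) > (TLB_SINGLE_PAGE_FLUSH_CEILING << PAGE_SHIFT);
}
```

With these values, a 33-page range still gets per-page invalidation,
while anything from 34 pages up falls back to a full flush.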

OTOH even though a series of INVLPG may reduce subsequent TLB misses,
it will not exactly improve latency, so it would go against the main
goal of this whole patch series.

Hmmm... I see, the CR4 dance is the best solution after all. :-|

Petr T
