[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250304115207.GFZ8bpZ83u7L9x43Rq@fat_crate.local>
Date: Tue, 4 Mar 2025 12:52:07 +0100
From: Borislav Petkov <bp@...en8.de>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Rik van Riel <riel@...riel.com>, x86@...nel.org,
linux-kernel@...r.kernel.org, peterz@...radead.org,
dave.hansen@...ux.intel.com, zhengqi.arch@...edance.com,
nadav.amit@...il.com, thomas.lendacky@....com, kernel-team@...a.com,
linux-mm@...ck.org, akpm@...ux-foundation.org, jackmanb@...gle.com,
jannh@...gle.com, mhklinux@...look.com, andrew.cooper3@...rix.com,
Manali.Shukla@....com, mingo@...nel.org
Subject: Re: [PATCH v14 11/13] x86/mm: do targeted broadcast flushing from
tlbbatch code
On Mon, Mar 03, 2025 at 01:47:42PM -0800, Dave Hansen wrote:
> > +static inline void invlpgb_flush_addr_nosync(unsigned long addr, u16 nr)
> > +{
> > + __invlpgb_flush_addr_nosync(addr, nr);
> > + if (!cpu_need_tlbsync())
> > + cpu_set_tlbsync(true);
> > +}
>
> One thought on these:
>
> Instead of having three functions:
>
> 1. A raw __invlpgb_*_nosync()
> 2. A wrapper invlpgb_*_nosync() that flips cpu_set_tlbsync()
> 3. A wrapper invlpgb_*()
>
> Could we get away with just two? For instance, what if we had *ALL*
> __invlpgb()'s do cpu_set_tlbsync()? Then we'd universally call tlbsync().
>
> static inline void invlpgb_flush_all_nonglobals(void)
> {
> guard(preempt)();
> __invlpgb(0, 0, 0, 1, NO_STRIDE, INVLPGB_MODE_ALL_NONGLOBALS);
> tlbsync();
> }
>
> Then we wouldn't need any of those three new wrappers. The only downside
> is that we'd end up with paths that logically do:
>
> __invlpgb()
> cpu_set_tlbsync(true);
> if (cpu_need_tlbsync()) { // always true
> __tlbsync();
> cpu_set_tlbsync(true);
> }
>
> In other words, a possibly superfluous set/check/clear of the
> "need_tlbsync" state. But I'd expect that to be a pittance compared to
> the actual cost of INVLPGB/TLBSYNC.
Lemme see whether I can grasp properly what you mean:
What you really want is for the _nosync() variants to set need_tlbsync, right?
Because looking at all the call sites which do set tlbsync, the flow is this:
__invlpgb_flush_user_nr_nosync(pcid, addr, nr, pmd_stride, freed_tables);
if (!cpu_need_tlbsync())
cpu_set_tlbsync(true);
So we should move that
cpu_set_tlbsync(true);
into the __invlpgb_*_nosync() variant.
And in order to make it even simpler, we should drop the testing too:
IOW, this:
/* Flush all mappings for a given PCID, not including globals. */
static inline void __invlpgb_flush_single_pcid_nosync(unsigned long pcid)
{
__invlpgb(0, pcid, 0, 1, 0, INVLPGB_PCID);
cpu_set_tlbsync(true);
}
Right?
> > static void broadcast_tlb_flush(struct flush_tlb_info *info)
> > {
> > bool pmd = info->stride_shift == PMD_SHIFT;
> > @@ -790,6 +821,8 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
> > if (IS_ENABLED(CONFIG_PROVE_LOCKING))
> > WARN_ON_ONCE(!irqs_disabled());
> >
> > + tlbsync();
>
> This one is in dire need of comments.
Maybe this:
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 08672350536f..b97249ffff1f 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -822,6 +822,9 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
if (IS_ENABLED(CONFIG_PROVE_LOCKING))
WARN_ON_ONCE(!irqs_disabled());
+ /*
+ * Finish any remote TLB flushes pending from this CPU:
+ */
tlbsync();
/*
> Ditto, *especially* if this hits the init_mm state. There really
> shouldn't be any deferred flushes for the init_mm.
Basically what you said but as a comment. :-P
Merged in the rest.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
Powered by blists - more mailing lists