lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <2d1168630e5c25d0cd28f0de3104ada9ceda168f.camel@surriel.com>
Date: Thu, 20 Feb 2025 10:25:48 -0500
From: Rik van Riel <riel@...riel.com>
To: Dave Hansen <dave.hansen@...el.com>, x86@...nel.org
Cc: linux-kernel@...r.kernel.org, bp@...en8.de, peterz@...radead.org, 
	dave.hansen@...ux.intel.com, zhengqi.arch@...edance.com,
 nadav.amit@...il.com, 	thomas.lendacky@....com, kernel-team@...a.com,
 linux-mm@...ck.org, 	akpm@...ux-foundation.org, jackmanb@...gle.com,
 jannh@...gle.com, 	mhklinux@...look.com, andrew.cooper3@...rix.com, Manali
 Shukla	 <Manali.Shukla@....com>
Subject: Re: [PATCH v11 09/12] x86/mm: enable broadcast TLB invalidation for
 multi-threaded processes

On Fri, 2025-02-14 at 11:53 -0800, Dave Hansen wrote:
> On 2/13/25 08:14, Rik van Riel wrote:
> 
> 

> > +#ifdef CONFIG_X86_BROADCAST_TLB_FLUSH
> 
> How cleanly could we throw this hunk in a new file? I always dislike
> big
> #ifdefs like this. They seem like magnets for causing weird compile
> problems.

It looks like if we compile the X86_FEATURE_INVLPGB
out through arch/x86/include/asm/disabled-features.h,
the compiler can be smart enough to elide the no
longer necessary code ... but only if we have these
functions in the same compilation unit as their
callers.

That means we should be able to get rid of the ifdef, 
if we keep these functions in arch/x86/mm/tlb.c

> 
> > +/*
> > + * Logic for broadcast TLB invalidation.
> > + */
> > +static DEFINE_RAW_SPINLOCK(global_asid_lock);
> 
> The global lock definitely needs some discussion in the changelog.
> 
> > +static u16 last_global_asid = MAX_ASID_AVAILABLE;
> > +static DECLARE_BITMAP(global_asid_used, MAX_ASID_AVAILABLE) = { 0
> > };
> > +static DECLARE_BITMAP(global_asid_freed, MAX_ASID_AVAILABLE) = { 0
> > };
> 
> Isn't the initialization to all 0's superfluous for a global
> variable?
> 
I'll remove that, and will add documentation for
the spinlock.


> > +static int global_asid_available = MAX_ASID_AVAILABLE -
> > TLB_NR_DYN_ASIDS - 1;
> > +
> > +static void reset_global_asid_space(void)
> > +{
> > +	lockdep_assert_held(&global_asid_lock);
> > +
> > +	/*
> > +	 * A global TLB flush guarantees that any stale entries
> > from
> > +	 * previously freed global ASIDs get flushed from the TLB
> > +	 * everywhere, making these global ASIDs safe to reuse.
> > +	 */
> > +	invlpgb_flush_all_nonglobals();
> 
> Ugh, my suggestion to use the term "global ASID" really doesn't work
> here, does it?
> 
> Also, isn't a invlpgb_flush_all_nonglobals() _relatively_ slow? It
> has
> to go out and talk over the fabric to every CPU, right? This is also
> holding a global lock.
> 
> That's seems worrisome.

This only happens on ASID rollover, when we have
reached the end of global ASID space, and are
about to restart the search from the beginning.

We do not do a flush at every allocation, but
only once every few thousand global ASID allocations.

> 
> > +static bool needs_global_asid_reload(struct mm_struct *next, u16
> > prev_asid)
> > +{
> > +	u16 global_asid = mm_global_asid(next);
> > +
> > +	/* Process is transitioning to a global ASID */
> > +	if (global_asid && prev_asid != global_asid)
> > +		return true;
> > +
> > +	/* Transition from global->local ASID does not currently
> > happen. */
> > +	if (!global_asid && is_global_asid(prev_asid))
> > +		return true;
> > +
> > +	return false;
> > +}
> I'm going to throw in the towel at this point. This patch needs to
> get
> broken up. It's more at once than my poor little brain can handle.
> 
> The _least_ it can do is introduce the stub functions and injection
> into
> existing code changes, first. Then, in a second patch, introduce the
> real implementation.
> 
> I also suspect that a big chunk of the ASID allocator could be broken
> out and introduced separately.
> 
> Another example is broadcast_tlb_flush(). To reduce complexity in
> _this_
> patch, it could do something suboptimal like always do a
> invlpgb_flush_all_nonglobals() regardless of the kind of flush it
> gets.
> Then, in a later patch, you could optimize it.
> 
With the config option and ifdefs (mostly) gone, I
think the split would probably have to be in two
patches:
1) Modify existing code to call non-functional
   stub functions.
2) Modify the stub functions to then do something.

I'm not sure quite how much this will help with review,
since to review the second patch you'll have to go back
to the first patch, but I might as well try...

-- 
All Rights Reversed.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ