Message-ID: <c1621d18-0afc-4cd6-8b34-bce2e3936a9c@intel.com>
Date: Mon, 6 Jan 2025 11:03:24 -0800
From: Dave Hansen <dave.hansen@...el.com>
To: Rik van Riel <riel@...riel.com>, x86@...nel.org
Cc: linux-kernel@...r.kernel.org, kernel-team@...a.com,
 dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
 tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
 akpm@...ux-foundation.org, nadav.amit@...il.com, zhengqi.arch@...edance.com,
 linux-mm@...ck.org
Subject: Re: [PATCH v3 00/12] AMD broadcast TLB invalidation

A couple of high-level things we need to address:

First, I'm OK calling this approach "broadcast TLB invalidation". But I
don't think the ASIDs should be called "broadcast ASIDs". I'd much
rather that they are called something which makes it clear that they are
from a different namespace than the existing ASIDs.

After this series there will be three classes:

 0: Special ASID used for the kernel, basically
 1->TLB_NR_DYN_ASIDS: Allocated from private, per-cpu space. Meaningless
		      when compared between CPUs.
 >TLB_NR_DYN_ASIDS:   Allocated from shared, kernel-wide space. All CPUs
		      share this space and must all agree on what the
		      values mean.
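
To make the three classes concrete, here is a minimal sketch of the split described above. It is not code from the series: the enum, its member names, and classify_asid() are hypothetical, and TLB_NR_DYN_ASIDS is set to 6 only as an assumed value for illustration.

```c
#include <assert.h>

/* Assumed value, for illustration only. */
#define TLB_NR_DYN_ASIDS 6

static_assert(TLB_NR_DYN_ASIDS >= 1, "need at least one per-cpu ASID");

/* Hypothetical names for the three classes listed above. */
enum asid_class {
	ASID_CLASS_KERNEL,	/* 0: special, used for the kernel */
	ASID_CLASS_PER_CPU,	/* 1..TLB_NR_DYN_ASIDS: private to one CPU */
	ASID_CLASS_SHARED,	/* > TLB_NR_DYN_ASIDS: one kernel-wide namespace */
};

/* Map an ASID value to the namespace it was allocated from. */
static enum asid_class classify_asid(unsigned int asid)
{
	if (asid == 0)
		return ASID_CLASS_KERNEL;
	if (asid <= TLB_NR_DYN_ASIDS)
		return ASID_CLASS_PER_CPU;
	return ASID_CLASS_SHARED;
}
```

Under these assumptions, two CPUs that both report ASID 3 may be talking about different mm's, while ASID 7 must mean the same mm everywhere.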

The fact that the "shared" ones are system-wide obviously allows INVLPGB
to be used. The hardware feature also obviously "broadcasts" things more
than plain old INVLPG did. But I don't think that makes the ASIDs
"broadcast" ASIDs.

It's much more important to know that they are shared across the system
instead of per-cpu than the fact that the deep implementation manages
them with an instruction that is "broadcast" by hardware.

So can we call them "global", "shared" or "system" ASIDs, please?

Second, the TLB_NR_DYN_ASIDS was picked because it's roughly the number
of distinct PCIDs that the CPU can keep in the TLB at once (at least on
Intel). Let's say a CPU has 6 mm's in the per-cpu ASID space and another
6 in the shared/broadcast space. At that point, PCIDs might not be doing
much good because the TLB can't store entries for 12 PCIDs.
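
The arithmetic behind that concern can be sketched as follows. This is only an illustration: pcid_working_set_fits() and its parameters are hypothetical, and the capacity argument stands in for "roughly the number of distinct PCIDs the TLB can hold at once", which is a property of the hardware and not something this sketch can measure.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: do the mm's live on one CPU, counted across
 * both ASID namespaces, fit in the number of PCIDs the TLB can keep
 * entries for at the same time?
 */
static bool pcid_working_set_fits(unsigned int nr_per_cpu_mms,
				  unsigned int nr_shared_mms,
				  unsigned int tlb_pcid_capacity)
{
	/* A capacity of zero would mean PCIDs are not usable at all. */
	assert(tlb_pcid_capacity > 0);

	return nr_per_cpu_mms + nr_shared_mms <= tlb_pcid_capacity;
}
```

With 6 mm's in each space and an assumed capacity of 6, the helper returns false: 12 PCIDs are in play, but entries for only about half of them can be resident.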

Does anything in this series account for that? Should we be indexing
cpu_tlbstate.ctxs[] by a *context* number rather than by the ASID that
it's running as?

Last, I'm not 100% convinced we want to do this whole thing. The
will-it-scale numbers are nice. But given the complexity of this, I
think we need some actual, real end users to stand up and say exactly
how this is important in *PRODUCTION* to them.
