Message-ID: <20241224182518.GB17252@noisy.programming.kicks-ass.net>
Date: Tue, 24 Dec 2024 19:25:18 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Rik van Riel <riel@...riel.com>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
	dave.hansen@...ux.intel.com, luto@...nel.org, tglx@...utronix.de,
	mingo@...hat.com, bp@...en8.de, hpa@...or.com,
	akpm@...ux-foundation.org
Subject: Re: [PATCH 09/10] x86/mm: enable AMD translation cache extensions

On Sun, Dec 22, 2024 at 10:37:01AM -0500, Rik van Riel wrote:
> On Sun, 2024-12-22 at 12:38 +0100, Peter Zijlstra wrote:
> > On Sat, Dec 21, 2024 at 11:06:41PM -0500, Rik van Riel wrote:
> > > With AMD TCE (translation cache extensions) only the intermediate
> > > mappings
> > 
> > Only the leaf mappings; as written this all don't make sense,
> 
> Check out page 513 of the AMD manual:
> 
> https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/40332.pdf
> 
> "Translation Cache Extension (TCE) Bit. Bit 15, read/write. 
> 
> Setting this bit to 1 changes how the INVLPG, INVLPGB, and INVPCID
> instructions operate on TLB entries. When this bit is 0, these
> instructions remove the target PTE from the TLB as well as all 
> upper-level table entries that are cached in the TLB, whether or 
> not they are associated with the target PTE. When this bit is set,
> these instructions will remove the target PTE and only those 
> upper-level entries that lead to the target PTE in the page table
> hierarchy, leaving unrelated upper-level entries intact. This may
> provide a performance benefit.
> 
> Page table management software must be written in a way that takes 
> this behavior into account. Software that was written for a 
> processor that does not cache upper-level table entries may result 
> in stale entries being incorrectly used for translations when TCE 
> is enabled. Software that is compatible with TCE mode will operate
> in either mode.
> 
> For software using INVLPGB to broadcast TLB invalidations, the
> invalidations are controlled by the EFER.TCE value on the processor
> executing the INVLPGB instruction.
> 
> Before setting TCE, system software should verify that this feature
> is supported by examining the feature flag CPUID Fn8000_0001_ECX[TCE].
> See Section 3.3 “Processor Feature Identification,” on
> page 71 for information on using the CPUID instruction"

So that makes a ton more sense.
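
For anyone following along, the check the manual asks for could look
roughly like the sketch below. This is plain userspace C, not the patch.
All helper names are made up for illustration. EFER bit 15 comes from the
text quoted above; the CPUID bit index (ECX bit 17) is from my memory of
the kernel's cpufeatures numbering, not from the quoted text, so verify it
against the APM before relying on it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_CPUID 1
#endif

#define EFER_TCE_BIT		15	/* from the quoted manual text */
#define CPUID_EXT1_ECX_TCE	17	/* assumption: check against the APM */

/* Does the ECX value returned by CPUID Fn8000_0001 advertise TCE? */
static bool ecx_has_tce(uint32_t ecx)
{
	return (ecx >> CPUID_EXT1_ECX_TCE) & 1;
}

/* Return the EFER value with TCE set, but only if CPUID advertises it. */
static uint64_t efer_enable_tce(uint64_t efer, uint32_t ecx)
{
	if (!ecx_has_tce(ecx))
		return efer;
	return efer | (1ULL << EFER_TCE_BIT);
}

/* Userspace probe; reports false where CPUID is unavailable. */
static bool cpu_has_tce(void)
{
#ifdef HAVE_CPUID
	unsigned int eax, ebx, ecx, edx;

	if (!__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx))
		return false;
	return ecx_has_tce(ecx);
#else
	return false;
#endif
}

/* Hypothetical self-check showing the intended use of the helpers. */
static void tce_selftest(void)
{
	uint32_t ecx = 1u << CPUID_EXT1_ECX_TCE;

	assert(efer_enable_tce(0, ecx) == (1ULL << EFER_TCE_BIT));
	assert(efer_enable_tce(0, 0) == 0);
	(void)cpu_has_tce();
}
```

Actually writing EFER needs WRMSR in ring 0, which this deliberately
leaves out; the helpers only compute the value one would write.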

> 
> This suggests that:
> 1) TCE does control the "don't make sense" behavior :)

Well, you wrote:

> > With AMD TCE (translation cache extensions) only the intermediate mappings
> > that cover the address range zapped by INVLPG / INVLPGB get invalidated,
> > rather than all intermediate mappings getting zapped at every TLB invalidation.

And I read that like it would zap only the intermediate mappings rather
than the leaf mappings.

Reading it a wee bit more carefully, I see it's not quite as bad, but
still not very clear.
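
To spell out how I read the manual text, here is a toy model of the two
behaviours. It is only a sketch under loud assumptions: the structure, the
names and the single 2 MiB "upper level" are invented for illustration and
say nothing about how the hardware actually organises its caches. The one
thing it tries to capture is that the leaf entry for the target always
goes, while the fate of unrelated upper-level entries depends on TCE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PMD_SHIFT	21	/* one toy upper-level entry covers 2 MiB */
#define NR_ENTRIES	8

struct toy_tlb {
	bool     tce;				/* models EFER.TCE on this CPU */
	bool     leaf_valid[NR_ENTRIES];
	uint64_t leaf_vpn[NR_ENTRIES];		/* addr >> PAGE_SHIFT */
	bool     upper_valid[NR_ENTRIES];
	uint64_t upper_idx[NR_ENTRIES];		/* addr >> PMD_SHIFT */
};

/* Cache both the leaf and the upper-level entry for addr in one slot. */
static void toy_fill(struct toy_tlb *t, size_t slot, uint64_t addr)
{
	assert(slot < NR_ENTRIES);
	t->leaf_valid[slot] = true;
	t->leaf_vpn[slot] = addr >> PAGE_SHIFT;
	t->upper_valid[slot] = true;
	t->upper_idx[slot] = addr >> PMD_SHIFT;
}

/* Model of INVLPG: the leaf for addr always goes; upper levels depend on TCE. */
static void toy_invlpg(struct toy_tlb *t, uint64_t addr)
{
	for (size_t i = 0; i < NR_ENTRIES; i++) {
		if (t->leaf_valid[i] && t->leaf_vpn[i] == addr >> PAGE_SHIFT)
			t->leaf_valid[i] = false;

		if (!t->upper_valid[i])
			continue;
		/* TCE=0: drop every upper-level entry. TCE=1: only those covering addr. */
		if (!t->tce || t->upper_idx[i] == addr >> PMD_SHIFT)
			t->upper_valid[i] = false;
	}
}

/* Number of upper-level entries still cached. */
static size_t toy_count_upper(const struct toy_tlb *t)
{
	size_t n = 0;

	for (size_t i = 0; i < NR_ENTRIES; i++)
		n += t->upper_valid[i];
	return n;
}
```

Fill two slots with addresses in different 2 MiB regions and invalidate
one of them: with tce clear no upper-level entry survives, with tce set
the unrelated one does. In both cases the leaf entry for the other
address stays put.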

> 2) Wait, does EFER.TCE need to be set on every CPU
>    in the system?  Could a system run with TCE set
>    on some CPUs, and cleared on another?!

I would imagine it can; I don't think they would recommend anybody do
this though.

