Message-ID: <5ae11e5ac6b278de9ad2ad2badbd5f010543d934.camel@surriel.com>
Date: Sun, 22 Dec 2024 10:37:01 -0500
From: Rik van Riel <riel@...riel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
	dave.hansen@...ux.intel.com, luto@...nel.org, tglx@...utronix.de,
	mingo@...hat.com, bp@...en8.de, hpa@...or.com, akpm@...ux-foundation.org
Subject: Re: [PATCH 09/10] x86/mm: enable AMD translation cache extensions

On Sun, 2024-12-22 at 12:38 +0100, Peter Zijlstra wrote:
> On Sat, Dec 21, 2024 at 11:06:41PM -0500, Rik van Riel wrote:
> > With AMD TCE (translation cache extensions) only the intermediate
> > mappings
> 
> Only the leave mapings, as written this all don't make sense,

Check out page 513 of the AMD manual:

https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/40332.pdf

"Translation Cache Extension (TCE) Bit. Bit 15, read/write. 

Setting this bit to 1 changes how the INVLPG, INVLPGB, and INVPCID
instructions operate on TLB entries. When this bit is 0, these
instructions remove the target PTE from the TLB as well as all 
upper-level table entries that are cached in the TLB, whether or 
not they are associated with the target PTE. When this bit is set,
these instructions will remove the target PTE and only those 
upper-level entries that lead to the target PTE in the page table
hierarchy, leaving unrelated upper-level entries intact. This may
provide a performance benefit.

Page table management software must be written in a way that takes 
this behavior into account. Software that was written for a 
processor that does not cache upper-level table entries may result 
in stale entries being incorrectly used for translations when TCE 
is enabled. Software that is compatible with TCE mode will operate
in either mode.

For software using INVLPGB to broadcast TLB invalidations, the
invalidations are controlled by the EFER.TCE value on the processor
executing the INVLPGB instruction.

Before setting TCE, system software should verify that this feature
is supported by examining the feature flag CPUID Fn8000_0001_ECX[TCE].
See Section 3.3 “Processor Feature Identification,” on
page 71 for information on using the CPUID instruction"
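
The check the manual asks for comes down to two bit tests. Below is a
minimal userspace sketch in C, not the code from the patch. The helper
names are hypothetical, the EFER bit position (15) comes from the quoted
text, and bit 17 for CPUID Fn8000_0001_ECX[TCE] is an assumption taken
from the CPUID feature flag layout rather than from the quote. Reading
CPUID and writing the MSR are deliberately left out; the helpers only
operate on register values passed in by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* EFER.TCE is bit 15, per the manual text quoted above. */
#define EFER_TCE_BIT                 15
/* Assumption: CPUID Fn8000_0001_ECX[TCE] is bit 17. */
#define CPUID_8000_0001_ECX_TCE_BIT  17

/* Does the ECX value returned by CPUID leaf 0x80000001 advertise TCE? */
static inline bool cpuid_ecx_has_tce(uint32_t ecx)
{
	return (ecx >> CPUID_8000_0001_ECX_TCE_BIT) & 1u;
}

/* Is TCE currently enabled in the given EFER value? */
static inline bool efer_tce_enabled(uint64_t efer)
{
	return (efer >> EFER_TCE_BIT) & 1u;
}

/*
 * Compute the EFER value to write back: set TCE only when the
 * feature flag says it is supported, otherwise leave EFER unchanged.
 */
static inline uint64_t efer_with_tce(uint64_t efer, uint32_t ecx)
{
	if (!cpuid_ecx_has_tce(ecx))
		return efer;
	return efer | (1ULL << EFER_TCE_BIT);
}
```

A caller would pass in the ECX value from CPUID leaf 0x80000001 and the
current EFER value, then write the result back to the MSR on each CPU.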


This suggests that:
1) TCE does control the "don't make sense" behavior :)

2) Wait, does EFER.TCE need to be set on every CPU
   in the system?  Could a system run with TCE set
   on some CPUs, and cleared on others?!

It would be nice if somebody from AMD could chime in on
the latter, and ensure that this patch actually does
the right thing :)
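
Until that is answered, a debugging check along these lines could at
least show whether a system ended up with mixed settings. This is only
a sketch under stated assumptions: the efer[] array is a hypothetical
snapshot of one EFER value per CPU (how it gets collected is not shown),
and the function name is made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* EFER.TCE is bit 15, per the manual text quoted above. */
#define EFER_TCE (1ULL << 15)

/*
 * Return true if every CPU in the snapshot has the same EFER.TCE
 * setting (all set or all clear). efer[] is a hypothetical per-CPU
 * snapshot of the EFER MSR; an empty snapshot counts as uniform.
 */
static inline bool efer_tce_uniform(const uint64_t *efer, size_t ncpus)
{
	for (size_t i = 1; i < ncpus; i++) {
		/* XOR isolates bits that differ from CPU 0. */
		if ((efer[i] ^ efer[0]) & EFER_TCE)
			return false;
	}
	return true;
}
```

Other EFER bits are ignored on purpose, so two CPUs that differ only in
unrelated bits still count as having a uniform TCE setting.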

I'll incorporate all the other feedback from you and
Borislav into the next version of the patch series!
Thank you for the feedback.

-- 
All Rights Reversed.
