Message-ID: <5ae11e5ac6b278de9ad2ad2badbd5f010543d934.camel@surriel.com>
Date: Sun, 22 Dec 2024 10:37:01 -0500
From: Rik van Riel <riel@...riel.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: x86@...nel.org, linux-kernel@...r.kernel.org, kernel-team@...a.com,
dave.hansen@...ux.intel.com, luto@...nel.org, tglx@...utronix.de,
mingo@...hat.com, bp@...en8.de, hpa@...or.com, akpm@...ux-foundation.org
Subject: Re: [PATCH 09/10] x86/mm: enable AMD translation cache extensions
On Sun, 2024-12-22 at 12:38 +0100, Peter Zijlstra wrote:
> On Sat, Dec 21, 2024 at 11:06:41PM -0500, Rik van Riel wrote:
> > With AMD TCE (translation cache extensions) only the intermediate
> > mappings
>
> Only the leave mapings, as written this all don't make sense,
Check out page 513 of the AMD manual:
https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/40332.pdf
"Translation Cache Extension (TCE) Bit. Bit 15, read/write.
Setting this bit to 1 changes how the INVLPG, INVLPGB, and INVPCID
instructions operate on TLB entries. When this bit is 0, these
instructions remove the target PTE from the TLB as well as all
upper-level table entries that are cached in the TLB, whether or
not they are associated with the target PTE. When this bit is set,
these instructions will remove the target PTE and only those
upper-level entries that lead to the target PTE in the page table
hierarchy, leaving unrelated upper-level entries intact. This may
provide a performance benefit.
Page table management software must be written in a way that takes
this behavior into account. Software that was written for a
processor that does not cache upper-level table entries may result
in stale entries being incorrectly used for translations when TCE
is enabled. Software that is compatible with TCE mode will operate
in either mode.
For software using INVLPGB to broadcast TLB invalidations, the
invalidations are controlled by the EFER.TCE value on the processor
executing the INVLPGB instruction.
Before setting TCE, system software should verify that this feature
is supported by examining the feature flag CPUID Fn8000_0001_ECX[TCE].
See Section 3.3 “Processor Feature Identification,” on
page 71 for information on using the CPUID instruction"
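
To make that last paragraph of the manual text concrete, here is a
minimal sketch in C of the "check the feature flag before setting
EFER.TCE" rule. It is not the code from the patch. The helper names
are made up for illustration, the EFER bit position (15) comes from
the text quoted above, and the CPUID Fn8000_0001_ECX bit position (17)
is my assumption and should be checked against the manual. The helpers
only do the bit arithmetic on register values passed in; actually
reading CPUID or writing the MSR is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASSUMPTION: TCE is reported in bit 17 of CPUID Fn8000_0001_ECX. */
#define CPUID_8000_0001_ECX_TCE (UINT32_C(1) << 17)
/* From the quoted manual text: EFER.TCE is bit 15. */
#define EFER_TCE                (UINT64_C(1) << 15)

/* Hypothetical helper: does this ECX value advertise TCE support? */
static inline bool cpuid_ecx_has_tce(uint32_t ecx)
{
	return (ecx & CPUID_8000_0001_ECX_TCE) != 0;
}

/*
 * Hypothetical helper: return the EFER value to write back.
 * The TCE bit is only set when the feature flag is present;
 * otherwise the input value is returned unchanged.
 */
static inline uint64_t efer_enable_tce(uint64_t efer, uint32_t cpuid_ecx)
{
	if (!cpuid_ecx_has_tce(cpuid_ecx))
		return efer;
	return efer | EFER_TCE;
}
```
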
This suggests that:
1) TCE does control the "don't make sense" behavior :)
2) Wait, does EFER.TCE need to be set on every CPU
in the system? Could a system run with TCE set
on some CPUs, and cleared on others?!
It would be nice if somebody from AMD could chime in on
the latter, and ensure that this patch actually does
the right thing :)
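
Until that is answered, one cautious option would be to treat a mixed
configuration as something to detect. The sketch below is purely
illustrative: the helper name is hypothetical, and the array of
per-CPU EFER values is assumed to have been collected by some other
means. It only checks whether every sampled value agrees on the TCE
bit (bit 15, per the manual text above).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* From the quoted manual text: EFER.TCE is bit 15. */
#define EFER_TCE (UINT64_C(1) << 15)

/*
 * Hypothetical helper: true when all sampled per-CPU EFER values
 * agree on the TCE bit (all set, or all clear). An empty or
 * single-entry sample is trivially consistent.
 */
static inline bool efer_tce_consistent(const uint64_t *efer, size_t ncpus)
{
	for (size_t i = 1; i < ncpus; i++) {
		/* XOR against CPU 0 exposes any bit that differs. */
		if ((efer[i] ^ efer[0]) & EFER_TCE)
			return false;
	}
	return true;
}
```
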
I'll incorporate all the other feedback from you and
Borislav into the next version of the patch series!
Thank you for the feedback.
--
All Rights Reversed.