Message-Id: <20260107151700.c7b9051929548391e92cfb3e@linux-foundation.org>
Date: Wed, 7 Jan 2026 15:17:00 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Andrew Cooper <andrew.cooper3@...rix.com>
Cc: LKML <linux-kernel@...r.kernel.org>, Marco Elver <elver@...gle.com>,
Alexander Potapenko <glider@...gle.com>, Dmitry Vyukov
<dvyukov@...gle.com>, Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar
<mingo@...hat.com>, Borislav Petkov <bp@...en8.de>, Dave Hansen
<dave.hansen@...ux.intel.com>, x86@...nel.org, "H. Peter Anvin"
<hpa@...or.com>, Jann Horn <jannh@...gle.com>, kasan-dev@...glegroups.com
Subject: Re: [PATCH] x86/kfence: Avoid writing L1TF-vulnerable PTEs
On Tue, 6 Jan 2026 18:04:26 +0000 Andrew Cooper <andrew.cooper3@...rix.com> wrote:
> For native, the choice of PTE is fine. There's real memory backing the
> non-present PTE. However, for XenPV, Xen complains:
>
> (XEN) d1 L1TF-vulnerable L1e 8010000018200066 - Shadowing
>
> To explain, some background on XenPV pagetables:
>
> Xen PV guests control their own pagetables; they choose the new PTE
> value, and use hypercalls to make changes so Xen can audit for safety.
>
> In addition to a regular reference count, Xen also maintains a type
> reference count. e.g. SegDesc (referenced by vGDT/vLDT),
> Writable (referenced with _PAGE_RW) or L{1..4} (referenced by vCR3 or a
> lower pagetable level). This is in order to prevent e.g. a page being
> inserted into the pagetables for which the guest has a writable mapping.
>
> For non-present mappings, all other bits become software accessible, and
> typically contain metadata rather than a real frame address. There is nothing
> that a reference count could sensibly be tied to. As such, even if Xen
> could recognise the address as currently safe, nothing would prevent that
> frame from changing owner to another VM in the future.
>
> When Xen detects a PV guest writing a L1TF-PTE, it responds by activating
> shadow paging. This is normally only used for the live phase of
> migration, and comes with a reasonable overhead.
>
> KFENCE only cares about getting #PF to catch wild accesses; it doesn't care
> about the value for non-present mappings. Use a fully inverted PTE, to
> avoid hitting the slow path when running under Xen.
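
For illustration only, here is a minimal user-space sketch of the
difference between clearing just the Present bit and inverting the whole
PTE. It is not the patch. The helper names, the 52-bit address mask and
the safe_limit parameter are assumptions made for this sketch, and the
vulnerability check is a simplified model rather than Xen's actual test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT   UINT64_C(0x1)
/* Assumption: frame address lives in bits 12..51 of the PTE. */
#define PTE_ADDR_MASK UINT64_C(0x000ffffffffff000)

static inline bool pte_is_present(uint64_t pte)
{
	return (pte & PTE_PRESENT) != 0;
}

static inline uint64_t pte_addr(uint64_t pte)
{
	return pte & PTE_ADDR_MASK;
}

/* Old style (hypothetical helper): clear only the Present bit. */
static inline uint64_t protect_clear_present(uint64_t pte)
{
	return pte & ~PTE_PRESENT;
}

/* New style (hypothetical helper): invert every bit of a present PTE. */
static inline uint64_t protect_invert(uint64_t pte)
{
	assert(pte_is_present(pte));
	return ~pte;
}

/*
 * Simplified model: a non-zero, not-present PTE whose address field
 * still points below safe_limit (a hypothetical bound on populated
 * physical memory) is treated as L1TF-vulnerable.
 */
static inline bool looks_l1tf_vulnerable(uint64_t pte, uint64_t safe_limit)
{
	return pte != 0 && !pte_is_present(pte) && pte_addr(pte) < safe_limit;
}
```

Taking 0x8010000018200067 as the present PTE (assumed here to be the
present counterpart of the value in the Xen message), clearing Present
gives 0x8010000018200066, which still carries frame address 0x18200000.
Inverting gives 0x7fefffffe7dfff98, which is not present and has its
address field pushed up to 0x000fffffe7dff000. Inverting again restores
the original value.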
>
> While adjusting the logic, take the opportunity to skip all actions if the
> PTE is already in the right state, halve the number of PVOps callouts, and
> skip TLB maintenance on a !P -> P transition which benefits non-Xen cases too.
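
As a rough model of the control flow described above (again a sketch,
not the patch), the following counts PTE writes and TLB flushes. The
struct and function names are hypothetical. It assumes that a
not-present translation is never cached in the TLB, so only the
P -> !P direction needs a flush.

```c
#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT UINT64_C(0x1)

/* Hypothetical stand-in for one page-table slot plus event counters. */
struct model_pte {
	uint64_t val;
	unsigned int writes;      /* how often the PTE was rewritten */
	unsigned int tlb_flushes; /* how often a TLB flush was issued */
};

static inline void model_set_protection(struct model_pte *p, bool protect)
{
	bool present = (p->val & PTE_PRESENT) != 0;

	/*
	 * protect wants a not-present PTE, unprotect wants a present one.
	 * If the PTE is already in that state, do nothing at all.
	 */
	if (protect != present)
		return;

	/* Toggle state by inverting the entire PTE with a single write. */
	p->val = ~p->val;
	p->writes++;

	/*
	 * Only P -> !P can leave a stale translation behind. On !P -> P
	 * there is nothing cached to remove, so the flush is skipped.
	 */
	if (protect)
		p->tlb_flushes++;
}
```

Protecting twice and then unprotecting twice should leave the model
with two writes and one flush, and with the original PTE value restored.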
>
> Fixes: 1dc0da6e9ec0 ("x86, kfence: enable KFENCE for x86")
Seems that I sent 1dc0da6e9ec0 upstream so thanks, I'll grab this. If
an x86 person chooses to handle it then I'll drop the mm.git version.
I'll add a cc:stable to the mm.git copy, just to be sure.
> Tested-by: Marco Elver <elver@...gle.com>
> Signed-off-by: Andrew Cooper <andrew.cooper3@...rix.com>
> ---
That "^---$" tells tooling "changelog stops here".
> CC: Alexander Potapenko <glider@...gle.com>
> CC: Marco Elver <elver@...gle.com>
> CC: Dmitry Vyukov <dvyukov@...gle.com>
> CC: Thomas Gleixner <tglx@...utronix.de>
> CC: Ingo Molnar <mingo@...hat.com>
> CC: Borislav Petkov <bp@...en8.de>
> CC: Dave Hansen <dave.hansen@...ux.intel.com>
> CC: x86@...nel.org
> CC: "H. Peter Anvin" <hpa@...or.com>
> CC: Andrew Morton <akpm@...ux-foundation.org>
> CC: Jann Horn <jannh@...gle.com>
> CC: kasan-dev@...glegroups.com
> CC: linux-kernel@...r.kernel.org
>
> v1:
> * First public posting. This went to security@ first just in case, and
> then I got distracted with other things ahead of public posting.
> ---
That "^---$" would be better placed above the versioning info.
>
> ...
>