[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.LSU.2.11.2103250929110.10977@eggly.anvils>
Date: Thu, 25 Mar 2021 09:33:18 -0700 (PDT)
From: Hugh Dickins <hughd@...gle.com>
To: Borislav Petkov <bp@...en8.de>
cc: Hugh Dickins <hughd@...gle.com>, Babu Moger <babu.moger@....com>,
Paolo Bonzini <pbonzini@...hat.com>,
Jim Mattson <jmattson@...gle.com>,
Vitaly Kuznetsov <vkuznets@...hat.com>,
Wanpeng Li <wanpengli@...cent.com>,
kvm list <kvm@...r.kernel.org>, Joerg Roedel <joro@...tes.org>,
the arch/x86 maintainers <x86@...nel.org>,
LKML <linux-kernel@...r.kernel.org>,
Ingo Molnar <mingo@...hat.com>,
"H . Peter Anvin" <hpa@...or.com>,
Thomas Gleixner <tglx@...utronix.de>,
Makarand Sonare <makarandsonare@...gle.com>,
Sean Christopherson <seanjc@...gle.com>
Subject: Re: [PATCH] x86/tlb: Flush global mappings when KAISER is disabled
On Thu, 25 Mar 2021, Borislav Petkov wrote:
> Ok,
>
> I tried to be as specific as possible in the commit message so that we
> don't forget. Please lemme know if I've missed something.
>
> Babu, Jim, I'd appreciate it if you ran this to confirm.
>
> Thx.
>
> ---
> From: Borislav Petkov <bp@...e.de>
> Date: Thu, 25 Mar 2021 11:02:31 +0100
>
> Jim Mattson reported that Debian 9 guests using a 4.9-stable kernel
> are exploding during alternatives patching:
>
> kernel BUG at /build/linux-dqnRSc/linux-4.9.228/arch/x86/kernel/alternative.c:709!
> invalid opcode: 0000 [#1] SMP
> Modules linked in:
> CPU: 1 PID: 1 Comm: swapper/0 Not tainted 4.9.0-13-amd64 #1 Debian 4.9.228-1
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> Call Trace:
> swap_entry_free
> swap_entry_free
> text_poke_bp
> swap_entry_free
> arch_jump_label_transform
> set_debug_rodata
> __jump_label_update
> static_key_slow_inc
> frontswap_register_ops
> init_zswap
> init_frontswap
> do_one_initcall
> set_debug_rodata
> kernel_init_freeable
> rest_init
> kernel_init
> ret_from_fork
>
> triggering the BUG_ON in text_poke() which verifies whether patched
> instruction bytes have actually landed at the destination.
>
> Further debugging showed that the TLB flush before that check is
> insufficient because there could be global mappings left in the TLB,
> leading to a stale mapping getting used.
>
> I say "global mappings" because the hardware configuration is a new one:
> machine is an AMD, which means, KAISER/PTI doesn't need to be enabled
> there, which also means there's no user/kernel pagetables split and
> therefore the TLB can have global mappings.
>
> And the configuration is new one for a second reason: because that AMD
> machine supports PCID and INVPCID, which leads the CPU detection code to
> set the synthetic X86_FEATURE_INVPCID_SINGLE flag.
>
> Now, __native_flush_tlb_single() does invalidate global mappings when
> X86_FEATURE_INVPCID_SINGLE is *not* set and returns.
>
> When X86_FEATURE_INVPCID_SINGLE is set, however, it invalidates the
> requested address from both PCIDs in the KAISER-enabled case. But if
> KAISER is not enabled and the machine has global mappings in the TLB,
> then those global mappings do not get invalidated, which would lead to
> the above mismatch from using a stale TLB entry.
>
> So make sure to flush those global mappings in the KAISER disabled case.
>
> Co-debugged by Babu Moger <babu.moger@....com>.
>
> Reported-by: Jim Mattson <jmattson@...gle.com>
> Signed-off-by: Borislav Petkov <bp@...e.de>
> Link: https://lkml.kernel.org/r/CALMp9eRDSW66%2BXvbHVF4ohL7XhThoPoT0BrB0TcS0cgk=dkcBg@mail.gmail.com
Acked-by: Hugh Dickins <hughd@...gle.com>
Great write-up too: many thanks.
> ---
> arch/x86/include/asm/tlbflush.h | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
> index f5ca15622dc9..2bfa4deb8cae 100644
> --- a/arch/x86/include/asm/tlbflush.h
> +++ b/arch/x86/include/asm/tlbflush.h
> @@ -245,12 +245,15 @@ static inline void __native_flush_tlb_single(unsigned long addr)
> * ASID. But, userspace flushes are probably much more
> * important performance-wise.
> *
> - * Make sure to do only a single invpcid when KAISER is
> - * disabled and we have only a single ASID.
> + * In the KAISER disabled case, do an INVLPG to make sure
> + * the mapping is flushed in case it is a global one.
> */
> - if (kaiser_enabled)
> + if (kaiser_enabled) {
> invpcid_flush_one(X86_CR3_PCID_ASID_USER, addr);
> - invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> + invpcid_flush_one(X86_CR3_PCID_ASID_KERN, addr);
> + } else {
> + asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
> + }
> }
>
> static inline void __flush_tlb_all(void)
> --
> 2.29.2
>
> --
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette
Powered by blists - more mailing lists