linux-kernel - Re: [PATCH v4] x86/power: Fix 'nosmt' vs. hibernation triple fault during resume

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <1639292.WxYr0VA4et@kreacher>
Date:   Fri, 31 May 2019 10:26:37 +0200
From:   "Rafael J. Wysocki" <rjw@...ysocki.net>
To:     Josh Poimboeuf <jpoimboe@...hat.com>,
        Jiri Kosina <jikos@...nel.org>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        the arch/x86 maintainers <x86@...nel.org>,
        Pavel Machek <pavel@....cz>, Ingo Molnar <mingo@...hat.com>,
        Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Linux PM <linux-pm@...r.kernel.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        Andy Lutomirski <luto@...nel.org>
Subject: Re: [PATCH v4] x86/power: Fix 'nosmt' vs. hibernation triple fault during resume

On Friday, May 31, 2019 7:14:56 AM CEST Josh Poimboeuf wrote:
> On Fri, May 31, 2019 at 01:42:02AM +0200, Jiri Kosina wrote:
> > On Thu, 30 May 2019, Josh Poimboeuf wrote:
> > 
> > > > >     Reviewed-by: Thomas Gleixner <tglx@...utronix.de>
> > > > 
> > > > Yes, it is, thanks!
> > > 
> > > I still think changing monitor/mwait to use a fixmap address would be a
> > > much cleaner way to fix this.  I can try to work up a patch tomorrow.
> > 
> > I disagree with that from the backwards compatibility point of view.
> > 
> > I personally am quite frequently using differnet combinations of 
> > resumer/resumee kernels, and I've never been biten by it so far. I'd guess 
> > I am not the only one.
> > Fixmap sort of breaks that invariant.
> 
> Right now there is no backwards compatibility because nosmt resume is
> already broken.
> 
> For "future" backwards compatibility we could just define a hard-coded
> reserved fixmap page address, adjacent to the vsyscall reserved address.
> 
> Something like this (not yet tested)?  Maybe we could also remove the
> resume_play_dead() hack?

Yes, we can IMO, but in a separate patch, please.

> diff --git a/arch/x86/include/asm/fixmap.h b/arch/x86/include/asm/fixmap.h
> index 9da8cccdf3fb..1c328624162c 100644
> --- a/arch/x86/include/asm/fixmap.h
> +++ b/arch/x86/include/asm/fixmap.h
> @@ -80,6 +80,7 @@ enum fixed_addresses {
>  #ifdef CONFIG_X86_VSYSCALL_EMULATION
>  	VSYSCALL_PAGE = (FIXADDR_TOP - VSYSCALL_ADDR) >> PAGE_SHIFT,
>  #endif
> +	FIX_MWAIT = (FIXADDR_TOP - VSYSCALL_ADDR - 1) >> PAGE_SHIFT,
>  #endif
>  	FIX_DBGP_BASE,
>  	FIX_EARLYCON_MEM_BASE,
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index 73e69aaaa117..9804fbe25d03 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -108,6 +108,8 @@ int __read_mostly __max_smt_threads = 1;
>  /* Flag to indicate if a complete sched domain rebuild is required */
>  bool x86_topology_update;
>  
> +static char __mwait_page[PAGE_SIZE];
> +
>  int arch_update_cpu_topology(void)
>  {
>  	int retval = x86_topology_update;
> @@ -1319,6 +1321,8 @@ void __init native_smp_prepare_cpus(unsigned int max_cpus)
>  	smp_quirk_init_udelay();
>  
>  	speculative_store_bypass_ht_init();
> +
> +	set_fixmap(FIX_MWAIT, __pa_symbol(&__mwait_page));
>  }
>  
>  void arch_enable_nonboot_cpus_begin(void)
> @@ -1631,11 +1635,12 @@ static inline void mwait_play_dead(void)
>  	}
>  
>  	/*
> -	 * This should be a memory location in a cache line which is
> -	 * unlikely to be touched by other processors.  The actual
> -	 * content is immaterial as it is not actually modified in any way.
> +	 * This memory location is never actually written to.  It's mapped at a
> +	 * reserved fixmap address to ensure the monitored address remains
> +	 * valid across a hibernation resume operation.  Otherwise a triple
> +	 * fault can occur.
>  	 */
> -	mwait_ptr = &current_thread_info()->flags;
> +	mwait_ptr = (void *)fix_to_virt(FIX_MWAIT);
>  
>  	wbinvd();
>  
> 

Jiri, any chance to test this?