linux-kernel - Re: [PATCH v2] x86/power: Fix 'nosmt' vs. hibernation triple fault during resume


Message-ID: <20190529161028.a6kpywzpjazgql5u@treble>
Date:   Wed, 29 May 2019 11:10:28 -0500
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Jiri Kosina <jikos@...nel.org>
Cc:     "Rafael J. Wysocki" <rjw@...ysocki.net>,
        Pavel Machek <pavel@....cz>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        Peter Zijlstra <peterz@...radead.org>, x86@...nel.org,
        linux-pm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] x86/power: Fix 'nosmt' vs. hibernation triple fault
 during resume

On Wed, May 29, 2019 at 12:32:02PM +0200, Jiri Kosina wrote:
> From: Jiri Kosina <jkosina@...e.cz>
> 
> As explained in
> 
> 	0cc3cd21657b ("cpu/hotplug: Boot HT siblings at least once")
> 
> we always, no matter what, have to bring up x86 HT siblings during boot at 
> least once in order to avoid first MCE bringing the system to its knees.
> 
> That means that whenever 'nosmt' is supplied on the kernel command-line, 
> all the HT siblings are as a result sitting in mwait or cpuidle after 
> going through the online-offline cycle at least once.
> 
> This causes a serious issue though when a kernel, which saw 'nosmt' on its 
> commandline, is going to perform resume from hibernation: if the resume 
> from the hibernated image is successful, cr3 is flipped in order to point 
> to the address space of the kernel that is being resumed, which in turn 
> means that all the HT siblings are all of a sudden mwaiting on address 
> which is no longer valid.
> 
> That results in triple fault shortly after cr3 is switched, and machine 
> reboots.
> 
> Fix this by always waking up all the SMT siblings before initiating the 
> 'restore from hibernation' process; this guarantees that all the HT 
> siblings will be properly carried over to the resumed kernel waiting in 
> resume_play_dead(), and acted upon accordingly afterwards, based on the 
> target kernel configuration.

hibernation_restore() is called by user space at runtime, via ioctl or
sysfs.  So I think this still doesn't fix the case where you've disabled
CPUs at runtime via sysfs, and then resumed from hibernation.  Or are we
declaring that this is not a supported scenario?

Would it be possible for mwait_play_dead() to instead just monitor a
fixmap address which doesn't change for kaslr?
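
To illustrate the idea, here is a toy user-space model, not kernel code.
It assumes, as the patch description does, that the problem is a
monitored virtual address that stops being valid once cr3 points at the
resumed kernel's page tables.  All names and address values below are
hypothetical; an "address space" is just a small lookup table.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model: sentinel returned when a virtual address is not mapped. */
#define NO_MAPPING UINT64_MAX

struct mapping {
	uint64_t vaddr;
	uint64_t paddr;
};

struct address_space {
	const struct mapping *maps;
	size_t count;
};

/* Hypothetical layout of the kernel that boots and loads the image. */
static const struct mapping boot_maps[] = {
	{ 0xA000, 0x1000 },	/* randomized (KASLR-dependent) address */
	{ 0xF000, 0x2000 },	/* fixed address, same in every kernel */
};

/* Hypothetical layout of the kernel being resumed from the image. */
static const struct mapping resumed_maps[] = {
	{ 0xB000, 0x1000 },	/* same data, different randomized address */
	{ 0xF000, 0x2000 },	/* fixed address is unchanged */
};

const struct address_space boot_kernel = { boot_maps, 2 };
const struct address_space resumed_kernel = { resumed_maps, 2 };

/* Look up vaddr in the toy page table; NO_MAPPING if it is absent. */
uint64_t translate(const struct address_space *as, uint64_t vaddr)
{
	assert(as != NULL);

	for (size_t i = 0; i < as->count; i++)
		if (as->maps[i].vaddr == vaddr)
			return as->maps[i].paddr;

	return NO_MAPPING;
}

/*
 * A monitored address stays usable across the switch only if it is
 * mapped both before and after the page tables are swapped.
 */
bool monitor_survives_switch(const struct address_space *before,
			     const struct address_space *after,
			     uint64_t vaddr)
{
	return translate(before, vaddr) != NO_MAPPING &&
	       translate(after, vaddr) != NO_MAPPING;
}
```

In this model the fixed address 0xF000 survives the switch from
boot_kernel to resumed_kernel, while the randomized address 0xA000 does
not.  Whether a real fixmap slot can be used this way in
mwait_play_dead() is exactly the open question.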

Is there a reason why maxcpus= doesn't do the CR4.MCE booted_once
dance?

-- 
Josh