lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Thu, 7 Jul 2016 02:50:09 +0000
From:	"Chen, Yu C" <yu.c.chen@...el.com>
To:	"Rafael J. Wysocki" <rjw@...ysocki.net>,
	James Morse <james.morse@....com>
CC:	"linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>,
	"H. Peter Anvin" <hpa@...or.com>, Pavel Machek <pavel@....cz>,
	Borislav Petkov <bp@...e.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>, Len Brown <lenb@...nel.org>,
	"x86@...nel.org" <x86@...nel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: [PATCH][RFC v3] x86, hotplug: Use hlt instead of mwait if
 invoked from disable_nonboot_cpus


> -----Original Message-----
> From: Rafael J. Wysocki [mailto:rjw@...ysocki.net]
> Sent: Thursday, July 07, 2016 8:33 AM
> To: Chen, Yu C; James Morse
> Cc: linux-pm@...r.kernel.org; Thomas Gleixner; H. Peter Anvin; Pavel Machek;
> Borislav Petkov; Peter Zijlstra; Ingo Molnar; Len Brown; x86@...nel.org; linux-
> kernel@...r.kernel.org
> Subject: Re: [PATCH][RFC v3] x86, hotplug: Use hlt instead of mwait if invoked
> from disable_nonboot_cpus
> 
> On Tuesday, June 28, 2016 05:16:43 PM Chen Yu wrote:
> > Stress test from Varun Koyyalagunta reports that, the nonboot CPU
> > would hang occasionally, when resuming from hibernation. Further
> > investigation shows that, the precise stage when nonboot CPU hangs, is
> > the time when the nonboot CPU been woken up incorrectly, and tries to
> > monitor the mwait_ptr for the second time, then an exception is
> > triggered due to illegal vaddr access, say, something like, 'Unable to
> > handler kernel address of 0xffff8800ba800010...'
> >
> > Further investigation shows that, this exception is caused by
> > accessing a page without PRESENT flag, because the pte entry for this
> > vaddr is zero. Here's the scenario how this problem
> > happens: Page table for direct mapping is allocated dynamically by
> > kernel_physical_mapping_init, it is possible that in the resume
> > process, when the boot CPU is trying to write back pages to their
> > original address, and just right to writes to the monitor mwait_ptr
> > then wakes up one of the nonboot CPUs, since the page table currently
> > used by the nonboot CPU might not the same as it is before the
> > hibernation, an exception might occur due to inconsistent page table.
> >
> > First try is to get rid of this problem by changing the monitor
> > address from task.flag to zero page, because no one would write data
> > to zero page. But there is still problem because of a ping-pong wake
> > up scenario in mwait_play_dead:
> >
> > One possible implementation of a clflush is a read-invalidate snoop,
> > which is what a store might look like, so cflush might break the mwait.
> >
> > 1. CPU1 wait at zero page
> > 2. CPU2 cflush zero page, wake CPU1 up, then CPU2 waits at zero page
> > 3. CPU1 is woken up, and invoke cflush zero page, thus wake up CPU2 again.
> > then the nonboot CPUs never sleep for long.
> >
> > So it's better to monitor different address for each nonboot CPUs,
> > however since there is only one zero page, at most:
> > PAGE_SIZE/L1_CACHE_LINE CPUs are satisfied, which is usually 64 on a
> > x86_64, apparently it's not enough for servers, maybe more zero pages
> > are required.
> >
> > So choose a new solution as Brian suggested, to put the nonboot CPUs
> > into hlt before resume, without touching any memory during s/r.
> > Theoretically there might still be some problems if some of the CPUs
> > have already been put offline, but since the case is very rare and
> > users can work around it, we do not deal with this special case in
> > kernel for now.
> >
> > BTW, as James mentioned, he might want to encapsulate
> > disable_nonboot_cpus into arch-specific, so this patch might need small
> change after that.
> >
> > Comments and suggestions would be appreciated.
> >
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=106371
> > Reported-and-tested-by: Varun Koyyalagunta <cpudebug@...ttech.com>
> > Signed-off-by: Chen Yu <yu.c.chen@...el.com>
> 
> Below is my sort of version of this (untested) and I did it this way, because the
> issue is specific to resume from hibernation (the workaround need not be
> applied anywhere else) and the hibernate_resume_nonboot_cpu_disable()
> thing may be useful to arm64 too if I'm not mistaken (James?).

James might want a flag to distinguish whether it is from suspend or resume,
in his arch-specific disabled_nonboot_cpus?

and this patch works on my xeon.
Tested-by: Chen Yu <yu.c.chen@...el.com>

> 
> Actually, if arm64 uses it too, the __weak implementation can be dropped,
> because it will be possible to make it depend on ARCH_HIBERNATION_HEADER
> (x86 and arm64 are the only users of that).
> 
> Thanks,
> Rafael
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ