Message-ID: <CAGXu5jLPN4N1f19OXJboepVuJhnXeeWQghanqDwh4bU=SQXZJw@mail.gmail.com>
Date:	Tue, 26 Jul 2016 13:59:48 -0700
From:	Kees Cook <keescook@...omium.org>
To:	"Rafael J. Wysocki" <rjw@...ysocki.net>
Cc:	Borislav Petkov <bp@...e.de>, Ingo Molnar <mingo@...nel.org>,
	Josh Poimboeuf <jpoimboe@...hat.com>,
	Pavel Machek <pavel@....cz>,
	Linux PM list <linux-pm@...r.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Thomas Gleixner <tglx@...utronix.de>, shuzzle@...lbox.org,
	Thomas Garnier <thgarnie@...gle.com>
Subject: Re: Fwd: [Bug 150021] New: kernel panic: "kernel tried to execute
 NX-protected page" when resuming from hibernate to disk

On Tue, Jul 26, 2016 at 1:53 PM, Rafael J. Wysocki <rjw@...ysocki.net> wrote:
> On Tuesday, July 26, 2016 01:33:02 PM Kees Cook wrote:
>> On Tue, Jul 26, 2016 at 1:24 PM, Rafael J. Wysocki <rjw@...ysocki.net> wrote:
>> > On Tuesday, July 26, 2016 04:04:42 PM Borislav Petkov wrote:
>> >> On Tue, Jul 26, 2016 at 01:32:28PM +0200, Rafael J. Wysocki wrote:
>> >> > Hi,
>> >> >
>> >> > The following commit:
>> >> >
>> >> > commit 13523309495cdbd57a0d344c0d5d574987af007f
>> >> > Author: Josh Poimboeuf <jpoimboe@...hat.com>
>> >> > Date:   Thu Jan 21 16:49:21 2016 -0600
>> >> >
>> >> >     x86/asm/acpi: Create a stack frame in do_suspend_lowlevel()
>> >> >
>> >> >     do_suspend_lowlevel() is a callable non-leaf function which doesn't
>> >> >     honor CONFIG_FRAME_POINTER, which can result in bad stack traces.
>> >> >
>> >> >     Create a stack frame for it when CONFIG_FRAME_POINTER is enabled.
>> >> >
>> >> > is reported to cause a resume-from-hibernation regression due to an attempt
>> >> > to execute an NX page (we've seen quite a bit of that recently).
>> >> >
>> >> > I'm asking the reporter to try 4.7, but if the problem is still there, we'll
>> >> > need to revert the above I'm afraid.
>> >>
>> >> So I can't resume properly from disk too, on the Intel laptop this time. Top
>> >> commit is from tip/master:
>> >>
>> >> commit 516f48acf59722429acd323b3d283f74f02891fe (refs/remotes/tip/master)
>> >> Merge: a4823bbffc96 dd9506954539
>> >> Author: Ingo Molnar <mingo@...nel.org>
>> >> Date:   Mon Jul 25 08:39:43 2016 +0200
>> >>
>> >>     Merge branch 'linus'
>> >>
>> >>
>> >> So I thought it might be Josh's patch above and reverted it. No joy.
>> >>
>> >> Then I remembered that I enabled CONFIG_RANDOMIZE_MEMORY for the
>> >> microcode loader breakage which we've been debugging. Turned that off
>> >> and machine resumes fine again.
>> >
>> > Well, I wasn't aware of *another* flavor of ASLR in the works.  And there
>> > was no hope it would not break hibernation if you asked me.
>> >
>> >> It looks like
>> >>
>> >>   0483e1fa6e09 ("x86/mm: Implement ASLR for kernel memory regions")
>> >>
>> >> broke a bunch of things. Off the top of my head, we probably should make
>> >> suspend to disk and CONFIG_RANDOMIZE_MEMORY mutually exclusive, like it
>> >> was the case with ASLR previously, AFAIR.
>> >
>> > Please no.
>> >
>> > First off, it should be perfectly possible to make hibernation work along
>> > with this new variant of ASLR.  Second, quite obviously, the author of these
>> > ASLR changes had not done sufficient research to estimate the possible
>> > impact of them.
>>
>> I think that's a bit unfair: Thomas did a lot of testing, and it has
>> been living in -next for a while.
>
> Well, with all due respect, "a lot of testing" is not quite the same thing as
> "sufficient research" IMO.
>
> It should be known (at least from experience) that hibernation on x86-64 doesn't
> play well with ASLR quite as a rule, so it would be good to at least check that
> particular thing or CC a relevant person (ie. me).

Fair enough: we need to practice considering a wider usage model.

> Or even ask me on IRC for that matter.  Give me a heads up ahead of time.
>
> But no.  I'm still on the receiving end of the "hibernation doesn't work with
> ASLR" story which was entirely avoidable this time around.  Sigh.

I'll be sure to keep you in the loop for future x86 KASLR changes;
sorry for the new pain. :(

>> > Honestly, I don't think it is a good idea to introduce random Kconfig options
>> > for working around cases in which the author of some changes cannot be bothered
>> > with doing things right.  Even if that is security.
>>
>> I would agree: let's try to get this fixed soon.
>>
>> > So IMO, either we should fix the problem, or that whole new ASLR stuff should
>> > be reverted.
>> >
>> > I think I know how to fix it, but I won't be able to get to that before the
>> > next week.  I guess it can wait till then, though.
>>
>> Thomas, will you have some time to examine this and estimate the work for a fix?
>
> FWIW, my hunch ATM is that you need to look at the "Set up the direct mapping
> from scratch" loop in set_up_temporary_mappings() and make it do the right
> thing when the new ASLR stuff is enabled.

Thanks for the pointer!

-Kees

-- 
Kees Cook
Chrome OS & Brillo Security
