Message-ID: <20161008170312.GC24225@sharon>
Date: Sun, 9 Oct 2016 01:03:12 +0800
From: Chen Yu <yu.c.chen@...el.com>
To: joeyli <jlee@...e.com>
Cc: linux-pm@...r.kernel.org, "Rafael J. Wysocki" <rjw@...ysocki.net>,
Pavel Machek <pavel@....cz>,
Thomas Gleixner <tglx@...utronix.de>,
Ingo Molnar <mingo@...hat.com>,
"H. Peter Anvin" <hpa@...or.com>, x86@...nel.org,
linux-kernel@...r.kernel.org,
"Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
Borislav Petkov <bp@...en8.de>
Subject: Re: [PATCH][v11] PM / hibernate: Verify the consistent of e820
memory map by md5 digest
Hi Joey,
On Sat, Oct 08, 2016 at 12:31:08AM +0800, joeyli wrote:
> Hi Chen Yu,
>
> On Sun, Sep 25, 2016 at 12:17:57PM +0800, Chen Yu wrote:
> > On some platforms, a panic is occasionally triggered when trying to
> > resume from hibernation. A typical panic looks like:
> >
> > "BUG: unable to handle kernel paging request at ffff880085894000
> > IP: [<ffffffff810c5dc2>] load_image_lzo+0x8c2/0xe70"
> >
> > Investigation carried out by Lee Chun-Yi shows that this is because
> > the e820 map has been changed by the BIOS across hibernation, and one
> > of the page frames from the suspend kernel is located in a region that
> > the restore kernel has not mapped, so the kernel panics when it
> > accesses the unmapped address.
> >
>
> Sorry, I can no longer get hold of the machine that showed the issue,
> so for testing I added a patch that fools the kernel into seeing a
> changed e820 map on S4 resume.
>
> > In order to expose this issue earlier, the md5 hash of the e820 map
> > is passed from the suspend kernel to the restore kernel, and the
> > restore kernel terminates the resume process once it finds that the
> > md5 hashes are not the same.
> >
> [...snip]
> > ---
> > arch/x86/power/hibernate_64.c | 92 ++++++++++++++++++++++++++++++++++++++++++-
> > 1 file changed, 90 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/power/hibernate_64.c b/arch/x86/power/hibernate_64.c
> > index 9634557..d81b1af 100644
> > --- a/arch/x86/power/hibernate_64.c
> > +++ b/arch/x86/power/hibernate_64.c
> > @@ -11,6 +11,10 @@
> > #include <linux/gfp.h>
> > #include <linux/smp.h>
> > #include <linux/suspend.h>
> > +#include <linux/scatterlist.h>
> > +#include <linux/kdebug.h>
>
> [...snip]
>
> > @@ -216,5 +297,12 @@ int arch_hibernation_header_restore(void *addr)
> > restore_jump_address = rdr->jump_address;
> > jump_address_phys = rdr->jump_address_phys;
> > restore_cr3 = rdr->cr3;
> > - return (rdr->magic == RESTORE_MAGIC) ? 0 : -EINVAL;
> > +
> > + if (rdr->magic != RESTORE_MAGIC)
> > + return -EINVAL;
> > +
> > + if (hibernation_e820_mismatch(rdr->e820_digest))
> > + return -ENODEV;
> > +
> > + return 0;
> > }
> > --
>
> Because the check_image_kernel() function doesn't check the returned
> error code, the kernel only shows "PM: Image mismatch: architecture
> specific data". That message covers two different failure reasons.
>
> I suggest that it prints out a log like the restore function in ARM64
> architecture. Something like this, please feel free to modify the
> wording:
>
> Index: linux/arch/x86/power/hibernate_64.c
> ===================================================================
> --- linux.orig/arch/x86/power/hibernate_64.c
> +++ linux/arch/x86/power/hibernate_64.c
> @@ -298,11 +298,16 @@ int arch_hibernation_header_restore(void
> jump_address_phys = rdr->jump_address_phys;
> restore_cr3 = rdr->cr3;
>
> - if (rdr->magic != RESTORE_MAGIC)
> +
> + if (rdr->magic != RESTORE_MAGIC) {
> + pr_crit("Hibernate image not generated by this kernel!\n");
> return -EINVAL;
> + }
>
> - if (hibernation_e820_mismatch(rdr->e820_digest))
> + if (hibernation_e820_mismatch(rdr->e820_digest)) {
> + pr_crit("The e820 saved regions changed!\n");
> return -ENODEV;
> + }
>
> return 0;
> }
>
OK, I will refresh the patch after 4.9-rc1 is released, because of a
recent e820 modification.
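For illustration only, the two checks and their separate log messages
from the diff above can be sketched in userspace C. This is a rough
sketch under stated assumptions, not the kernel code: struct mem_region,
struct image_header, map_digest() and header_restore() are hypothetical
names, the RESTORE_MAGIC value is a placeholder, fprintf() stands in
for pr_crit(), and a 64-bit FNV-1a hash is swapped in for md5 so that
the example needs no crypto library.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder value for this sketch; not taken from the patch. */
#define RESTORE_MAGIC 0x0123456789ABCDEFULL

/* Hypothetical stand-in for one e820 entry. */
struct mem_region {
	uint64_t addr;
	uint64_t size;
	uint32_t type;
};

/* Hypothetical stand-in for the arch-specific image header. */
struct image_header {
	uint64_t magic;
	uint64_t map_digest;	/* digest stored by the suspend side */
};

/* 64-bit FNV-1a, used here in place of md5. */
static uint64_t fnv1a(uint64_t h, const void *data, size_t len)
{
	const unsigned char *p = data;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 1099511628211ULL;
	}
	return h;
}

/* Hash field by field so struct padding never enters the digest. */
static uint64_t map_digest(const struct mem_region *map, size_t n)
{
	uint64_t h = 14695981039346656037ULL;

	for (size_t i = 0; i < n; i++) {
		h = fnv1a(h, &map[i].addr, sizeof(map[i].addr));
		h = fnv1a(h, &map[i].size, sizeof(map[i].size));
		h = fnv1a(h, &map[i].type, sizeof(map[i].type));
	}
	return h;
}

/* Restore side: report each failure reason with its own message. */
static int header_restore(const struct image_header *hdr,
			  const struct mem_region *cur, size_t n)
{
	if (hdr->magic != RESTORE_MAGIC) {
		fprintf(stderr, "Hibernate image not generated by this kernel!\n");
		return -EINVAL;
	}

	if (hdr->map_digest != map_digest(cur, n)) {
		fprintf(stderr, "The e820 saved regions changed!\n");
		return -ENODEV;
	}

	return 0;
}
```

The suspend side would fill in map_digest from its own memory map
before writing the header. On resume, a caller that only tests for a
non-zero return still gets a log line that tells the two failure
reasons apart.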
Thanks,
Yu