lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200716232812.GA26338@dev-dsk-anchalag-2a-9c2d1d96.us-west-2.amazon.com>
Date:   Thu, 16 Jul 2020 23:28:12 +0000
From:   Anchal Agarwal <anchalag@...zon.com>
To:     Boris Ostrovsky <boris.ostrovsky@...cle.com>
CC:     "tglx@...utronix.de" <tglx@...utronix.de>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "bp@...en8.de" <bp@...en8.de>, "hpa@...or.com" <hpa@...or.com>,
        "x86@...nel.org" <x86@...nel.org>,
        "jgross@...e.com" <jgross@...e.com>,
        "linux-pm@...r.kernel.org" <linux-pm@...r.kernel.org>,
        "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "Kamata, Munehisa" <kamatam@...zon.com>,
        "sstabellini@...nel.org" <sstabellini@...nel.org>,
        "konrad.wilk@...cle.com" <konrad.wilk@...cle.com>,
        "roger.pau@...rix.com" <roger.pau@...rix.com>,
        "axboe@...nel.dk" <axboe@...nel.dk>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "rjw@...ysocki.net" <rjw@...ysocki.net>,
        "len.brown@...el.com" <len.brown@...el.com>,
        "pavel@....cz" <pavel@....cz>,
        "peterz@...radead.org" <peterz@...radead.org>,
        "Valentin, Eduardo" <eduval@...zon.com>,
        "Singh, Balbir" <sblbir@...zon.com>,
        "xen-devel@...ts.xenproject.org" <xen-devel@...ts.xenproject.org>,
        "vkuznets@...hat.com" <vkuznets@...hat.com>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "Woodhouse, David" <dwmw@...zon.co.uk>,
        "benh@...nel.crashing.org" <benh@...nel.crashing.org>
Subject: Re: [PATCH v2 00/11] Fix PM hibernation in Xen guests

On Wed, Jul 15, 2020 at 04:49:57PM -0400, Boris Ostrovsky wrote:
> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> 
> 
> 
> On 7/15/20 3:49 PM, Anchal Agarwal wrote:
> > On Mon, Jul 13, 2020 at 03:43:33PM -0400, Boris Ostrovsky wrote:
> >> CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
> >>
> >>
> >>
> >> On 7/10/20 2:17 PM, Agarwal, Anchal wrote:
> >>> Gentle ping on this series.
> >>
> >> Have you tested save/restore?
> >>
> > No, not with the last few series. But a good point, I will test that and get
> > back to you. Do you see anything specific in the series that suggests otherwise?
> 
> 
> root@...104> xl save pvh saved
> Saving to saved new xl format (info 0x3/0x0/1699)
> xc: info: Saving domain 3, type x86 HVM
> xc: Frames: 1044480/1044480  100%
> xc: End of stream: 0/0    0%
> root@...104> xl restore saved
> Loading new save file saved (new xl fmt info 0x3/0x0/1699)
>  Savefile contains xl domain config in JSON format
> Parsing config from <saved>
> xc: info: Found x86 HVM domain from Xen 4.13
> xc: info: Restoring domain
> xc: info: Restore successful
> xc: info: XenStore: mfn 0xfeffc, dom 0, evt 1
> xc: info: Console: mfn 0xfefff, dom 0, evt 2
> root@...104> xl console pvh
> [  139.943872] ------------[ cut here ]------------
> [  139.943872] kernel BUG at arch/x86/xen/enlighten.c:205!
> [  139.943872] invalid opcode: 0000 [#1] SMP PTI
> [  139.943872] CPU: 0 PID: 11 Comm: migration/0 Not tainted 5.8.0-rc5 #26
> [  139.943872] RIP: 0010:xen_vcpu_setup+0x16d/0x180
> [  139.943872] Code: 4a 8b 14 f5 40 c9 1b 82 48 89 d8 48 89 2c 02 8b 05
> a4 d4 40 01 85 c0 0f 85 15 ff ff ff 4a 8b 04 f5 40 c9 1b 82 e9 f4 fe ff
> ff <0f> 0b b8 ed ff ff ff e9 14 ff ff ff e8 12 4f 86 00 66 90 66 66 66
> [  139.943872] RSP: 0018:ffffc9000006bdb0 EFLAGS: 00010046
> [  139.943872] RAX: 0000000000000000 RBX: ffffc9000014fe00 RCX:
> 0000000000000000
> [  139.943872] RDX: ffff88803fc00000 RSI: 0000000000016128 RDI:
> 0000000000000000
> [  139.943872] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 0000000000000000
> [  139.943872] R10: ffffffff826174a0 R11: ffffc9000006bcb4 R12:
> 0000000000016120
> [  139.943872] R13: 0000000000016120 R14: 0000000000016128 R15:
> 0000000000000000
> [  139.943872] FS:  0000000000000000(0000) GS:ffff88803fc00000(0000)
> knlGS:0000000000000000
> [  139.943872] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  139.943872] CR2: 00007f704be8b000 CR3: 000000003901a004 CR4:
> 00000000000606f0
> [  139.943872] Call Trace:
> [  139.943872]  ? __kmalloc+0x167/0x260
> [  139.943872]  ? xen_manage_runstate_time+0x14a/0x170
> [  139.943872]  xen_vcpu_restore+0x134/0x170
> [  139.943872]  xen_hvm_post_suspend+0x1d/0x30
> [  139.943872]  xen_arch_post_suspend+0x13/0x30
> [  139.943872]  xen_suspend+0x87/0x190
> [  139.943872]  multi_cpu_stop+0x6d/0x110
> [  139.943872]  ? stop_machine_yield+0x10/0x10
> [  139.943872]  cpu_stopper_thread+0x47/0x100
> [  139.943872]  smpboot_thread_fn+0xc5/0x160
> [  139.943872]  ? sort_range+0x20/0x20
> [  139.943872]  kthread+0xfe/0x140
> [  139.943872]  ? kthread_park+0x90/0x90
> [  139.943872]  ret_from_fork+0x22/0x30
> [  139.943872] Modules linked in:
> [  139.943872] ---[ end trace 74716859a6b4f0a8 ]---
> [  139.943872] RIP: 0010:xen_vcpu_setup+0x16d/0x180
> [  139.943872] Code: 4a 8b 14 f5 40 c9 1b 82 48 89 d8 48 89 2c 02 8b 05
> a4 d4 40 01 85 c0 0f 85 15 ff ff ff 4a 8b 04 f5 40 c9 1b 82 e9 f4 fe ff
> ff <0f> 0b b8 ed ff ff ff e9 14 ff ff ff e8 12 4f 86 00 66 90 66 66 66
> [  139.943872] RSP: 0018:ffffc9000006bdb0 EFLAGS: 00010046
> [  139.943872] RAX: 0000000000000000 RBX: ffffc9000014fe00 RCX:
> 0000000000000000
> [  139.943872] RDX: ffff88803fc00000 RSI: 0000000000016128 RDI:
> 0000000000000000
> [  139.943872] RBP: 0000000000000000 R08: 0000000000000000 R09:
> 0000000000000000
> [  139.943872] R10: ffffffff826174a0 R11: ffffc9000006bcb4 R12:
> 0000000000016120
> [  139.943872] R13: 0000000000016120 R14: 0000000000016128 R15:
> 0000000000000000
> [  139.943872] FS:  0000000000000000(0000) GS:ffff88803fc00000(0000)
> knlGS:0000000000000000
> [  139.943872] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  139.943872] CR2: 00007f704be8b000 CR3: 000000003901a004 CR4:
> 00000000000606f0
> [  139.943872] Kernel panic - not syncing: Fatal exception
> [  139.943872] Shutting down cpus with NMI
> [  143.927559] Kernel Offset: disabled
> root@...104>
>
I think I may have found a bug. There were no issues with V1 version  that I
send however, there were issues with V2. I tested both series and found xl
save/restore to be working in V1 but not in V2. I should have tested it.
Anyways, looks the issue is coming from executing syscore ops registered for
hibernation use case during call to xen_suspend. 
I remember your comment from earlier where you did ask why we need to
check xen_suspend mode xen_syscore_suspend [patch-004] and I removed that based
on my theoretical understanding of your suggestion that since lock_system_sleep() lock
is taken, we cannot initialize hibernation. I skipped to check the part in the
code where during xen_suspend(), all registered syscore_ops suspend callbacks are
called. Hence the ones registered for PM hibernation will also be called.
With no check there on suspend mode, it fails to return from the function and
they never should be executed in case of xen suspend.
I will revert a part of that check in Patch-004 from V1 and send an updated patch with
the fix.

Thanks,
Anchal

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ