Message-ID: <2794895b-6417-7164-8417-0f1edc52ae10@oracle.com>
Date:   Tue, 12 Oct 2021 08:50:17 -0700
From:   Dongli Zhang <dongli.zhang@...cle.com>
To:     Juergen Gross <jgross@...e.com>, xen-devel@...ts.xenproject.org
Cc:     linux-kernel@...r.kernel.org, x86@...nel.org,
        boris.ostrovsky@...cle.com, sstabellini@...nel.org,
        tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
        andrew.cooper3@...rix.com, george.dunlap@...rix.com,
        iwj@...project.org, jbeulich@...e.com, julien@....org, wl@....org,
        joe.jin@...cle.com
Subject: Re: [PATCH 0/2] Fix the Xen HVM kdump/kexec boot panic issue

Hi Juergen,

On 10/12/21 1:47 AM, Juergen Gross wrote:
> On 12.10.21 09:24, Dongli Zhang wrote:
>> When kdump/kexec is enabled on the HVM VM side, a kernel panic traps to
>> the Xen side with reason=soft_reset. As a result, Xen reboots the VM into
>> the kdump kernel.
>>
>> Unfortunately, when the VM is panicked with the below command line ...
>>
>> "taskset -c 33 echo c > /proc/sysrq-trigger"
>>
>> ... the kdump kernel panics at an early stage ...
>>
>> PANIC: early exception 0x0e IP 10:ffffffffa8c66876 error 0 cr2 0x20
>> [    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-rc5xen #1
>> [    0.000000] Hardware name: Xen HVM domU
>> [    0.000000] RIP: 0010:pvclock_clocksource_read+0x6/0xb0
>> ... ...
>> [    0.000000] RSP: 0000:ffffffffaa203e20 EFLAGS: 00010082 ORIG_RAX:
>> 0000000000000000
>> [    0.000000] RAX: 0000000000000003 RBX: 0000000000010000 RCX: 00000000ffffdfff
>> [    0.000000] RDX: 0000000000000003 RSI: 00000000ffffdfff RDI: 0000000000000020
>> [    0.000000] RBP: 0000000000011000 R08: 0000000000000000 R09: 0000000000000001
>> [    0.000000] R10: ffffffffaa203e00 R11: ffffffffaa203c70 R12: 0000000040000004
>> [    0.000000] R13: ffffffffaa203e5c R14: ffffffffaa203e58 R15: 0000000000000000
>> [    0.000000] FS:  0000000000000000(0000) GS:ffffffffaa95e000(0000)
>> knlGS:0000000000000000
>> [    0.000000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [    0.000000] CR2: 0000000000000020 CR3: 00000000ec9e0000 CR4: 00000000000406a0
>> [    0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [    0.000000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [    0.000000] Call Trace:
>> [    0.000000]  ? xen_init_time_common+0x11/0x55
>> [    0.000000]  ? xen_hvm_init_time_ops+0x23/0x45
>> [    0.000000]  ? xen_hvm_guest_init+0x214/0x251
>> [    0.000000]  ? 0xffffffffa8c00000
>> [    0.000000]  ? setup_arch+0x440/0xbd6
>> [    0.000000]  ? start_kernel+0x6a/0x689
>> [    0.000000]  ? secondary_startup_64_no_verify+0xc2/0xcb
>>
>> This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
>> structures embedded inside 'shared_info' during the early boot stage,
>> until xen_vcpu_setup() allocates/relocates the 'vcpu_info' for the boot
>> cpu at an arbitrary address.
>>
>>
>> The 1st patch fixes the issue on the VM kernel side. However, we may
>> still observe clock drift on the VM side due to an issue on the Xen
>> hypervisor side: the pv vcpu_time_info is not updated on
>> VCPUOP_register_vcpu_info.
>>
>> The 2nd patch calls force_update_vcpu_system_time() on the Xen side
>> during VCPUOP_register_vcpu_info, to avoid VM clock drift during kdump
>> kernel boot.
> 
> Please don't mix patches for multiple projects in one series.
> 
> In cases like this it is fine to mention the other project's patch
> verbally instead.
> 

I will split the patchset in v2 and email the parts to the different projects.

The core ideas of this combined patchset are:

1. Fix at HVM domU side (kdump kernel panic)

2. Fix at Xen hypervisor side (clock drift issue in kdump kernel)

3. To report (and seek help with) the fact that soft_reset does not work with
mainline Xen, so I am not able to test my patchset against the most recent
mainline Xen.

Thank you very much!

Dongli Zhang
