lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date:   Fri, 5 Nov 2021 16:41:16 -0400
From:   Boris Ostrovsky <boris.ostrovsky@...cle.com>
To:     Dongli Zhang <dongli.zhang@...cle.com>,
        xen-devel@...ts.xenproject.org, x86@...nel.org
Cc:     jgross@...e.com, sstabellini@...nel.org, tglx@...utronix.de,
        mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
        hpa@...or.com, joe.jin@...cle.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 1/1] xen: delay xen_hvm_init_time_ops() if kdump is
 boot on vcpu>=32


On 11/4/21 1:59 PM, Dongli Zhang wrote:
> Hi Boris,
>
> On 11/1/21 10:34 AM, Boris Ostrovsky wrote:
>> On 10/27/21 9:25 PM, Dongli Zhang wrote:
>>> The sched_clock() can be used very early since
>>> commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition,
>>> with commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock time from
>>> 0"), kdump kernel in Xen HVM guest may panic at very early stage when
>>> accessing &__this_cpu_read(xen_vcpu)->time as in below:
>>>
>>> setup_arch()
>>>    -> init_hypervisor_platform()
>>>        -> x86_init.hyper.init_platform = xen_hvm_guest_init()
>>>            -> xen_hvm_init_time_ops()
>>>                -> xen_clocksource_read()
>>>                    -> src = &__this_cpu_read(xen_vcpu)->time;
>>>
>>> This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
>>> embedded inside 'shared_info' during early stage until xen_vcpu_setup() is
>>> used to allocate/relocate 'vcpu_info' for boot cpu at arbitrary address.
>>>
>>> However, when Xen HVM guest panic on vcpu >= 32, since
>>> xen_vcpu_info_reset(0) would set per_cpu(xen_vcpu, cpu) = NULL when
>>> vcpu >= 32, xen_clocksource_read() on vcpu >= 32 would panic.
>>>
>>> This patch delays xen_hvm_init_time_ops() to later in
>>> xen_hvm_smp_prepare_boot_cpu() after the 'vcpu_info' for boot vcpu is
>>> registered when the boot vcpu is >= 32.
>>>
>>> Another option is to always delay xen_hvm_init_time_ops() for any vcpus
>>> (including vcpu=0). Since to delay xen_hvm_init_time_ops() may lead to
>>> clock backward issue,
>>
>> This is referring to
>> https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg01516.html I
>> assume?
> No.
>
> So far there are clock forward (e.g., from 0 to 65345) issue and clock backward
> issue (e.g., from 2.432 to 0).
>
> The clock forward issue can be resolved by above link to enforce clock update
> during vcpu info registration. However, so far I am only able to reproduce clock
> forward when "taskset -c 33 echo c > /proc/sysrq-trigger".
>
> That is, I am not able to see any clock forward drift during regular boot (on
> CPU=0), without the fix at hypervisor side.
>
> The clock backward issue is because the native clock source is used if we delay
> the initialization of xen clock source. We will see a backward when the source
> is switched from native to xen.
>
>>
>>>    it is preferred to avoid that for regular boot (The
>>> pv_sched_clock=native_sched_clock() is used at the very beginning until
>>> xen_sched_clock() is registered). That requires to adjust
>>> xen_sched_clock_offset. That's why we only delay xen_hvm_init_time_ops()
>>> for vcpu>=32.
>>
>> We delay only on VCPU>=32 because we want to avoid the clock going backwards due
>> to hypervisor problem pointed to be the link above, not because we need to
>> adjust xen_sched_clock_offset (which we could if we wanted).
> I will add that.
>
> Just to clarify that so far I am not able to reproduce the clock backward issue
> during regular boot (on CPU=0), when the fix is not available at hypervisor side.


FTR, Dongli and I had a chat and he is going to provide another version of the patch where time initialization is deferred for all vcpus.



-boris


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ