lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 2 Nov 2017 09:19:15 +0800
From:   Dongli Zhang <dongli.zhang@...cle.com>
To:     Boris Ostrovsky <boris.ostrovsky@...cle.com>,
        xen-devel@...ts.xenproject.org, linux-kernel@...r.kernel.org
Cc:     jgross@...e.com, joao.m.martins@...cle.com
Subject: Re: [PATCH v6 1/1] xen/time: do not decrease steal time after live
 migration on xen

Hi Boris,

I have received from lkp@...el.com that the prior version of patch hit issue
during compilation with aarch64-linux-gnu-gcc. I think this patch reviewed by
you would hit the same compiling issue on arm64 (there is no issue with x86_64).

-------------------------------------------------------------

1st issue:

Without including header <linux/slab.h> into driver/xen/time.c, compilation on
x86_64 works well (without any warning or error) but arm64 would hit the
following error:

drivers/xen/time.c: In function ‘xen_manage_runstate_time’:
drivers/xen/time.c:94:20: error: implicit declaration of function
‘kmalloc_array’ [-Werror=implicit-function-declaration]
   runstate_delta = kmalloc_array(num_possible_cpus(),
                    ^

drivers/xen/time.c:131:3: error: implicit declaration of function ‘kfree’
[-Werror=implicit-function-declaration]
   kfree(runstate_delta);
   ^
cc1: some warnings being treated as errors

About the 1st issue, should I submit a new patch including <linux/slab.h> or
just a incremental based on previous patch merged into your own branch
/tree?

-------------------------------------------------------------

2nd issue:

aarch64-linux-gnu-gcc expects a cast for kmalloc_array(). Is this really
necessary as I did find people casting the return type of
kmalloc/kcalloc/kmalloc_array in linux source code (e.g.,
drivers/block/virtio_blk.c). Can we just ignore this warning?

drivers/xen/time.c:94:18: warning: assignment makes pointer from integer without
a cast [-Wint-conversion]
   runstate_delta = kmalloc_array(num_possible_cpus(),
                  ^
-------------------------------------------------------------


Thank you very much!

Dongli Zhang


On 11/02/2017 03:19 AM, Boris Ostrovsky wrote:
> On 10/31/2017 09:46 PM, Dongli Zhang wrote:
>> After guest live migration on xen, steal time in /proc/stat
>> (cpustat[CPUTIME_STEAL]) might decrease because steal returned by
>> xen_steal_lock() might be less than this_rq()->prev_steal_time which is
>> derived from previous return value of xen_steal_clock().
>>
>> For instance, steal time of each vcpu is 335 before live migration.
>>
>> cpu  198 0 368 200064 1962 0 0 1340 0 0
>> cpu0 38 0 81 50063 492 0 0 335 0 0
>> cpu1 65 0 97 49763 634 0 0 335 0 0
>> cpu2 38 0 81 50098 462 0 0 335 0 0
>> cpu3 56 0 107 50138 374 0 0 335 0 0
>>
>> After live migration, steal time is reduced to 312.
>>
>> cpu  200 0 370 200330 1971 0 0 1248 0 0
>> cpu0 38 0 82 50123 500 0 0 312 0 0
>> cpu1 65 0 97 49832 634 0 0 312 0 0
>> cpu2 39 0 82 50167 462 0 0 312 0 0
>> cpu3 56 0 107 50207 374 0 0 312 0 0
>>
>> Since runstate times are cumulative and cleared during xen live migration
>> by xen hypervisor, the idea of this patch is to accumulate runstate times
>> to global percpu variables before live migration suspend. Once guest VM is
>> resumed, xen_get_runstate_snapshot_cpu() would always return the sum of new
>> runstate times and previously accumulated times stored in global percpu
>> variables. Comments before the call of HYPERVISOR_suspend() has been
>> removed as it is inaccurate. The call can return an error code (e.g.,
>> possibly -EPERM in the future).
> 
> I'd like split comment removal bit into a separate paragraph. I can do
> this when committing if you don't mind.
> 
>>
>> Similar and more severe issue would impact prior linux 4.8-4.10 as
>> discussed by Michael Las at
>> https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest,
>> which would overflow steal time and lead to 100% st usage in top command
>> for linux 4.8-4.10. A backport of this patch would fix that issue.
>>
>> References: https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
>> Signed-off-by: Dongli Zhang <dongli.zhang@...cle.com>
>>
>> ---
> 
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@...cle.com>
> 
> 

Powered by blists - more mailing lists