Message-ID: <eb8f274b-03d9-6b1c-5a4e-d004bdde2804@redhat.com>
Date:   Thu, 31 Aug 2023 10:55:29 +1000
From:   Gavin Shan <gshan@...hat.com>
To:     David Hildenbrand <david@...hat.com>,
        virtualization@...ts.linux-foundation.org
Cc:     linux-kernel@...r.kernel.org, jasowang@...hat.com, mst@...hat.com,
        xuanzhuo@...ux.alibaba.com, shan.gavin@...il.com
Subject: Re: [PATCH] virtio_balloon: Fix endless deflation and inflation on
 arm64

On 8/31/23 02:30, David Hildenbrand wrote:
> On 29.08.23 03:54, Gavin Shan wrote:
>> A deflation request to a target that isn't aligned to the
>> guest page size causes endless deflation and inflation actions. For
>> example, we receive a flood of QMP events reporting changes to the memory
>> balloon's size after a deflation request to an unaligned target is
>> sent to an ARM64 guest, where we have a 64KB base page size.
>>
>>    /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64      \
>>    -accel kvm -machine virt,gic-version=host -cpu host          \
>>    -smp maxcpus=8,cpus=8,sockets=2,clusters=2,cores=2,threads=1 \
>>    -m 1024M,slots=16,maxmem=64G                                 \
>>    -object memory-backend-ram,id=mem0,size=512M                 \
>>    -object memory-backend-ram,id=mem1,size=512M                 \
>>    -numa node,nodeid=0,memdev=mem0,cpus=0-3                     \
>>    -numa node,nodeid=1,memdev=mem1,cpus=4-7                     \
>>      :                                                          \
>>    -device virtio-balloon-pci,id=balloon0,bus=pcie.10
>>
>>    { "execute" : "balloon", "arguments": { "value" : 1073672192 } }
>>    {"return": {}}
>>    {"timestamp": {"seconds": 1693272173, "microseconds": 88667},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
>>    {"timestamp": {"seconds": 1693272174, "microseconds": 89704},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
>>    {"timestamp": {"seconds": 1693272175, "microseconds": 90819},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
>>    {"timestamp": {"seconds": 1693272176, "microseconds": 91961},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
>>    {"timestamp": {"seconds": 1693272177, "microseconds": 93040},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}}
>>    {"timestamp": {"seconds": 1693272178, "microseconds": 94117},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}}
>>    {"timestamp": {"seconds": 1693272179, "microseconds": 95337},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
>>    {"timestamp": {"seconds": 1693272180, "microseconds": 96615},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}}
>>    {"timestamp": {"seconds": 1693272181, "microseconds": 97626},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
>>    {"timestamp": {"seconds": 1693272182, "microseconds": 98693},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}}
>>    {"timestamp": {"seconds": 1693272183, "microseconds": 99698},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
>>    {"timestamp": {"seconds": 1693272184, "microseconds": 100727},  \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
>>    {"timestamp": {"seconds": 1693272185, "microseconds": 90430},   \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
>>    {"timestamp": {"seconds": 1693272186, "microseconds": 102999},  \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073676288}}
>>       :
>>    <The similar QMP events repeat>
>>
>> Fix it by having the target aligned to the guest page size, 64KB
>> in this specific case. With this applied, no flood of QMP events
>> is observed and the memory balloon's size stabilizes at
>> 0x3ffe0000 soon after the deflation request is sent.
>>
>>    { "execute" : "balloon", "arguments": { "value" : 1073672192 } }
>>    {"return": {}}
>>    {"timestamp": {"seconds": 1693273328, "microseconds": 793075},  \
>>     "event": "BALLOON_CHANGE", "data": {"actual": 1073610752}}
>>    { "execute" : "query-balloon" }
>>    {"return": {"actual": 1073610752}}
>>
>> Signed-off-by: Gavin Shan <gshan@...hat.com>
>> ---
>>   drivers/virtio/virtio_balloon.c | 13 ++++++++++++-
>>   1 file changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
>> index 5b15936a5214..625caac35264 100644
>> --- a/drivers/virtio/virtio_balloon.c
>> +++ b/drivers/virtio/virtio_balloon.c
>> @@ -386,6 +386,17 @@ static void stats_handle_request(struct virtio_balloon *vb)
>>       virtqueue_kick(vq);
>>   }
>> +static inline s64 align_pages_up(s64 diff)
>> +{
>> +    if (diff == 0)
>> +        return diff;
>> +
>> +    if (diff > 0)
>> +        return ALIGN(diff, VIRTIO_BALLOON_PAGES_PER_PAGE);
>> +
>> +    return -ALIGN(-diff, VIRTIO_BALLOON_PAGES_PER_PAGE);
>> +}
>> +
>>   static inline s64 towards_target(struct virtio_balloon *vb)
>>   {
>>       s64 target;
>> @@ -396,7 +407,7 @@ static inline s64 towards_target(struct virtio_balloon *vb)
>>               &num_pages);
>>       target = num_pages;
>> -    return target - vb->num_pages;
> 
> We know that vb->num_pages is always a multiple of VIRTIO_BALLOON_PAGES_PER_PAGE.
> 
> Why not simply align target down?
> 
> target = ALIGN(num_pages, VIRTIO_BALLOON_PAGES_PER_PAGE);
> return target - vb->num_pages;
> 

Good point. Thanks a lot, David. The code will be changed to what's suggested in
v2, to be posted soon. I will also add a comment to explain it a bit. Note that ALIGN()
aligns up rather than down. That is intentional: it biases towards deflation, to avoid
overrunning the machine's memory size if it's not aligned to 64KB. Furthermore,
aligning up triggers deflation even if the user requests only a 4KB diff, whereas
ALIGN_DOWN(4KB, 64KB) is zero and no deflation would be triggered.

Thanks,
Gavin
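
The arithmetic discussed above can be sketched as a small user-space model.
This is only an illustration, not the driver code: align_up(), align_down(),
balloon_step(), settles() and guest_bytes() are hypothetical helpers standing
in for the kernel's ALIGN()/ALIGN_DOWN() and for the fill/leak paths. The
model assumes that the balloon moves whole 64KB guest pages (16 balloon pages
of 4KB each) and that the QMP "value" maps to a balloon target of
(RAM size - value) / 4KB, which is consistent with the numbers in the logs
above: 1GiB - 1073672192 bytes gives 17 balloon pages.

```c
#include <stdint.h>

/* 64KB guest page / 4KB balloon page, as in the arm64 case in this thread. */
#define PAGES_PER_PAGE 16
#define BALLOON_PAGE_SIZE 4096

/* Stand-ins for the kernel's ALIGN()/ALIGN_DOWN(); 'a' must be a power of two. */
static inline int64_t align_up(int64_t x, int64_t a)
{
    return (x + a - 1) & ~(a - 1);
}

static inline int64_t align_down(int64_t x, int64_t a)
{
    return x & ~(a - 1);
}

/*
 * Simplified model (an assumption): one inflate/deflate pass moves the
 * balloon by |diff| rounded up to whole guest pages, because a guest page
 * cannot be split.
 */
static inline int64_t balloon_step(int64_t num_pages, int64_t diff)
{
    if (diff > 0)
        return num_pages + align_up(diff, PAGES_PER_PAGE);
    if (diff < 0)
        return num_pages - align_up(-diff, PAGES_PER_PAGE);
    return num_pages;
}

/* Returns 1 if the balloon reaches the target within max_steps, else 0. */
static inline int settles(int64_t target, int64_t num_pages, int max_steps)
{
    for (int i = 0; i < max_steps; i++) {
        int64_t diff = target - num_pages;

        if (diff == 0)
            return 1;
        num_pages = balloon_step(num_pages, diff);
    }
    return target == num_pages;
}

/* Guest-visible memory in bytes for a given balloon size. */
static inline int64_t guest_bytes(int64_t ram_bytes, int64_t num_pages)
{
    return ram_bytes - num_pages * BALLOON_PAGE_SIZE;
}
```

In this model, settles(17, 32, 100) returns 0 because the balloon bounces
between 16 and 32 pages, matching the alternating BALLOON_CHANGE values in
the first log. With the target aligned up to 32 pages the balloon settles,
and guest_bytes(1073741824, 32) is 1073610752 (0x3ffe0000). A one-page (4KB)
request becomes 16 pages with align_up() but 0 with align_down(), which is
the bias towards deflation described above.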
