Message-ID: <a64d0414-652b-88c4-9155-014555a801a3@redhat.com>
Date: Mon, 9 Mar 2020 11:59:54 +0100
From: David Hildenbrand <david@...hat.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
Cc: Tyler Sanderson <tysand@...gle.com>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org, virtualization@...ts.linux-foundation.org,
Wei Wang <wei.w.wang@...el.com>,
Alexander Duyck <alexander.h.duyck@...ux.intel.com>,
David Rientjes <rientjes@...gle.com>,
Nadav Amit <namit@...are.com>, Michal Hocko <mhocko@...nel.org>
Subject: Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM

On 09.03.20 11:14, Michael S. Tsirkin wrote:
> On Mon, Mar 09, 2020 at 10:03:14AM +0100, David Hildenbrand wrote:
>> On 08.03.20 05:47, Tyler Sanderson wrote:
>>> Tested-by: Tyler Sanderson <tysand@...gle.com>
>>>
>>> Test setup: VM with 16 CPU, 64GB RAM. Running Debian 10. We have a 42
>>> GB file full of random bytes that we continually cat to /dev/null.
>>> This fills the page cache as the file is read. Meanwhile we trigger
>>> the balloon to inflate, with a target size of 53 GB. This setup causes
>>> the balloon inflation to pressure the page cache as the page cache is
>>> also trying to grow. Afterwards we shrink the balloon back to zero (so
>>> total deflate = total inflate).
>>>
>>> Without patch (kernel 4.19.0-5):
>>> Inflation never reaches the target until we stop the "cat file >
>>> /dev/null" process. Total inflation time was 542 seconds. The longest
>>> period that made no net forward progress was 315 seconds (see attached
>>> graph).
>>> Result of "grep balloon /proc/vmstat" after the test:
>>> balloon_inflate 154828377
>>> balloon_deflate 154828377
>>>
>>> With patch (kernel 5.6.0-rc4+):
>>> Total inflation duration was 63 seconds. No deflate-queue activity
>>> occurs when pressuring the page-cache.
>>> Result of "grep balloon /proc/vmstat" after the test:
>>> balloon_inflate 12968539
>>> balloon_deflate 12968539
>>>
>>> Conclusion: This patch fixes the issue. In the test it reduced
>>> inflate/deflate activity by 12x, and reduced inflation time by 8.6x.
>>> But more importantly, if we hadn't killed the "cat file > /dev/null"
>>> process then, without the patch, the inflation process would never
>>> reach the target.
>>>
>>> Attached is a png of a graph showing the problematic behavior without
>>> this patch. It shows deflate-queue activity increasing linearly while
>>> balloon size stays constant over the course of more than 8 minutes of
>>> the test.
>>
>> Thanks a lot for the extended test!
>
>
> Given we shipped this for a long time, I think the best way
> to make progress is to merge 1/3, 2/3 right now, and 3/3
> in the next release.
Agreed.
--
Thanks,
David / dhildenb