Date: Thu, 12 Dec 2019 08:18:46 +0100
From: Jürgen Groß <jgross@...e.com>
To: Nicholas Tsirakis <niko.tsirakis@...il.com>,
boris.ostrovsky@...cle.com
Cc: xen-devel <xen-devel@...ts.xenproject.org>,
linux-kernel@...r.kernel.org
Subject: Re: [BUG] Xen-ballooned memory never returned to domain after
partial-free
On 11.12.19 23:08, Nicholas Tsirakis wrote:
> Hello,
>
> The issue I'm seeing is that pages of previously-xenballooned memory are getting
> trapped in the balloon on free, specifically when they are freed in batches
> (i.e. not all at once). The first batch is restored to the domain properly, but
> subsequent frees are not.
>
> Truthfully I'm not sure if this is a bug or not, but the behavior I'm seeing
> doesn't seem to make sense. Note that this "bug" is in the balloon driver, but
> the behavior is seen when using the gnttab API, which utilizes the balloon in
> the background.
>
> ------------------------------------------------------------------------------
>
> This issue is better illustrated as an example, seen below. Note that the file
> in question is drivers/xen/balloon.c:
>
> Kernel version: 4.19.*, code seems consistent on master as well
> Relevant configs:
> - CONFIG_MEMORY_HOTPLUG not set
> - CONFIG_XEN_BALLOON_MEMORY_HOTPLUG not set
>
> * current_pages = # of pages assigned to domain
> * target_pages = # of pages we want assigned to domain
> * credit = target - current
>
> Start with current_pages/target_pages = 20 pages
>
> 1. alloc 5 pages with gnttab_alloc_pages(). current_pages = 15, credit = 5.
> 2. alloc 3 pages with gnttab_alloc_pages(). current_pages = 12, credit = 8.
> 3. some time later, free the last 3 pages with gnttab_free_pages().
> 4. 3 pages go back to balloon and balloon worker is scheduled since credit > 0.
> * Relevant part of balloon worker shown below:
>
>     do {
>         ...
>
>         credit = current_credit();
>
>         if (credit > 0) {
>             if (balloon_is_inflated())
>                 state = increase_reservation(credit);
>             else
>                 state = reserve_additional_memory();
>         }
>
>         ...
>
>     } while (credit && state == BP_DONE);
>
> 5. credit > 0 and the balloon contains 3 pages, so run increase_reservation. 3
> pages are restored to domain, correctly. current_pages = 15, credit = 5.
> 6. at this point credit is still > 0, so we loop again.
> 7. this time, the balloon has 0 pages, so we call reserve_additional_memory,
> seen below. note that CONFIG_XEN_BALLOON_MEMORY_HOTPLUG is disabled, so this
> function is very sparse.
>
>     static enum bp_state reserve_additional_memory(void)
>     {
>         balloon_stats.target_pages = balloon_stats.current_pages;
>         return BP_ECANCELED;
>     }
>
> 8. now target = current = 15, which drops our credit down to 0.

And I think this is the problem. What we want here is:

    balloon_stats.target_pages = balloon_stats.current_pages +
                                 balloon_stats.target_unpopulated;

This should fix it. Thanks for the detailed analysis!
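
To make the accounting concrete, here is a toy user-space model of the scenario above. It is only a sketch, not the kernel code: the struct, the helpers (model_alloc, model_free, model_worker, run_scenario) and the `fixed` flag are made up for illustration. It assumes that target_unpopulated grows on allocation and shrinks on free, that the worker runs synchronously, and that the remaining 5 pages are freed in a second batch after step 8.

```c
/* Toy model of the balloon accounting; all names here are hypothetical. */
struct balloon_model {
    long current_pages;      /* pages currently assigned to the domain */
    long target_pages;       /* pages we want assigned to the domain */
    long target_unpopulated; /* pages handed out and not yet freed (assumed) */
    long balloon_pages;      /* pages currently sitting in the balloon */
};

static long model_credit(const struct balloon_model *b)
{
    return b->target_pages - b->current_pages;
}

/* Allocation takes pages away from the domain (steps 1 and 2). */
static void model_alloc(struct balloon_model *b, long n)
{
    b->target_unpopulated += n;
    b->current_pages -= n;
}

/* Simplified worker loop with memory hotplug disabled. */
static void model_worker(struct balloon_model *b, int fixed)
{
    long credit;

    while ((credit = model_credit(b)) > 0) {
        if (b->balloon_pages > 0) {
            /* Balloon is inflated: give pages back to the domain. */
            long n = credit < b->balloon_pages ? credit : b->balloon_pages;

            b->balloon_pages -= n;
            b->current_pages += n;
        } else {
            /* Stand-in for reserve_additional_memory() without hotplug. */
            b->target_pages = b->current_pages +
                              (fixed ? b->target_unpopulated : 0);
            break; /* corresponds to returning BP_ECANCELED */
        }
    }
}

/* Freed pages go back to the balloon; the worker runs only if credit != 0. */
static void model_free(struct balloon_model *b, long n, int fixed)
{
    b->target_unpopulated -= n;
    b->balloon_pages += n;
    if (model_credit(b) != 0)
        model_worker(b, fixed);
}

/* Replays the reported scenario and returns the final current_pages. */
static long run_scenario(int fixed)
{
    struct balloon_model b = { 20, 20, 0, 0 };

    model_alloc(&b, 5);
    model_alloc(&b, 3);
    model_free(&b, 3, fixed); /* first batch */
    model_free(&b, 5, fixed); /* second batch */
    return b.current_pages;
}
```

In this model, run_scenario(0) ends with current_pages = 15 (the last 5 pages stay in the balloon because credit has dropped to 0), while run_scenario(1) ends with current_pages = 20.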
Does the attached patch work for you?
And are you fine with the "Reported-by:" added?
Juergen
View attachment "0001-xen-balloon-fix-ballooned-page-accounting-without-ho.patch" of type "text/x-patch" (1288 bytes)