linux-kernel - Re: [RFC/T/D][PATCH 2/2] Linux/Guest cooperative unmapped page cache control

Message-ID: <4C18B7D6.5070300@redhat.com>
Date:	Wed, 16 Jun 2010 14:39:02 +0300
From:	Avi Kivity <avi@...hat.com>
To:	Dave Hansen <dave@...ux.vnet.ibm.com>
CC:	balbir@...ux.vnet.ibm.com, kvm <kvm@...r.kernel.org>,
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC/T/D][PATCH 2/2] Linux/Guest cooperative unmapped page cache
 control

On 06/15/2010 05:47 PM, Dave Hansen wrote:
>
>> That's a bug that needs to be fixed.  Eventually the host will come
>> under pressure and will balloon the guest.  If that kills the guest, the
>> ballooning is not effective as a host memory management technique.
>>      
> I'm not convinced that it's just a bug that can be fixed.  Consider a
> case where a host sees a guest with 100MB of free memory at the exact
> moment that a database app sees that memory.  The host tries to balloon
> that memory away at the same time that the app goes and allocates it.
> That can certainly lead to an OOM very quickly, even for very small
> amounts of memory (much less than 100MB).  Where's the bug?
>
> I think the issues are really fundamental to ballooning.
>    

There are two issues involved.

One is, can the kernel accurately determine the amount of memory it 
needs to work?  We have resources such as RAM and swap.  We have 
liabilities in the form of swappable userspace memory, mlocked userspace 
memory, kernel memory to support these, and various reclaimable and 
non-reclaimable kernel caches.  Can we determine the minimum amount of 
RAM to support a workload at a point in time?

If we had this, we could modify the balloon to refuse to balloon if it 
takes the kernel beneath the minimum amount of RAM needed.
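
A minimal sketch of such a guard, assuming the guest could report the 
quantities listed above.  The names (GuestMemory, min_ram, 
balloon_inflate) and the formula for the minimum are hypothetical 
illustrations, not an existing kernel or virtio-balloon interface; 
reclaimable caches are simply treated as free.

```python
from dataclasses import dataclass


@dataclass
class GuestMemory:
    """Hypothetical snapshot of guest memory; all values in MiB."""
    ram: int                   # RAM currently available to the guest
    swap: int                  # configured swap space
    swappable_user: int        # userspace memory that can be swapped out
    mlocked_user: int          # userspace memory pinned in RAM
    kernel_unreclaimable: int  # kernel memory and non-reclaimable caches

    def min_ram(self) -> int:
        """Estimate the least RAM that still supports the workload."""
        # Pinned memory can never leave RAM.
        pinned = self.mlocked_user + self.kernel_unreclaimable
        # Swappable memory that does not fit in swap must stay in RAM too.
        overflow = max(0, self.swappable_user - self.swap)
        return pinned + overflow


def balloon_inflate(guest: GuestMemory, request: int) -> int:
    """Grant at most the headroom above min_ram; return MiB granted."""
    headroom = max(0, guest.ram - guest.min_ram())
    granted = min(request, headroom)
    guest.ram -= granted
    return granted
```

With this model, a request larger than the headroom is granted only 
partially, and a request made when the guest is already at its minimum 
is refused outright.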

In fact, this is similar to allocating memory with overcommit_memory = 
0.  The difference is the balloon allocates mlocked memory, while normal 
allocations can be charged against swap.  But fundamentally it's the same.

>>> If all the guests do this, then it leaves that much more free memory on
>>> the host, which can be used flexibly for extra host page cache, new
>>> guests, etc...
>>>        
>> If the host detects lots of pagecache misses it can balloon guests
>> down.  If pagecache is quiet, why change anything?
>>      
> Page cache misses alone are not really sufficient.  This is the classic
> problem where we try to differentiate streaming I/O (which we can't
> effectively cache) from I/O which can be effectively cached.
>    

True.  Random I/O across a very large dataset is also difficult to cache.
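
A toy LRU simulation illustrates why the miss count alone says little.  
The cache size, dataset size and access patterns below are arbitrary 
assumptions chosen for the example, and a plain LRU is only a rough 
stand-in for real page cache replacement.

```python
import random
from collections import OrderedDict


def lru_hit_rate(accesses, cache_size):
    """Replay block accesses through an LRU cache; return the hit ratio."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for block in accesses:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


CACHE_BLOCKS = 100
DATASET_BLOCKS = 10_000

# Streaming: two sequential passes over a dataset far larger than the cache.
streaming = list(range(DATASET_BLOCKS)) * 2
# Hot set: repeated access to 50 blocks, which fit in the cache.
hot_set = [i % 50 for i in range(5_000)]
# Random: uniform access across the whole large dataset (fixed seed).
rng = random.Random(0)
random_io = [rng.randrange(DATASET_BLOCKS) for _ in range(5_000)]

streaming_rate = lru_hit_rate(streaming, CACHE_BLOCKS)
hot_rate = lru_hit_rate(hot_set, CACHE_BLOCKS)
random_rate = lru_hit_rate(random_io, CACHE_BLOCKS)
```

The streaming and random workloads both miss almost all the time, yet 
giving them more cache would help little unless the cache approached 
the dataset size, whereas the hot-set workload hits nearly always with 
a small cache.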

>> If the host wants to start new guests, it can balloon guests down.  If
>> no new guests are wanted, why change anything?
>>      
> We're talking about an environment which we're always trying to
> optimize.  Imagine that we're always trying to consolidate guests on to
> smaller numbers of hosts.  We're effectively in a state where we
> _always_ want new guests.
>    

If this came at no cost to the guests, you'd be right.  But at some 
point guest performance will be hit by this, so the advantage gained 
from freeing memory will be balanced by the disadvantage.

Also, memory is not the only resource.  At some point you become cpu 
bound; at that point freeing memory doesn't help and in fact may 
increase your cpu load.

-- 
error compiling committee.c: too many arguments to function
