Message-Id: <1217230570.6331.6.camel@twins>
Date:	Mon, 28 Jul 2008 09:36:10 +0200
From:	Peter Zijlstra <peterz@...radead.org>
To:	Jeremy Fitzhardinge <jeremy@...p.org>
Cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	Rik van Riel <riel@...hat.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Virtualization Mailing List <virtualization@...ts.osdl.org>,
	Linux Memory Management List <linux-mm@...ck.org>
Subject: Re: How to get a sense of VM pressure

On Fri, 2008-07-25 at 10:55 -0700, Jeremy Fitzhardinge wrote:
> I'm thinking about ways to improve the Xen balloon driver.  This is the 
> driver which allows the guest domain to expand or contract by either 
> asking for more memory from the hypervisor, or giving unneeded memory 
> back.  From the kernel's perspective, it simply looks like a driver 
> which allocates and frees pages; when it allocates memory it gives the 
> underlying physical page back to the hypervisor.  And conversely, when 
> it gets a page from the hypervisor, it glues it under a given pfn and 
> releases that page back to the kernel for reuse.
> 
> At the moment it's very dumb, and is pure mechanism.  It's told how much 
> memory to target, and it either allocates or frees memory until the 
> target is reached.  Unfortunately, that means if it's asked to shrink to 
> an unreasonably small size, it will do so without question, killing the 
> domain in a thrash-storm in the process.
> 
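[The pure mechanism described above can be sketched in userspace. This is a hedged illustration, not the real driver: the actual code lives in drivers/xen/balloon.c and exchanges struct pages with the hypervisor via hypercalls; all names here are hypothetical stand-ins.]

```c
#include <stdio.h>

/* Stand-in for the guest's current page count; the real driver
 * tracks lists of struct page and issues hypercalls. */
static long current_pages = 1024;

/* Inflate the balloon: allocate a guest page and hand the underlying
 * physical page back to the hypervisor (guest shrinks). */
static void balloon_inflate_one(void) { current_pages--; }

/* Deflate the balloon: receive a page from the hypervisor, glue it
 * under a pfn, and free it to the kernel (guest grows). */
static void balloon_deflate_one(void) { current_pages++; }

/* The "dumb" pure mechanism: drive current_pages toward target with
 * no sanity check on how small target is and no rate limiting --
 * exactly the two problems listed above. */
static void balloon_set_target(long target)
{
    while (current_pages > target)
        balloon_inflate_one();
    while (current_pages < target)
        balloon_deflate_one();
}
```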
> There are several problems:
> 
>    1. it doesn't know what a reasonable lower limit is, and
>    2. it doesn't moderate the rate of shrinkage to give the rest of the
>       VM time to adjust to having less memory (by paging out, dropping
>       inactive pages, etc.)
> 
> And possibly the third point is that the only mechanism it has for 
> applying memory pressure to the system is by allocating memory.  It 
> allocates with (GFP_HIGHUSER | __GFP_NOWARN | __GFP_NORETRY | 
> __GFP_NOMEMALLOC), trying not to steal memory away from things that 
> really need it.  But in practice, it can still easily drive the machine 
> into a massive unrecoverable swap storm.
> 
> So I guess what I need is some measurement of "memory use" which is 
> perhaps akin to a system-wide RSS; a measure of the number of pages 
> being actively used, that if non-resident would cause a large amount of 
> paging.  If you shrink the domain down to that number of pages + some 
> padding (x%?), then the system will run happily in a stable state.  If 
> that number increases, then the system will need new memory soon, to 
> stop it from thrashing.  And if that number goes way below the domain's 
> actual memory allocation, then it has "too much" memory.
> 
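[The sizing policy proposed above — working set plus x% padding, clamped to sane bounds — reduces to a small calculation. A hypothetical sketch, with illustrative names; what to use as "active_pages" is exactly the open question in this mail.]

```c
/* Policy layered on the balloon mechanism: pick a target from a
 * measured working-set size plus pad_pct percent slack, clamped so
 * we never shrink below a survivable floor (problem 1 above) nor
 * exceed the domain's actual allocation. */
long balloon_target_pages(long active_pages, int pad_pct,
                          long hard_min, long hard_max)
{
    long target = active_pages + active_pages * pad_pct / 100;

    if (target < hard_min)
        target = hard_min;
    if (target > hard_max)
        target = hard_max;
    return target;
}
```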
> Is this what "Active" accounts for?  Is Active just active 
> usermode/pagecache pages, or does it also include kernel allocations?  
> Presumably Inactive Clean memory can be freed very easily with little 
> impact on the system, Inactive Dirty memory isn't needed but needs IO to 
> free; is there some way to measure how big each class of memory is?
> 
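[The Active/Inactive counters asked about are exported per-system in /proc/meminfo (and per-zone in /proc/zoneinfo); whether kernel allocations are included, and the clean/dirty split, depend on the kernel version and Rik's split-LRU work. A minimal userspace helper for pulling one field out of a meminfo-style buffer — hedged sketch, not kernel code:]

```c
#include <stdio.h>
#include <string.h>

/* Return the value in kB of a /proc/meminfo-style field, or -1 if
 * absent.  'buf' holds the file contents; 'key' is e.g. "Active"
 * or "Inactive".  Matching key + ':' avoids confusing "Active"
 * with "Active(anon)" on kernels that split the counters. */
static long meminfo_kb(const char *buf, const char *key)
{
    const char *p = buf;
    size_t klen = strlen(key);

    while (p && *p) {
        if (strncmp(p, key, klen) == 0 && p[klen] == ':') {
            long kb;
            if (sscanf(p + klen + 1, " %ld", &kb) == 1)
                return kb;
        }
        p = strchr(p, '\n');   /* advance to the next line */
        if (p)
            p++;
    }
    return -1;
}
```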
> If you wanted to apply gentle memory pressure on the system to attempt 
> to accelerate freeing memory, how would you go about doing that?  Would 
> simply allocating memory at a controlled rate achieve it?
> 
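[Allocating at a controlled rate, as asked, might look like the following userspace sketch. An in-kernel version would use alloc_page() with the GFP flags quoted earlier and msleep() between batches; everything here (names, parameters) is illustrative, and whether pausing actually gives reclaim enough time is the open question.]

```c
#include <stdlib.h>
#include <unistd.h>

/* Apply "gentle" pressure: grab pages_per_step pages, then pause
 * pause_ms milliseconds so reclaim can page out or drop caches.
 * Allocated pages are stashed in held[] (capacity held_cap) so the
 * caller can release them later.  Returns pages actually taken;
 * backs off immediately on allocation failure. */
long apply_pressure(long total_pages, long pages_per_step,
                    unsigned int pause_ms, void **held, long held_cap)
{
    const long page_sz = 4096;
    long taken = 0;

    while (taken < total_pages && taken < held_cap) {
        long i;
        for (i = 0; i < pages_per_step &&
                    taken < total_pages && taken < held_cap; i++) {
            void *p = malloc(page_sz);
            if (!p)                    /* failure: stop pushing */
                return taken;
            held[taken++] = p;
        }
        usleep(pause_ms * 1000);       /* let the VM adjust */
    }
    return taken;
}
```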
> I guess it also gets more complex when you bring nodes and zones into 
> the picture.  Does it mean that this computation would need to be done 
> per node+zone rather than system-wide?
> 
> Or is there some better way to implement all this?

Have a peek at this:

  http://people.redhat.com/~riel/riel-OLS2006.pdf

The refault patches have been posted several times, but nobody really
tried to use them for your problem.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
