Date:	Thu, 13 Aug 2009 23:16:26 +0200
From:	Johannes Weiner <hannes@...xchg.org>
To:	Avi Kivity <avi@...hat.com>
Cc:	Rik van Riel <riel@...hat.com>,
	Wu Fengguang <fengguang.wu@...el.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	"Dike, Jeffrey G" <jeffrey.g.dike@...el.com>,
	"Yu, Wilfred" <wilfred.yu@...el.com>,
	"Kleen, Andi" <andi.kleen@...el.com>,
	Hugh Dickins <hugh.dickins@...cali.co.uk>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Christoph Lameter <cl@...ux-foundation.org>,
	Mel Gorman <mel@....ul.ie>,
	LKML <linux-kernel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>
Subject: Re: [RFC] respect the referenced bit of KVM guest pages?

On Thu, Aug 13, 2009 at 10:12:01PM +0300, Avi Kivity wrote:
> On 08/13/2009 07:26 PM, Rik van Riel wrote:
> >>Why do we need to ignore the referenced bit in such cases?  To avoid 
> >>overscanning?
> >
> >
> >Because swapping out anonymous pages tends to be a relatively
> >rare operation, we'll have many gigabytes of anonymous pages
> >that all have the referenced bit set (because there was lots
> >of time between swapout bursts).
> >
> >Ignoring the referenced bit on active anon pages makes no
> >difference on these systems, because all active anon pages
> >have the referenced bit set, anyway.
> >
> >All we need to do is put the pages on the inactive list and
> >give them a chance to get referenced.
> >
> >However, on smaller systems (and cgroups!), the speed at
> >which we can do pageout IO is larger, compared to the amount
> >of memory.  This means we can cycle through the pages more
> >quickly and we may want to count references on the active
> >list, too.
> >
> >Yes, on smaller systems we'll also often end up with bursty
> >swapout loads and all pages referenced - but since we have
> >fewer pages to begin with, it won't hurt as much.
> >
> >I suspect that an inactive_ratio of 3 or 4 might make a
> >good cutoff value.
> >
> 
> Thanks for the explanation.  I think my earlier idea of
> 
> - do not ignore the referenced bit
> - if you see a run of N pages which all have the referenced bit set, do 
> swap one
> 
> has merit.  It means we cycle more quickly (by a factor of N) through 
> the list, looking for unreferenced pages.  If we don't find any we've 
> spent some more CPU, but if we do find an unreferenced page, we win by 
> swapping a truly unneeded page.

But it also means destroying the LRU order.

Okay, we ignore the referenced bit, but we keep LRU buddies together
which then get reactivated together as well, if they are indeed in
active use.

I could imagine the VM going nuts when you separate them by a
predicate that is rather unrelated to the pages' actual usage
patterns.

After all, the list order is the primary input to selecting pages for
eviction.
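
To make the concern concrete, here is a toy model of the two policies.
It is only a sketch and not kernel code: the list is a plain Python list
with the oldest page first, and the names scan_run_of_n,
scan_ignore_referenced, n and want are made up for illustration.

```python
from typing import List, Tuple

# (page name, referenced bit); index 0 is the oldest page on the list.
Page = Tuple[str, bool]


def scan_run_of_n(pages: List[Page], n: int, want: int) -> List[str]:
    """Hypothetical model of the run-of-N proposal.

    Walk from the cold end, evict every unreferenced page, and evict one
    page anyway once n referenced pages have been seen in a row.
    """
    evicted: List[str] = []
    run = 0
    for name, referenced in pages:
        if len(evicted) == want:
            break
        if not referenced:
            evicted.append(name)
            run = 0
            continue
        run += 1
        if run == n:
            # A full run of referenced pages: swap this one regardless.
            evicted.append(name)
            run = 0
    return evicted


def scan_ignore_referenced(pages: List[Page], want: int) -> List[str]:
    """Model of ignoring the referenced bit: evict strictly in list order."""
    return [name for name, _ in pages[:want]]


# a, b and c are LRU neighbours that are all in active use.
pages = [("a", True), ("b", True), ("c", True),
         ("d", False), ("e", True), ("f", False)]

by_run = scan_run_of_n(pages, n=3, want=3)
by_order = scan_ignore_referenced(pages, want=3)
print(by_run)    # ['c', 'd', 'f']
print(by_order)  # ['a', 'b', 'c']
```

In this toy run the run-of-N scan does pick up the unreferenced pages d
and f, but it also evicts c while its neighbours a and b stay behind,
so the group no longer moves through the lists together.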

It would need actual testing, of course, but I bet Rik's idea of using
the referenced bit always or never is going to show better results.

	Hannes
