Message-ID: <20100315091720.GC18054@balbir.in.ibm.com>
Date:	Mon, 15 Mar 2010 14:47:20 +0530
From:	Balbir Singh <balbir@...ux.vnet.ibm.com>
To:	Avi Kivity <avi@...hat.com>
Cc:	KVM development list <kvm@...r.kernel.org>,
	Rik van Riel <riel@...riel.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH][RF C/T/D] Unmapped page cache control - via boot
 parameter

* Avi Kivity <avi@...hat.com> [2010-03-15 10:27:45]:

> On 03/15/2010 10:07 AM, Balbir Singh wrote:
> >* Avi Kivity<avi@...hat.com>  [2010-03-15 09:48:05]:
> >
> >>On 03/15/2010 09:22 AM, Balbir Singh wrote:
> >>>Selectively control Unmapped Page Cache (nospam version)
> >>>
> >>>From: Balbir Singh<balbir@...ux.vnet.ibm.com>
> >>>
> >>>This patch implements unmapped page cache control via preferred
> >>>page cache reclaim. The current patch hooks into kswapd and reclaims
> >>>page cache if the user has requested for unmapped page control.
> >>>This is useful in the following scenario
> >>>
> >>>- In a virtualized environment with cache!=none, we see
> >>>   double caching - (one in the host and one in the guest). As
> >>>   we try to scale guests, cache usage across the system grows.
> >>>   The goal of this patch is to reclaim page cache when Linux is running
> >>>   as a guest and get the host to hold the page cache and manage it.
> >>>   There might be temporary duplication, but in the long run, memory
> >>>   in the guests would be used for mapped pages.
> >>Well, for a guest, host page cache is a lot slower than guest page cache.
> >>
> >Yes, it is a virtio call away, but is the cost of paying twice in
> >terms of memory acceptable?
> 
> Usually, it isn't, which is why I recommend cache=off.
>

cache=off works for filesystems that support *direct I/O*, and my concern
is that one of its side effects is that idle VMs can consume a lot of
memory (assuming all the memory is available to them). As the number of
VMs grows, they could cache a whole lot of memory. In my experiments I
found that the total amount of memory cached far exceeded the amount
mapped when we had idle VMs. The philosophy of this patch is to move the
caching to the _host_ and let the host maintain the cache instead of the
guest.
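
To make the cached-versus-mapped comparison concrete, here is a minimal
sketch of how one might estimate unmapped page cache inside a guest from
/proc/meminfo. It is not part of the patch. It assumes that Cached minus
Mapped is an acceptable approximation of unmapped page cache, and the
function names are hypothetical, chosen only for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def unmapped_cache_kb(fields):
    """Approximate unmapped page cache as Cached - Mapped (assumption), floored at 0."""
    return max(fields.get("Cached", 0) - fields.get("Mapped", 0), 0)


def unmapped_percent(fields):
    """Unmapped page cache as a percentage of MemTotal (0.0 if MemTotal is missing)."""
    total = fields.get("MemTotal", 0)
    if total == 0:
        return 0.0
    return 100.0 * unmapped_cache_kb(fields) / total


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the meminfo file at the given path."""
    with open(path) as f:
        return parse_meminfo(f.read())
```

Running unmapped_percent(read_meminfo()) in each idle guest and summing
unmapped_cache_kb() across guests gives a rough picture of how much
memory is held as cache that no process has mapped.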
 
> >One of the reasons I created a boot
> >parameter was to deal with selective enablement for cases where
> >memory is the most important resource being managed.
> >
> >I do see a hit in performance with my results (please see the data
> >below), but the savings are quite large. The other solution mentioned
> >in the TODOs is to have the balloon driver invoke this path. The
> >sysctl also allows the guest to tune the amount of unmapped page cache
> >if needed.
> >
> >The knobs are for
> >
> >1. Selective enablement
> >2. Selective control of the % of unmapped pages
> 
> An alternative path is to enable KSM for page cache.  Then we have
> direct read-only guest access to host page cache, without any guest
> modifications required.  That will be pretty difficult to achieve
> though - will need a readonly bit in the page cache radix tree, and
> teach all paths to honour it.
> 

Yes, it is; I've taken a quick look. I am not sure de-duplication would
be the best approach; maybe dropping the page in the page cache would be
a good first step. Data consistency would be much easier to maintain
that way: as long as the guest is not writing frequently to that page,
we don't need the page cache in the host.

> -- 
> Do not meddle in the internals of kernels, for they are subtle and quick to panic.
> 

-- 
	Three Cheers,
	Balbir
