linux-kernel - Re: Tmem [PATCH 0/5] (Take 3): Transcendent memory


Message-ID: <4B39646A.3080007@vflare.org>
Date:	Tue, 29 Dec 2009 07:37:38 +0530
From:	Nitin Gupta <ngupta@...are.org>
To:	Dan Magenheimer <dan.magenheimer@...cle.com>
CC:	Pavel Machek <pavel@....cz>, Nick Piggin <npiggin@...e.de>,
	Andrew Morton <akpm@...ux-foundation.org>, jeremy@...p.org,
	xen-devel@...ts.xensource.com, tmem-devel@....oracle.com,
	Rusty Russell <rusty@...tcorp.com.au>,
	Rik van Riel <riel@...hat.com>, dave.mccracken@...cle.com,
	sunil.mushran@...cle.com, Avi Kivity <avi@...hat.com>,
	Schwidefsky <schwidefsky@...ibm.com>,
	Balbir Singh <balbir@...ux.vnet.ibm.com>,
	Marcelo Tosatti <mtosatti@...hat.com>,
	Alan Cox <alan@...rguk.ukuu.org.uk>, chris.mason@...cle.com,
	linux-mm <linux-mm@...ck.org>,
	linux-kernel <linux-kernel@...r.kernel.org>
Subject: Re: Tmem [PATCH 0/5] (Take 3): Transcendent memory

On 12/28/2009 09:27 PM, Dan Magenheimer wrote:
> 
>> From: Pavel Machek [mailto:pavel@....cz]

>>> I'm definitely OK with exploring alternatives.  I just think that
>>> existing kernel mechanisms are very firmly rooted in the notion
>>> that either the kernel owns the memory/cache or an asynchronous
>>> device owns it.  Tmem falls somewhere in between and is very
>>
>> Well... compcache seems to be very similar to preswap: in preswap case
>> you don't know if hypervisor will have space, in ramzswap you don't
>> know if data are compressible.
> 
> Hi Pavel --
> 
> Yes there are definitely similarities too.  In fact, I started
> prototyping preswap (now called frontswap) with Nitin's
> compcache code.  IIRC I ran into some problems with compcache's
> difficulties in dealing with failed "puts" due to dynamic
> changes in size of hypervisor-available-memory.
> 
> Nitin may have addressed this in later versions of ramzswap.
> 

Any kind of swap device that works entirely within the guest
(or in the native case) will always have problems with a write (put)
failure: we want to reclaim a page but, because the write failed, we can't.
So ramzswap also cannot afford a lot of write failures.

However, the story is different when ramzswap is "virtualization aware".
In this case, we can afford any number of "put" failures
to the hypervisor. When a put fails, we will either compress the page and
keep it in guest memory itself or forward it to ramzswap's backing swap
device (if present).
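
The fallback order described above can be sketched roughly as follows. This
is only an illustrative model in Python, not ramzswap code: the names
(store_page, hypervisor_put, guest_pool, backing_dev) are hypothetical, and
the "keep it only if it compresses to half a page or less" rule is an
assumption made for the example.

```python
import zlib

PAGE_SIZE = 4096


def store_page(page, hypervisor_put, guest_pool, backing_dev=None,
               max_compressed=PAGE_SIZE // 2):
    """Return where the page ended up: 'hypervisor', 'guest' or 'backing'.

    Hypothetical model: guest_pool and backing_dev are plain lists standing
    in for the in-guest compressed pool and the backing swap device.
    """
    # 1. Try the hypervisor first. A failed put is harmless here because
    #    the fallbacks below can still take the page.
    if hypervisor_put(page):
        return "hypervisor"

    # 2. Compress the page and keep it in guest memory if it shrinks enough
    #    (the threshold is an assumption of this sketch).
    compressed = zlib.compress(page)
    if len(compressed) <= max_compressed:
        guest_pool.append(compressed)
        return "guest"

    # 3. Otherwise forward the page to the backing swap device, if present.
    if backing_dev is not None:
        backing_dev.append(page)
        return "backing"

    # No fallback left: this is the write failure that blocks reclaim.
    raise OSError("page could not be stored")
```

Only the last case is a real write failure from the reclaim path's point of
view; a failed put to the hypervisor by itself never is.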

A side point: with ramzswap's approach of virtual block devices, we can
achieve all of this without any kernel changes, since everything is a module.

> One feature of frontswap which is different than ramzswap is
> that frontswap acts as a "fronting store" for all configured
> swap devices, including SAN/NAS swap devices.  It doesn't
> need to be separately configured as a "highest priority" swap
> device.  In many installations and depending on how ramzswap
> is configured, this difference probably doesn't make much
> difference though.
> 

Having a frontswap layer over *every* swap device might not be desirable. I think such
things should be completely out of the way when not desired. This was one of the primary
reasons for the virtual block device approach in ramzswap. You can create any number
of such devices (/dev/ramzswap{0,1,2...}), each with its own backing device (optional),
memory pools, buffers, etc., which adds flexibility and helps with scalability.

On the downside, however, as you pointed out, managing all this can be a problem for sysadmins.
Some userspace magic that dynamically manages these virtual disks could ease this,
though I have not yet thought much in this direction.

Thanks,
Nitin