Message-ID: <4BD1B626.7020702@redhat.com>
Date: Fri, 23 Apr 2010 18:00:54 +0300
From: Avi Kivity <avi@...hat.com>
To: Dan Magenheimer <dan.magenheimer@...cle.com>
CC: linux-kernel@...r.kernel.org, linux-mm@...ck.org, jeremy@...p.org,
hugh.dickins@...cali.co.uk, ngupta@...are.org, JBeulich@...ell.com,
chris.mason@...cle.com, kurt.hackel@...cle.com,
dave.mccracken@...cle.com, npiggin@...e.de,
akpm@...ux-foundation.org, riel@...hat.com
Subject: Re: Frontswap [PATCH 0/4] (was Transcendent Memory): overview
On 04/23/2010 05:52 PM, Avi Kivity wrote:
>
> I see. So why not implement this as an ordinary swap device, with a
> higher priority than the disk device? This way we reuse an API and
> keep things asynchronous, instead of introducing a special-purpose API.
>
Ok, from your original post:
> An "init" prepares the pseudo-RAM to receive frontswap pages and returns
> a non-negative pool id, used for all swap device numbers (aka "type").
> A "put_page" will copy the page to pseudo-RAM and associate it with
> the type and offset associated with the page. A "get_page" will copy the
> page, if found, from pseudo-RAM into kernel memory, but will NOT remove
> the page from pseudo-RAM. A "flush_page" will remove the page from
> pseudo-RAM and a "flush_area" will remove ALL pages associated with the
> swap type (e.g., like swapoff) and notify the pseudo-RAM device to refuse
> further puts with that swap type.
>
> Once a page is successfully put, a matching get on the page will always
> succeed. So when the kernel finds itself in a situation where it needs
> to swap out a page, it first attempts to use frontswap. If the put returns
> non-zero, the data has been successfully saved to pseudo-RAM and
> a disk write and, if the data is later read back, a disk read are avoided.
> If a put returns zero, pseudo-RAM has rejected the data, and the page can
> be written to swap as usual.
>
> Note that if a page is put and the page already exists in pseudo-RAM
> (a "duplicate" put), either the put succeeds and the data is overwritten,
> or the put fails AND the page is flushed. This ensures stale data may
> never be obtained from pseudo-RAM.
>
Looks like "init" == open, "put_page" == write, "get_page" == read, and
"flush_page"/"flush_area" == trim. The only difference seems to be that
an overwriting put_page may fail. That doesn't seem to be much of a win:
a guest can simply avoid issuing the duplicate put_page, so the
hypervisor is still committed to holding this memory for the guest.
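
To make the comparison concrete, here is a rough userspace model of the
quoted semantics. It is only a sketch, not the frontswap patch: the
pram_* names, the fixed-size pool, and the pram_refuse_puts switch
(standing in for the hypervisor declining memory) are all made up for
illustration. The return convention follows the quoted text (non-zero
means the put or get succeeded). The "refuse further puts after
flush_area" rule is left out for brevity.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PRAM_PAGE_SIZE 4096   /* assumed page size */
#define PRAM_SLOTS     4      /* hypothetical pool capacity */

struct pram_slot {
    bool used;
    unsigned type;            /* swap device number */
    unsigned long offset;     /* page offset within that device */
    unsigned char data[PRAM_PAGE_SIZE];
};

static struct pram_slot pram_pool[PRAM_SLOTS];
static bool pram_refuse_puts; /* hypothetical: backend rejects all puts */

static struct pram_slot *pram_find(unsigned type, unsigned long offset)
{
    for (size_t i = 0; i < PRAM_SLOTS; i++)
        if (pram_pool[i].used && pram_pool[i].type == type &&
            pram_pool[i].offset == offset)
            return &pram_pool[i];
    return NULL;
}

/* "trim" of a single page */
static void pram_flush_page(unsigned type, unsigned long offset)
{
    struct pram_slot *s = pram_find(type, offset);
    if (s)
        s->used = false;
}

/* "trim" of a whole swap type, as at swapoff */
static void pram_flush_area(unsigned type)
{
    for (size_t i = 0; i < PRAM_SLOTS; i++)
        if (pram_pool[i].used && pram_pool[i].type == type)
            pram_pool[i].used = false;
}

/* "write": non-zero if the page was stored, zero if it was rejected */
static int pram_put_page(unsigned type, unsigned long offset, const void *page)
{
    struct pram_slot *s = pram_find(type, offset);

    if (pram_refuse_puts) {
        /* A failed duplicate put must also flush the old copy,
         * so a later get can never return stale data. */
        if (s)
            s->used = false;
        return 0;
    }
    if (!s) {
        for (size_t i = 0; i < PRAM_SLOTS && !s; i++)
            if (!pram_pool[i].used)
                s = &pram_pool[i];
        if (!s)
            return 0;         /* pool full: caller falls back to disk */
        s->used = true;
        s->type = type;
        s->offset = offset;
    }
    memcpy(s->data, page, PRAM_PAGE_SIZE);  /* new put or overwrite */
    return 1;
}

/* "read": copies the page out if present and does NOT remove it */
static int pram_get_page(unsigned type, unsigned long offset, void *page)
{
    struct pram_slot *s = pram_find(type, offset);
    if (!s)
        return 0;
    memcpy(page, s->data, PRAM_PAGE_SIZE);
    return 1;
}
```

The only branch without a block-device analogue is the one taken when
pram_refuse_puts is set and the page already exists. A guest that
flushes first, or never re-puts, never reaches it.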
--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.