linux-kernel - Re: NVM Mapping API

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <4FB81C6C.5080409@ontolab.com>
Date:	Sun, 20 May 2012 00:19:24 +0200
From:	Christian Stroetmann <stroetmann@...olab.com>
To:	Christian Stroetmann <stroetmann@...olab.com>
CC:	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: NVM Mapping API

On We, May 16, 2012 at 21:58, Christian Stroetmann wrote:
> On We, May 16, 2012 at 19:35,  Matthew Wilcox wrote:
>> On Wed, May 16, 2012 at 10:52:00AM +0100, James Bottomley wrote:
>>> On Tue, 2012-05-15 at 09:34 -0400, Matthew Wilcox wrote:
>>>> There are a number of interesting non-volatile memory (NVM) 
>>>> technologies
>>>> being developed.  Some of them promise DRAM-comparable latencies and
>>>> bandwidths.  At Intel, we've been thinking about various ways to 
>>>> present
>>>> those to software.  This is a first draft of an API that supports the
>>>> operations we see as necessary.  Patches can follow easily enough once
>>>> we've settled on an API.
>>> If we start from first principles, does this mean it's usable as DRAM?
>>> Meaning do we even need a non-memory API for it?  The only difference
>>> would be that some pieces of our RAM become non-volatile.
>> I'm not talking about a specific piece of technology, I'm assuming that
>> one of the competing storage technologies will eventually make it to
>> widespread production usage.  Let's assume what we have is DRAM with a
>> giant battery on it.
> Our ST-RAM (see [1] for the original source of its description) is a 
> concept based on the combination of a writable volatile Random-Access 
> Memory (RAM) chip and a capacitor.
[...]
> Boaz asked: "What is the difference from say a PCIE DRAM card with 
> battery"? It sits in the RAM slot.
>
>
>>
>> So, while we can use it just as DRAM, we're not taking advantage of the
>> persistent aspect of it if we don't have an API that lets us find the
>> data we wrote before the last reboot.  And that sounds like a filesystem
>> to me.
>
> No and yes.
> 1. In the first place it is just a normal DRAM.
> 2. But due to its nature it has also many aspects of a flash memory.
> So the use case is for point
> 1. as a normal RAM module,
> and for point
> 2. as a file system,
> which again can be used
> 2.1 directly by the kernel as a normal file system,
> 2.2 directly by the kernel by the PRAMFS
> 2.3 by the proposed NVMFS, maybe as a shortcut for optimization,
> and
> 2.4 from the userspace, most potentially by using the standard VFS. 
> Maybe this version 2.4 is the same as point 2.2.
>
>>> Or is there some impediment (like durability, or degradation on 
>>> rewrite)
>>> which makes this unsuitable as a complete DRAM replacement?
>> The idea behind using a different filesystem for different NVM types is
>> that we can hide those kinds of impediments in the filesystem.  By the
>> way, did you know DRAM degrades on every write?  I think it's on the
>> order of 10^20 writes (and CPU caches hide many writes to heavily-used
>> cache lines), so it's a long way away from MLC or even SLC rates, but
>> it does exist.
>
> As I said before, a filesystem for the different NVM types would not 
> be enough. These things are more complex due the possibility that they 
> can be used very flexbily.
>
>>
>>> Alternatively, if it's not really DRAM, I think the UNIX file
>>> abstraction makes sense (it's a piece of memory presented as something
>>> like a filehandle with open, close, seek, read, write and mmap), but
>>> it's less clear that it should be an actual file system.  The reason is
>>> that to present a VFS interface, you have to already have fixed the
>>> format of the actual filesystem on the memory because we can't nest
>>> filesystems (well, not without doing artificial loopbacks).  Again, 
>>> this
>>> might make sense if there's some architectural reason why the flash
>>> region has to have a specific layout, but your post doesn't shed any
>>> light on this.
>> We can certainly present a block interface to allow using unmodified
>> standard filesystems on top of chunks of this NVM.  That's probably not
>> the optimum way for a filesystem to use it though; there's really no
>> point in constructing a bio to carry data down to a layer that's simply
>> going to do a memcpy().
>> -- 
>
> I also saw the use cases by Boaz that are
> Journals of other FS, which could be done on top of the NVMFS for 
> example, but is not really what I have in mind, and
> Execute in place, for which an Elf loader feature is needed. 
> Obviously, this use case was envisioned by me as well.
>
> For direct rebooting the checkpointing of standard RAM is also a 
> needed function. The decision what is trashed and what is marked as 
> persistent RAM content has to be made by the RAM experts of the Linux 
> developers or the user. I even think that this is a special use case 
> on its own with many options.
>
Because it is now about 1 year ago when I played around with the 
conceptual hardware aspects of anUninterruptible Power RAM (UPRAM) like 
the ST-RAM, I looked in more detail at the software side yesterday and 
today. So let me please add my first use case that I had in mind last 
year and coined now:
Hybrid Hibernation (HyHi) or alternatively Suspend-to-NVM,
which is similar to hybrid sleep and hibernation, but also differs a 
little bit due to the uninterruptible power feature.

But as it can be seen easily here again, even with this 1 use case exist 
two paths to handle the NVM that are as:
1. RAM and
2. FS,
so that it leads a further time to the discussion, if hibernation should 
be a kernel or a user space function (see [1] and [2] for more 
information related with the discussion about uswsup (userspace software 
suspend) and suspend2, and [3] for uswsup and [4] for TuxOnIce).

Eventually, there is an interest to reuse some functions or code.



Have fun in the sun
C. Stroetmann
> [1] ST-RAM www.ontonics.com/innovation/pipeline.htm#st-ram
>
[1] LKML: Pavel Machek: RE: suspend2 merge lkml.org/lkml/2007/4/24/405
[2] KernelTrap: Linux: Reviewung Suspend2 kerneltrap.org/node/6766
[3] suspend.sourceforge.net
[4] tuxonice.net
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/