linux-kernel - Re: [PATCH 00/13] dax, pmem: move cpu cache maintenance to libnvdimm


Message-ID: <CAPcyv4iNXeAsDkhyC6KxNxWYMAPUKt7gPneMDUu97aL6ke7RLg@mail.gmail.com>
Date:   Mon, 23 Jan 2017 10:31:20 -0800
From:   Dan Williams <dan.j.williams@...el.com>
To:     Christoph Hellwig <hch@....de>
Cc:     Matthew Wilcox <mawilcox@...rosoft.com>,
        "linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
        Tony Luck <tony.luck@...el.com>, Jan Kara <jack@...e.cz>,
        Toshi Kani <toshi.kani@....com>,
        Mike Snitzer <snitzer@...hat.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "x86@...nel.org" <x86@...nel.org>, Jeff Moyer <jmoyer@...hat.com>,
        Jens Axboe <axboe@...com>,
        "dm-devel@...hat.com" <dm-devel@...hat.com>,
        Ingo Molnar <mingo@...hat.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        "H. Peter Anvin" <hpa@...or.com>,
        "linux-fsdevel@...r.kernel.org" <linux-fsdevel@...r.kernel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Ross Zwisler <ross.zwisler@...ux.intel.com>
Subject: Re: [PATCH 00/13] dax, pmem: move cpu cache maintenance to libnvdimm

On Mon, Jan 23, 2017 at 10:03 AM, Christoph Hellwig <hch@....de> wrote:
> On Mon, Jan 23, 2017 at 09:14:04AM -0800, Dan Williams wrote:
>> The use case that we have now is distinguishing volatile vs persistent
>> memory (brd vs pmem).
>
> brd is a development tool, so until we have other reasons for this
> abstraction (which I'm pretty sure will show up rather sooner than later)
> I would not worry about it too much.

By "volatile" I also meant cases where pmem is fronting volatile
memory, or more importantly when the platform has otherwise arranged
for cpu caches to be flushed on a power loss event like I believe some
existing storage appliances do.

>> I took a look at mtd layering approach and the main difference is that
>> layers above the block layer do not appear to know anything about mtd
>> specifics.
>
> Or the block layer itself for that matter.  And that's exactly where
> I want DAX to be in the future.
>
>> For fs/dax.c we currently need some path to retrieve a dax
>> anchor object through the block device.
>
> We have a need to retrieve the anchor object.  We currently do it
> through the block layer for historical reasons, but it doesn't have
> to be that way.
>
>> > In the longer run I like your dax_operations, but they need to be
>> > separate from the block layer.
>>
>> I'll move them from block_device_operations to dax data hanging off of
>> the bdev_inode, or is there a better way to go from bdev-to-dax?
>
> I don't think that's any better.  What we really want is a way
> to find the underlying persistent memory / DAX / whatever we call
> it node without going through a block device.  E.g. a library function
> to give that object for a given path name, where the path name could
> be either that of the /dev/pmemN or the /dev/daxN device.
>
> If the file system for now still needs a block device as well it
> will only accept the /dev/pmemN name, and open both the low-level
> pmem device and the block device.  Once that file system doesn't
> need block code (and I think we could do that easily for XFS,
> never mind any new FS) it won't have to deal with the block
> device at all.
>
> pmem.c then becomes a consumer of the dax_ops just like the file system.

Ah ok, I'll take a look at a dax_by_path() capability.
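
For illustration only, here is a rough userspace model of what a
path-based lookup could look like. None of this is kernel code: the
names dax_node, dax_ops_model, dax_register() and ram_dax_ops are
hypothetical, and dax_by_path() here is only a guess at the shape of
the capability mentioned above, using a fixed-size table keyed by path
name. The point it tries to show is that a consumer (a file system, or
pmem.c) would get the object and its operations from a path without
touching a block device.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model; these are NOT kernel interfaces. */
struct dax_node;

/* Assumed shape of a dax operations table with a single operation. */
struct dax_ops_model {
	/* Return an address for [offset, offset + len), or NULL if out of range. */
	void *(*direct_access)(struct dax_node *node, size_t offset, size_t len);
};

/* The "anchor object": found by path name, not through a block device. */
struct dax_node {
	const char *path;			/* e.g. "/dev/pmem0" or "/dev/dax0" */
	const struct dax_ops_model *ops;
	void *base;				/* start of the backing memory */
	size_t size;				/* length of the backing memory */
};

#define DAX_MAX_NODES 8
static struct dax_node dax_registry[DAX_MAX_NODES];
static size_t dax_registry_count;

/* Look up a registered node by its path name; NULL if there is none. */
struct dax_node *dax_by_path(const char *path)
{
	for (size_t i = 0; i < dax_registry_count; i++)
		if (strcmp(dax_registry[i].path, path) == 0)
			return &dax_registry[i];
	return NULL;
}

/* Register a node; returns 0 on success, -1 if the table is full or the path is taken. */
int dax_register(const char *path, const struct dax_ops_model *ops,
		 void *base, size_t size)
{
	if (dax_registry_count == DAX_MAX_NODES || dax_by_path(path))
		return -1;
	dax_registry[dax_registry_count++] =
		(struct dax_node){ path, ops, base, size };
	return 0;
}

/* Bounds-checked access for a plain memory range (stand-in for a pmem driver). */
static void *ram_direct_access(struct dax_node *node, size_t offset, size_t len)
{
	if (offset > node->size || len > node->size - offset)
		return NULL;
	return (char *)node->base + offset;
}

const struct dax_ops_model ram_dax_ops = {
	.direct_access = ram_direct_access,
};
```

In this model a driver would call dax_register() once for its path, and
a consumer would call dax_by_path("/dev/pmem0") and then go through
node->ops->direct_access(). A real implementation would presumably
resolve the path through the VFS and deal with reference counting and
device lifetime, all of which this sketch leaves out.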