linux-kernel - Re: [PATCH v2 0/5] fs, xfs: block map immutable files for dax, dma-to-storage, and swap

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20170814124059.GC17820@quack2.suse.cz>
Date:   Mon, 14 Aug 2017 14:40:59 +0200
From:   Jan Kara <jack@...e.cz>
To:     Dan Williams <dan.j.williams@...el.com>
Cc:     Christoph Hellwig <hch@....de>,
        "Darrick J. Wong" <darrick.wong@...cle.com>,
        Jan Kara <jack@...e.cz>,
        "linux-nvdimm@...ts.01.org" <linux-nvdimm@...ts.01.org>,
        Dave Chinner <david@...morbit.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        linux-xfs@...r.kernel.org, Jeff Moyer <jmoyer@...hat.com>,
        Alexander Viro <viro@...iv.linux.org.uk>,
        Andy Lutomirski <luto@...nel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        Ross Zwisler <ross.zwisler@...ux.intel.com>,
        Linux API <linux-api@...r.kernel.org>,
        Peter Zijlstra <peterz@...radead.org>
Subject: Re: [PATCH v2 0/5] fs, xfs: block map immutable files for dax,
 dma-to-storage, and swap

On Sun 13-08-17 13:31:45, Dan Williams wrote:
> On Sun, Aug 13, 2017 at 2:24 AM, Christoph Hellwig <hch@....de> wrote:
> > Thay being said I think we absolutely should support RDMA memory
> > registrations for DAX mappings.  I'm just not sure how S_IOMAP_IMMUTABLE
> > helps with that.  We'll want a MAP_SYNC | MAP_POPULATE to make sure
> > all the blocks are polulated and all ptes are set up.  Second we need
> > to make sure get_user_page works, which for now means we'll need a
> > struct page mapping for the region (which will be really annoying
> > for PCIe mappings, like the upcoming NVMe persistent memory region),
> > and we need to gurantee that the extent mapping won't change while
> > the get_user_pages holds the pages inside it.  I think that is true
> > due to side effects even with the current DAX code, but we'll need to
> > make it explicit.  And maybe that's where we need to converge -
> > "sealing" the extent map makes sense as such a temporary measure
> > that is not persisted on disk, which automatically gets released
> > when the holding process exits, because we sort of already do this
> > implicitly.  It might also make sense to have explicitl breakable
> > seals similar to what I do for the pNFS blocks kernel server, as
> > any userspace RDMA file server would also need those semantics.
> 
> Ok, how about a MAP_DIRECT flag that arranges for faults to that range to:
> 
>     1/ only succeed if the fault can be satisfied without page cache
> 
>     2/ only install a pte for the fault if it can do so without
> triggering block map updates
> 
> So, I think it would still end up setting an inode flag to make
> xfs_bmapi_write() fail while any process has a MAP_DIRECT mapping
> active. However, it would not record that state in the on-disk
> metadata and it would automatically clear at munmap time. That should
> be enough to support the host-persistent-memory, and
> NVMe-persistent-memory use cases (provided we have struct page for
> NVMe). Although, we need more safety infrastructure in the NVMe case
> where we would need to software manage I/O coherence.

Hum, this proposal (and the problems you are trying to deal with) seem very
similar to Peter Zijlstra's mpin() proposal from 2014 [1], just moved to
the DAX area (and so additionally complicated by the fact that filesystems
now have to care). The patch set was not merged due to lack of interest I
think but it looked sensible and the proposed API would make sense for more
stuff than just DAX so maybe it would be better than MAP_DIRECT flag?

[1] https://lwn.net/Articles/600502/

								Honza

-- 
Jan Kara <jack@...e.com>
SUSE Labs, CR