linux-kernel - Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA


Message-ID: <20190207172405.GY21860@bombadil.infradead.org>
Date:   Thu, 7 Feb 2019 09:24:05 -0800
From:   Matthew Wilcox <willy@...radead.org>
To:     Doug Ledford <dledford@...hat.com>
Cc:     Dan Williams <dan.j.williams@...el.com>,
        Jason Gunthorpe <jgg@...pe.ca>,
        Dave Chinner <david@...morbit.com>,
        Christopher Lameter <cl@...ux.com>, Jan Kara <jack@...e.cz>,
        Ira Weiny <ira.weiny@...el.com>,
        lsf-pc@...ts.linux-foundation.org,
        linux-rdma <linux-rdma@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        John Hubbard <jhubbard@...dia.com>,
        Jerome Glisse <jglisse@...hat.com>,
        Michal Hocko <mhocko@...nel.org>
Subject: Re: [LSF/MM TOPIC] Discuss least bad options for resolving
 longterm-GUP usage by RDMA

On Thu, Feb 07, 2019 at 11:25:35AM -0500, Doug Ledford wrote:
> * Really though, as I said in my email to Tom Talpey, this entire
> situation is simply screaming that we are doing DAX networking wrong. 
> We shouldn't be writing the networking code once in every single
> application that wants to do this.  If we had a memory segment that we
> shared from server to client(s), and in that memory segment we
> implemented a clustered filesystem, then applications would simply mmap
> local files and be done with it.  If the file needed to move, the kernel
> would update the mmap in the application, done.  If you ask me, it is
> the attempt to do this the wrong way that is resulting in all this
> heartache.  That said, for today, my recommendation would be to require
> ODP hardware for XFS filesystem with the DAX option, but allow ext2
> filesystems to mount DAX filesystems on non-ODP hardware, and go in and
> modify the ext2 filesystem so that on DAX mounts, it disables hole punch
> and ftruncate any time they would result in the forced removal of an
> established mmap.

I agree that something's wrong, but I think the fundamental problem is
that there's no concept in RDMA of having an STag for storage rather
than for memory.

Imagine if we could associate an STag with a file descriptor on the
server.  The client could then perform an RDMA to that STag.  On the
server, we'd need lots of smarts in the card and in the OS to know how
to treat that packet on arrival -- depending on what the file descriptor
referred to, it might only have to write into the page cache, or it
might set up an NVMe DMA, or it might resolve the underlying physical
address and DMA directly to an NV-DIMM.
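
To make the dispatch idea concrete, here is a rough userspace sketch in C of
the decision the server side would have to make when a packet for a
storage-backed STag arrives.  Nothing in it is an existing kernel or verbs
interface: struct stag_binding, the two enums and stag_dispatch() are all
made-up names for illustration, and the real work (in the card and in the OS)
is reduced to picking one of the three paths described above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: what the file descriptor behind an STag refers to. */
enum stag_backing {
	STAG_BACKING_PAGE_CACHE,	/* file served through the page cache */
	STAG_BACKING_NVME,		/* file whose blocks live on an NVMe device */
	STAG_BACKING_NVDIMM,		/* DAX file on an NV-DIMM */
};

/* Hypothetical: how an incoming RDMA write to that STag is handled. */
enum stag_action {
	STAG_WRITE_PAGE_CACHE,		/* write into the page cache */
	STAG_SETUP_NVME_DMA,		/* set up an NVMe DMA */
	STAG_DMA_TO_NVDIMM,		/* resolve the physical address, DMA directly */
};

/* Hypothetical association of an STag with a file descriptor on the server. */
struct stag_binding {
	unsigned int stag;
	int fd;
	enum stag_backing backing;
};

/* Pick the handling path from what the file descriptor refers to. */
enum stag_action stag_dispatch(const struct stag_binding *b)
{
	assert(b != NULL);

	switch (b->backing) {
	case STAG_BACKING_NVME:
		return STAG_SETUP_NVME_DMA;
	case STAG_BACKING_NVDIMM:
		return STAG_DMA_TO_NVDIMM;
	case STAG_BACKING_PAGE_CACHE:
	default:
		/* Assumed fallback: the page cache works for any file. */
		return STAG_WRITE_PAGE_CACHE;
	}
}
```

A binding with backing set to STAG_BACKING_NVDIMM would take the direct-DMA
path; one that refers to an ordinary file falls back to the page cache.  The
fallback in the default case is my assumption, not something the protocol
would dictate.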