Message-ID: <CAPcyv4h1=GTAqHBw+Zsp9eNYR3HFbB_qjmhntwnO-jyGun4QNA@mail.gmail.com>
Date: Wed, 6 Feb 2019 16:22:16 -0800
From: Dan Williams <dan.j.williams@...el.com>
To: Jason Gunthorpe <jgg@...pe.ca>
Cc: Doug Ledford <dledford@...hat.com>,
Dave Chinner <david@...morbit.com>,
Christopher Lameter <cl@...ux.com>,
Matthew Wilcox <willy@...radead.org>, Jan Kara <jack@...e.cz>,
Ira Weiny <ira.weiny@...el.com>,
lsf-pc@...ts.linux-foundation.org,
linux-rdma <linux-rdma@...r.kernel.org>,
Linux MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
John Hubbard <jhubbard@...dia.com>,
Jerome Glisse <jglisse@...hat.com>,
Michal Hocko <mhocko@...nel.org>,
linux-nvdimm <linux-nvdimm@...ts.01.org>
Subject: Re: [LSF/MM TOPIC] Discuss least bad options for resolving
longterm-GUP usage by RDMA
On Wed, Feb 6, 2019 at 3:41 PM Jason Gunthorpe <jgg@...pe.ca> wrote:
[..]
> > You're describing the current situation, i.e. Linux already implements
> > this, it's called Device-DAX and some users of RDMA find it
> > insufficient. The choices are to continue to tell them "no", or say
> > "yes, but you need to submit to lease coordination".
>
> Device-DAX is not what I'm imagining when I say XFS--.
>
> I mean more like XFS with all features that require relocation of
> blocks disabled.
>
> Forbidding hole punch, reflink, cow, etc., doesn't devolve back to
> device-dax.

True, not all the way, but the distinction loses significance as you
lose fs features.

Filesystems mark DAX functionality experimental [1] precisely because
it forbids otherwise typical operations that work in the nominal page
cache case. An approach that says "let's cement the list of things a
filesystem or a core-memory-management facility can't do because RDMA
finds it awkward" sets a bad precedent. It's a bad precedent because it
abdicates core kernel functionality to userspace and weakens the API
contract in surprising ways.

EBUSY is a horrible status code, especially if an administrator is
presented with an emergency situation in which a filesystem needs to
free up storage capacity and get established memory registrations out
of the way. The motivation for the status quo of failing memory
registration for DAX mappings is to help ensure the system does not
get into this situation, where forward progress cannot be guaranteed.

[1]: https://lists.01.org/pipermail/linux-nvdimm/2019-February/019884.html