Message-ID: <20190612094634.GA14578@quack2.suse.cz>
Date: Wed, 12 Jun 2019 11:46:34 +0200
From: Jan Kara <jack@...e.cz>
To: Ira Weiny <ira.weiny@...el.com>
Cc: Jeff Layton <jlayton@...nel.org>,
Dan Williams <dan.j.williams@...el.com>,
Jan Kara <jack@...e.cz>, Theodore Ts'o <tytso@....edu>,
Dave Chinner <david@...morbit.com>,
Matthew Wilcox <willy@...radead.org>,
linux-xfs@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
John Hubbard <jhubbard@...dia.com>,
Jérôme Glisse <jglisse@...hat.com>,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-nvdimm@...ts.01.org, linux-ext4@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [PATCH RFC 02/10] fs/locks: Export F_LAYOUT lease to user space
On Tue 11-06-19 14:38:13, Ira Weiny wrote:
> On Sun, Jun 09, 2019 at 09:00:24AM -0400, Jeff Layton wrote:
> > On Wed, 2019-06-05 at 18:45 -0700, ira.weiny@...el.com wrote:
> > > From: Ira Weiny <ira.weiny@...el.com>
> > >
> > > GUP longterm pins of non-pagecache file system pages (e.g. FS DAX) are
> > > currently disallowed because they are unsafe.
> > >
> > > The danger for pinning these pages comes from the fact that hole punch
> > > and/or truncate of those files results in the pages being mapped and
> > > pinned by a user space process while DAX has potentially allocated those
> > > pages to other processes.
> > >
> > > Most (All) users who are mapping FS DAX pages for long term pin purposes
> > > (such as RDMA) are not going to want to deallocate these pages while
> > > those pages are in use. To do so would mean the application would lose
> > > data. So the use case for allowing truncate operations of such pages
> > > is limited.
> > >
> > > However, the kernel must protect itself and users from potential
> > > mistakes and/or malicious user space code. Rather than disabling long
> > > term pins as is done now, allow users who know they are going to
> > > be pinning this memory to alert the file system of this intention.
> > > Furthermore, allow users to be alerted such that they can react if a
> > > truncate operation occurs for some reason.
> > >
> > > Example user space pseudocode for a user using RDMA and wanting to allow
> > > a truncate would look like this:
> > >
> > > lease_break_sigio_handler() {
> > > ...
> > > if (sigio.fd == rdma_fd) {
> > > complete_rdma_operations(...);
> > > ibv_dereg_mr(mr);
> > > fcntl(rdma_fd, F_SETLEASE, F_UNLCK);
> > > close(rdma_fd);
> > > }
> > > }
> > >
> > > setup_rdma_to_dax_file() {
> > > ...
> > > rdma_fd = open(...);
> > > fcntl(rdma_fd, F_SETLEASE, F_LAYOUT);
> >
> > I'm not crazy about this interface. F_LAYOUT doesn't seem to be in the
> > same category as F_RDLCK/F_WRLCK/F_UNLCK.
> >
> > Maybe instead of F_SETLEASE, this should use new
> > F_SETLAYOUT/F_GETLAYOUT cmd values? There is nothing that would prevent
> > you from setting both a lease and a layout on a file, and indeed knfsd
> > can set both.
> >
> > This interface seems to conflate the two.
>
> I've been feeling the same way. This is why I was leaning toward a new lease
> type. I called it "F_LONGTERM" but the name is not important.
>
> I think the concept of adding "exclusive" to the layout lease can fix this
> because the NFS lease is non-exclusive, whereas the user space one (for the
> purpose of GUP pinning) would need to be exclusive.
>
> FWIW I have not worked out exactly what this new "exclusive" code will look
> like. Jan said:
>
> "There actually is support for locks that are not broken after a given
> timeout so there shouldn't be too many changes needed."
>
> But I'm not seeing that for Lease code. So I'm working on something for the
> lease code now.
Yeah, sorry for misleading you. Somehow I thought that if lease_break_time
== 0, we would wait indefinitely, but on checking the code again, that
doesn't seem to be the case.
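
As a point of reference, here is a minimal user space sketch of the
existing lease interface. It is not code from this patch set, and it does
not use F_LAYOUT or any "exclusive" lease. It reads the lease break
timeout from /proc/sys/fs/lease-break-time and takes an ordinary F_RDLCK
lease with F_SETLEASE. The helper names read_lease_break_time() and
try_read_lease() are made up for illustration.

```c
#define _GNU_SOURCE		/* F_SETLEASE is Linux-specific */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Fallback value from the Linux UAPI headers, in case _GNU_SOURCE was
 * defined too late to take effect. */
#ifndef F_SETLEASE
#define F_SETLEASE 1024
#endif

/* Hypothetical helper: return fs.lease-break-time in seconds,
 * or -1 if the sysctl cannot be read. */
static int read_lease_break_time(void)
{
	FILE *f = fopen("/proc/sys/fs/lease-break-time", "r");
	int secs = -1;

	if (!f)
		return -1;
	if (fscanf(f, "%d", &secs) != 1)
		secs = -1;
	fclose(f);
	return secs;
}

/* Hypothetical helper: take a read lease on path, then drop it again.
 * Returns 0 if the lease was granted, -1 otherwise (errno is preserved
 * from the failing open() or fcntl() call). */
static int try_read_lease(const char *path)
{
	/* A read lease can only be placed on a read-only descriptor. */
	int fd = open(path, O_RDONLY);
	int saved, ret = -1;

	if (fd < 0)
		return -1;
	if (fcntl(fd, F_SETLEASE, F_RDLCK) == 0) {
		ret = 0;
		/* Release the lease while the descriptor is still open. */
		fcntl(fd, F_SETLEASE, F_UNLCK);
	}
	saved = errno;
	close(fd);
	errno = saved;
	return ret;
}
```

A caller would typically check try_read_lease() for failure, since the
kernel can refuse a lease (for example when the file is open for writing
elsewhere, or when the caller neither owns the file nor has CAP_LEASE).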
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR