linux-kernel - Re: [LSF/MM TOPIC] Discuss least bad options for resolving longterm-GUP usage by RDMA

Message-ID: <CAPcyv4jHjeJxmHMyrbRhg9oeaLK5WbZm-qu1HywjY7bF2DwiDg@mail.gmail.com>
Date:   Mon, 11 Feb 2019 13:02:37 -0800
From:   Dan Williams <dan.j.williams@...el.com>
To:     Jason Gunthorpe <jgg@...pe.ca>
Cc:     Matthew Wilcox <willy@...radead.org>,
        Ira Weiny <ira.weiny@...el.com>, Jan Kara <jack@...e.cz>,
        Dave Chinner <david@...morbit.com>,
        Christopher Lameter <cl@...ux.com>,
        Doug Ledford <dledford@...hat.com>,
        lsf-pc@...ts.linux-foundation.org,
        linux-rdma <linux-rdma@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        John Hubbard <jhubbard@...dia.com>,
        Jerome Glisse <jglisse@...hat.com>,
        Michal Hocko <mhocko@...nel.org>
Subject: Re: [LSF/MM TOPIC] Discuss least bad options for resolving
 longterm-GUP usage by RDMA

On Mon, Feb 11, 2019 at 12:49 PM Jason Gunthorpe <jgg@...pe.ca> wrote:
>
> On Mon, Feb 11, 2019 at 11:58:47AM -0800, Dan Williams wrote:
> > On Mon, Feb 11, 2019 at 10:40 AM Matthew Wilcox <willy@...radead.org> wrote:
> > >
> > > On Mon, Feb 11, 2019 at 11:26:49AM -0700, Jason Gunthorpe wrote:
> > > > On Mon, Feb 11, 2019 at 10:19:22AM -0800, Ira Weiny wrote:
> > > > > What if user space then writes to the end of the file with a regular write?
> > > > > Does that write end up at the point they truncated to or off the end of the
> > > > > mmaped area (old length)?
> > > >
> > > > IIRC it depends how the user does the write..
> > > >
> > > > pwrite() with a given offset will write to that offset, re-extending
> > > > the file if needed
> > > >
> > > > A file opened with O_APPEND and a write done with write() should
> > > > append to the new end
> > > >
> > > > A normal file with a normal write should write to the FD's current
> > > > seek pointer.
> > > >
> > > > I'm not sure what happens if you write via mmap/msync.
> > > >
> > > > RDMA is similar to pwrite() and mmap.
> > >
> > > A pertinent point that you didn't mention is that ftruncate() does not change
> > > the file offset.  So there's no user-visible change in behaviour.
> >
> > ...but there is. The blocks you thought you freed, especially if the
> > system was under -ENOSPC pressure, won't actually be free after the
> > successful ftruncate().
>
> They won't be free after something dirties the existing mmap either.
>
> Blocks also won't be free if you unlink a file that is currently still
> open.
>
> This isn't really new behavior for a FS.

An mmap write after a fault due to a hole punch is free to trigger
SIGBUS if the subsequent page allocation fails. So no, I don't see
them as the same unless you're allowing for the holder of the MR to
receive a re-fault failure.
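
For reference, the write-after-ftruncate() cases listed in the quoted text above can be checked from user space. What follows is a minimal sketch, assuming a POSIX system and an ordinary local filesystem; the function name write_after_truncate, the file names, and the sizes are made up for illustration and do not come from the thread. It only covers write(), pwrite(), and O_APPEND; it says nothing about the mmap/msync or RDMA cases under discussion.

```python
import os
import tempfile


def write_after_truncate():
    """Shrink a file with ftruncate(), then write to it in three ways.

    Hypothetical helper for illustration only. Returns a dict of
    observations so the caller can inspect them.
    """
    seen = {}
    with tempfile.TemporaryDirectory() as tmp:
        # Case 1: a plain write() uses the descriptor's current offset.
        fd = os.open(os.path.join(tmp, "plain"), os.O_RDWR | os.O_CREAT, 0o600)
        os.write(fd, b"a" * 8192)          # offset is now 8192
        os.ftruncate(fd, 4096)             # size shrinks, offset stays put
        seen["offset_after_ftruncate"] = os.lseek(fd, 0, os.SEEK_CUR)
        os.write(fd, b"b")                 # lands at the old offset, 8192
        seen["plain_size"] = os.fstat(fd).st_size
        # The range between the truncation point and the old offset reads as zeros.
        seen["plain_gap_is_zeros"] = os.pread(fd, 4096, 4096) == b"\0" * 4096
        os.close(fd)

        # Case 2: pwrite() writes at the offset it is given and
        # re-extends the file if that offset is past the new end.
        fd = os.open(os.path.join(tmp, "pwrite"), os.O_RDWR | os.O_CREAT, 0o600)
        os.write(fd, b"a" * 8192)
        os.ftruncate(fd, 4096)
        os.pwrite(fd, b"b", 6000)
        seen["pwrite_size"] = os.fstat(fd).st_size
        os.close(fd)

        # Case 3: with O_APPEND, write() goes to the new end of file.
        flags = os.O_RDWR | os.O_CREAT | os.O_APPEND
        fd = os.open(os.path.join(tmp, "append"), flags, 0o600)
        os.write(fd, b"a" * 8192)
        os.ftruncate(fd, 4096)
        os.write(fd, b"b")
        seen["append_size"] = os.fstat(fd).st_size
        os.close(fd)
    return seen


if __name__ == "__main__":
    for key, value in write_after_truncate().items():
        print(key, value)
```

The three cases are kept in separate files so that each observation depends on one write path only.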