lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20190214202622.GB3420@redhat.com>
Date:   Thu, 14 Feb 2019 15:26:22 -0500
From:   Jerome Glisse <jglisse@...hat.com>
To:     Jason Gunthorpe <jgg@...pe.ca>
Cc:     Dan Williams <dan.j.williams@...el.com>, Jan Kara <jack@...e.cz>,
        Dave Chinner <david@...morbit.com>,
        Christopher Lameter <cl@...ux.com>,
        Doug Ledford <dledford@...hat.com>,
        Matthew Wilcox <willy@...radead.org>,
        Ira Weiny <ira.weiny@...el.com>,
        lsf-pc@...ts.linux-foundation.org,
        linux-rdma <linux-rdma@...r.kernel.org>,
        Linux MM <linux-mm@...ck.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        John Hubbard <jhubbard@...dia.com>,
        Michal Hocko <mhocko@...nel.org>
Subject: Re: [LSF/MM TOPIC] Discuss least bad options for resolving
 longterm-GUP usage by RDMA

On Mon, Feb 11, 2019 at 11:06:54AM -0700, Jason Gunthorpe wrote:
> On Mon, Feb 11, 2019 at 09:22:58AM -0800, Dan Williams wrote:
> 
> > I honestly don't like the idea that random subsystems can pin down
> > file blocks as a side effect of gup on the result of mmap. Recall that
> > it's not just RDMA that wants this guarantee. It seems safer to have
> > the file be in an explicit block-allocation-immutable-mode so that the
> > fallocate man page can describe this error case. Otherwise how would
> > you describe the scenarios under which FALLOC_FL_PUNCH_HOLE fails?
> 
> I rather liked CL's version of this - ftruncate/etc is simply racing
> with a parallel pwrite - and it doesn't fail.
> 
> But it also doesnt' trucate/create a hole. Another thread wrote to it
> right away and the 'hole' was essentially instantly reallocated. This
> is an inherent, pre-existing, race in the ftrucate/etc APIs.

So it is kind of a // point to this, but direct I/O do "truncate" pages
or more exactly after a write direct I/O invalidate_inode_pages2_range()
is call and it will try to unmap and remove from page cache all pages
that have been written too.

So we probably want to think about what we want to do here if a device
like RDMA has also pin those pages. Do we want to abort the invalidate ?
Which would mean that then the direct I/O write was just a pointless
exercise. Do we want to not direct I/O but instead memcpy into page
cache memory ? Then you are just ignoring the direct I/O property of
the write. Or do we want to both direct I/O to the block and also
do a memcopy to the page so that we preserve the direct I/O semantics ?

I would probably go with the last one. In any cases we will need to
update the direct I/O code to handle GUPed page cache page.

Cheers,
Jérôme

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ