Date:   Wed, 14 Sep 2022 04:51:17 +0100
From:   Al Viro <viro@...iv.linux.org.uk>
To:     Christoph Hellwig <hch@...radead.org>
Cc:     Jan Kara <jack@...e.cz>, John Hubbard <jhubbard@...dia.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Jens Axboe <axboe@...nel.dk>,
        Miklos Szeredi <miklos@...redi.hu>,
        "Darrick J . Wong" <djwong@...nel.org>,
        Trond Myklebust <trond.myklebust@...merspace.com>,
        Anna Schumaker <anna@...nel.org>,
        David Hildenbrand <david@...hat.com>,
        Logan Gunthorpe <logang@...tatee.com>,
        linux-block@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-xfs@...r.kernel.org, linux-nfs@...r.kernel.org,
        linux-mm@...ck.org, LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 4/7] iov_iter: new iov_iter_pin_pages*() routines

On Wed, Sep 07, 2022 at 01:45:26AM -0700, Christoph Hellwig wrote:
> On Tue, Sep 06, 2022 at 12:21:06PM +0200, Jan Kara wrote:
> > > For FOLL_PIN callers, never pin bvec and kvec pages:  For file systems
> > > not acquiring a reference is obviously safe, and the other callers will
> > > need an audit, but I can't think of why it would ever be unsafe.
> > 
> > Are you sure about "For file systems not acquiring a reference is obviously
> > safe"? I can see places e.g. in orangefs, afs, etc. which create bvec iters
> > from pagecache pages. And then we have iter_file_splice_write() which
> > creates bvec from pipe pages (which can also be pagecache pages if
> > vmsplice() is used). So perhaps there are no lifetime issues even without
> > acquiring a reference (but looking at the code I would not say it is
> > obvious) but I definitely don't see how it would be safe to not get a pin
> > to signal to filesystem backing the pagecache page that there is DMA
> > happening to/from the page.
> 
> I mean in the context of iov_iter_get_pages callers, that is direct
> I/O.  Direct callers of iov_iter_bvec which then pass that iov to
> ->read_iter / ->write_iter will need to hold references (those are
> the references that the callers of iov_iter_get_pages rely on!).

Unless I'm misreading Jan, the question is whether they should get or
pin.  AFAICS, anyone who passes the sucker to ->read_iter() (or ->recvmsg(),
or does direct copy_to_iter()/zero_iter(), etc.) is falling under
=================================================================================
CASE 5: Pinning in order to write to the data within the page
-------------------------------------------------------------
Even though neither DMA nor Direct IO is involved, just a simple case of "pin,
write to a page's data, unpin" can cause a problem. Case 5 may be considered a
superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
other words, if the code is neither Case 1 nor Case 2, it may still require
FOLL_PIN, for patterns like this:

Correct (uses FOLL_PIN calls):
    pin_user_pages()
    write to the data within the pages
    unpin_user_pages()

INCORRECT (uses FOLL_GET calls):
    get_user_pages()
    write to the data within the pages
    put_page()
=================================================================================

Regarding the iter_file_splice_write() case, do we need to pin pages
when we are not going to modify the data in those?

The same goes for afs, AFAICS; I started to type "... and everything that passes
WRITE to iov_iter_bvec()", but...
drivers/vhost/vringh.c:1165:            iov_iter_bvec(&iter, READ, iov, ret, translated);
drivers/vhost/vringh.c:1198:            iov_iter_bvec(&iter, WRITE, iov, ret, translated);
is backwards - READ is for data destinations, comes with copy_to_iter(); WRITE is
for data sources and it comes with copy_from_iter()...
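
To make the convention concrete, here is a rough userspace sketch (not kernel
code).  All sketch_* names are hypothetical.  The only thing borrowed from the
kernel is the idea that the direction given at setup time ends up in a
data_source flag: READ means the iterator is a destination, WRITE means it is
a source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for illustration; these are not the kernel definitions. */
enum sketch_dir { SKETCH_READ = 0, SKETCH_WRITE = 1 };

struct sketch_iter {
	bool data_source;	/* true: the iterator's memory supplies the data */
	size_t count;		/* bytes covered by the iterator */
};

/* The direction argument is simply recorded as the data_source flag. */
static void sketch_iter_init(struct sketch_iter *i, enum sketch_dir dir,
			     size_t count)
{
	i->data_source = (dir == SKETCH_WRITE);
	i->count = count;
}

/* copy_to_iter()-style helpers fill the iterator: it must be a destination. */
static bool sketch_may_copy_to(const struct sketch_iter *i)
{
	return !i->data_source;
}

/* copy_from_iter()-style helpers drain the iterator: it must be a source. */
static bool sketch_may_copy_from(const struct sketch_iter *i)
{
	return i->data_source;
}
```

Under this reading, a caller that sets up an iterator and then has data copied
into it wants the READ direction, and one that has data copied out of it wants
WRITE.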
I'm really tempted to slap
	if (WARN_ON(i->data_source))
		return 0;
into copy_to_iter() et al., along with its opposite for copy_from_iter().
And see who comes screaming...  Things like
        if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
                WARN_ON(1);
                return 0;
        }
in e.g. csum_and_copy_from_iter() would be replaced by that, and become
easier to understand...
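
A userspace sketch of what such a check might look like, assuming a simplified
iterator backed by one flat buffer.  Everything named sketch_* is made up for
illustration, and sketch_warn_on() only stands in for WARN_ON(); the real
helpers deal with several iterator flavours and would differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

static int sketch_warnings;	/* number of times the guard has fired */

/* Stand-in for WARN_ON(): report the problem and return the condition. */
static bool sketch_warn_on(bool cond, const char *what)
{
	if (cond) {
		sketch_warnings++;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical, much simplified iterator over a single flat buffer. */
struct sketch_iter {
	bool data_source;	/* true: set up with WRITE, memory supplies data */
	char *buf;		/* current position */
	size_t count;		/* bytes remaining */
};

/* Copy into the iterator; refuse if it was set up as a data source. */
static size_t sketch_copy_to_iter(const void *addr, size_t bytes,
				  struct sketch_iter *i)
{
	if (sketch_warn_on(i->data_source, "copy_to_iter() on a data source"))
		return 0;
	if (bytes > i->count)
		bytes = i->count;
	memcpy(i->buf, addr, bytes);
	i->buf += bytes;
	i->count -= bytes;
	return bytes;
}

/* Copy out of the iterator; refuse if it was set up as a destination. */
static size_t sketch_copy_from_iter(void *addr, size_t bytes,
				    struct sketch_iter *i)
{
	if (sketch_warn_on(!i->data_source, "copy_from_iter() on a destination"))
		return 0;
	if (bytes > i->count)
		bytes = i->count;
	memcpy(addr, i->buf, bytes);
	i->buf += bytes;
	i->count -= bytes;
	return bytes;
}
```

With a guard like this, a caller that got the direction backwards sees a
warning and a zero-length copy instead of a silent transfer the wrong way.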
These two are also getting it wrong, BTW:
drivers/target/target_core_file.c:340:  iov_iter_bvec(&iter, READ, bvec, sgl_nents, len);
drivers/target/target_core_file.c:476:  iov_iter_bvec(&iter, READ, bvec, nolb, len);
