Message-ID: <3215177.1684918030@warthog.procyon.org.uk>
Date:   Wed, 24 May 2023 09:47:10 +0100
From:   David Howells <dhowells@...hat.com>
To:     Christoph Hellwig <hch@...radead.org>
Cc:     dhowells@...hat.com, Jens Axboe <axboe@...nel.dk>,
        Al Viro <viro@...iv.linux.org.uk>,
        Matthew Wilcox <willy@...radead.org>, Jan Kara <jack@...e.cz>,
        Jeff Layton <jlayton@...nel.org>,
        David Hildenbrand <david@...hat.com>,
        Jason Gunthorpe <jgg@...dia.com>,
        Logan Gunthorpe <logang@...tatee.com>,
        Hillf Danton <hdanton@...a.com>,
        Christian Brauner <brauner@...nel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        linux-fsdevel@...r.kernel.org, linux-block@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: Extending page pinning into fs/direct-io.c

Christoph Hellwig <hch@...radead.org> wrote:

> > What I'd like to do is to make the GUP code not take a ref on the zero_page
> > if, say, FOLL_DONT_PIN_ZEROPAGE is passed in, and then make the bio cleanup
> > code always ignore the zero_page.
> 
> I don't think that'll work, as we can't mix different pin vs get types
> in a bio.  And that's really a good thing.

True - but I was thinking of just treating the zero_page specially and never
holding a pin or a ref on it.  It can be checked by address, e.g.:

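    static inline void bio_release_page(struct bio *bio, struct page *page)
    {
	    /* The zero_page is never pinned or reffed, so nothing to drop. */
	    if (page == ZERO_PAGE(0))
		    return;
	    /* Otherwise undo whichever kind of hold the bio took. */
	    if (bio_flagged(bio, BIO_PAGE_PINNED))
		    unpin_user_page(page);
	    else if (bio_flagged(bio, BIO_PAGE_REFFED))
		    put_page(page);
    }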

I'm slightly concerned about the possibility of overflowing the refcount.  The
problem is that it only takes about 2 million pins to do that (because the
zero_page isn't a large folio) - which is within reach of userspace.  Create
an 8GiB anon mmap and do a bunch of async DIO writes from it.  You won't hit
ENOMEM because it will stick ~2 million pointers to zero_page into the page
tables.
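
To spell out the arithmetic, here is a rough sketch.  It assumes 4KiB pages, a
signed 32-bit page refcount, and that each pin on a small page adds a bias of
1024 to that refcount (as GUP_PIN_COUNTING_BIAS does).  The helper and macro
names are made up for illustration and are not kernel APIs:

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: mirrors the kernel's GUP_PIN_COUNTING_BIAS (1 << 10). */
#define PIN_BIAS (UINT64_C(1) << 10)

/*
 * Assumption: the page refcount is a signed 32-bit counter, so it turns
 * negative once it reaches 2^31.
 */
#define REFCOUNT_LIMIT (UINT64_C(1) << 31)

/* Number of pins on one small page that bring the refcount to the limit. */
static inline uint64_t pins_to_overflow(void)
{
	return REFCOUNT_LIMIT / PIN_BIAS;
}

/* Number of pages (and so page-table entries) covering `bytes` of mapping. */
static inline uint64_t pages_in_mapping(uint64_t bytes, uint64_t page_size)
{
	assert(page_size != 0);
	return bytes / page_size;
}
```

Under those assumptions, pins_to_overflow() and
pages_in_mapping(8ULL << 30, 4096) both come to 2097152 (2^21), so an 8GiB
mapping backed entirely by the zero_page has just enough entries to get there.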

> > Something that I noticed is that the dio code seems to wangle to page bits on
> > the target pages for a DIO-read, which seems odd, but I'm not sure I fully
> > understand the code yet.
> 
> I don't understand this sentence.

I was looking at this:

    static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
    {
    ...
	    if (dio->is_async && dio_op == REQ_OP_READ && dio->should_dirty)
		    bio_set_pages_dirty(bio);
    ...
    }

but looking again, the lock is taken briefly and the dirty bit is set - which
is reasonable.  However, should we be doing it before starting the I/O?

David
