Message-ID: <3215177.1684918030@warthog.procyon.org.uk>
Date: Wed, 24 May 2023 09:47:10 +0100
From: David Howells <dhowells@...hat.com>
To: Christoph Hellwig <hch@...radead.org>
Cc: dhowells@...hat.com, Jens Axboe <axboe@...nel.dk>,
Al Viro <viro@...iv.linux.org.uk>,
Matthew Wilcox <willy@...radead.org>, Jan Kara <jack@...e.cz>,
Jeff Layton <jlayton@...nel.org>,
David Hildenbrand <david@...hat.com>,
Jason Gunthorpe <jgg@...dia.com>,
Logan Gunthorpe <logang@...tatee.com>,
Hillf Danton <hdanton@...a.com>,
Christian Brauner <brauner@...nel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
linux-fsdevel@...r.kernel.org, linux-block@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: Extending page pinning into fs/direct-io.c

Christoph Hellwig <hch@...radead.org> wrote:

> > What I'd like to do is to make the GUP code not take a ref on the zero_page
> > if, say, FOLL_DONT_PIN_ZEROPAGE is passed in, and then make the bio cleanup
> > code always ignore the zero_page.
>
> I don't think that'll work, as we can't mix different pin vs get types
> in a bio. And that's really a good thing.

True - but I was thinking of just treating the zero_page specially and never
holding a pin or a ref on it.  It can be checked by address, e.g.:

	static inline void bio_release_page(struct bio *bio, struct page *page)
	{
		/* The zero_page is never pinned or reffed, so nothing to drop. */
		if (page == ZERO_PAGE(0))
			return;
		if (bio_flagged(bio, BIO_PAGE_PINNED))
			unpin_user_page(page);
		else if (bio_flagged(bio, BIO_PAGE_REFFED))
			put_page(page);
	}

I'm slightly concerned about the possibility of overflowing the refcount. The
problem is that it only takes about 2 million pins to do that (because the
zero_page isn't a large folio) - which is within reach of userspace. Create
an 8GiB anon mmap and do a bunch of async DIO writes from it. You won't hit
ENOMEM because it will stick ~2 million pointers to zero_page into the page
tables.
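
As a rough check of the "~2 million" figure, here is a small userspace sketch
of the arithmetic.  It rests on assumptions that are not spelled out in this
message: 4KiB pages, a 32-bit signed page refcount, and each pin on a small
page adding a bias of 1024 to that refcount (which is what I understand
GUP_PIN_COUNTING_BIAS to be in include/linux/mm.h).  The SKETCH_* names and
the helper functions are made up purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, see the note above; none of these come from the message. */
#define SKETCH_PAGE_SIZE	4096ULL		/* 4KiB base pages */
#define SKETCH_PIN_BIAS		1024ULL		/* refcount added per pin */
#define SKETCH_REFCOUNT_LIMIT	(1ULL << 31)	/* first value past INT_MAX */

/* Number of pins on one small page that pushes its refcount past INT_MAX. */
static inline uint64_t pins_to_overflow(void)
{
	return SKETCH_REFCOUNT_LIMIT / SKETCH_PIN_BIAS;
}

/* Number of page-table entries needed to map len bytes with base pages. */
static inline uint64_t ptes_for_mapping(uint64_t len)
{
	return len / SKETCH_PAGE_SIZE;
}

/* An untouched 8GiB anon mapping has as many zero_page PTEs as pins needed. */
static inline void sketch_self_check(void)
{
	assert(ptes_for_mapping(8ULL << 30) == pins_to_overflow());
}
```

Under those assumptions both values come out to 2097152, so an 8GiB mapping
that is read-faulted but never written holds just enough pointers to the
zero_page to get the refcount to the limit if every one of them is pinned at
once.
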
> > Something that I noticed is that the dio code seems to wangle to page bits on
> > the target pages for a DIO-read, which seems odd, but I'm not sure I fully
> > understand the code yet.
>
> I don't understand this sentence.

I was looking at this:

	static inline void dio_bio_submit(struct dio *dio, struct dio_submit *sdio)
	{
		...
		if (dio->is_async && dio_op == REQ_OP_READ && dio->should_dirty)
			bio_set_pages_dirty(bio);
		...
	}

but looking again, the lock is taken briefly and the dirty bit is set - which
is reasonable.  However, should we be doing it before starting the I/O?

David