[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20200917185049.GV5449@casper.infradead.org>
Date: Thu, 17 Sep 2020 19:50:49 +0100
From: Matthew Wilcox <willy@...radead.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Michael Larabel <Michael@...haellarabel.com>,
Matthieu Baerts <matthieu.baerts@...sares.net>,
Amir Goldstein <amir73il@...il.com>,
Ted Ts'o <tytso@...gle.com>,
Andreas Dilger <adilger.kernel@...ger.ca>,
Ext4 Developers List <linux-ext4@...r.kernel.org>,
Jan Kara <jack@...e.cz>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>
Subject: Re: Kernel Benchmarking
On Thu, Sep 17, 2020 at 11:30:00AM -0700, Linus Torvalds wrote:
> On Thu, Sep 17, 2020 at 11:23 AM Matthew Wilcox <willy@...radead.org> wrote:
> >
> > Something like taking
> > the i_mmap_lock_read(file->f_mapping) in filemap_fault, then adding a
> > new VM_FAULT_I_MMAP_LOCKED bit so that do_read_fault() and friends add:
> >
> > if (ret & VM_FAULT_I_MMAP_LOCKED)
> > i_mmap_unlock_read(vmf->vma->vm_file->f_mapping);
> > else
> > unlock_page(page);
> >
> > ... want me to turn that into a real patch?
>
> I can't guarantee it's the right model - it does worry me how many
> places we might get that i_mmap_rwlock, and how long we migth hold it
> for writing, and what deadlocks it might cause when we take it for
> reading in the page fault path.
>
> But I think it might be very interesting as a benchmark patch and a
> trial balloon. Maybe it "just works".
Ahh. Here's a race this doesn't close:
int truncate_inode_page(struct address_space *mapping, struct page *page)
{
VM_BUG_ON_PAGE(PageTail(page), page);
if (page->mapping != mapping)
return -EIO;
truncate_cleanup_page(mapping, page);
delete_from_page_cache(page);
return 0;
}
truncate_cleanup_page() does
if (page_mapped(page)) {
pgoff_t nr = PageTransHuge(page) ? HPAGE_PMD_NR : 1;
unmap_mapping_pages(mapping, page->index, nr, false);
}
but ->mapping isn't cleared until delete_from_page_cache() many
instructions later. So we can get the lock and have a page which appears
to be not-truncated, only for it to get truncated on us later.
> I would _love_ for the page lock itself to be only (or at least
> _mainly_) about the actual IO synchronization on the page.
>
> That was the origin of it, the whole "protect all the complex state of
> a page" behavior kind of grew over time, since it was the only
> per-page lock we had.
Yes, I think that's a noble goal.
Powered by blists - more mailing lists