Message-ID: <20210713111139.GG12142@quack2.suse.cz>
Date: Tue, 13 Jul 2021 13:11:39 +0200
From: Jan Kara <jack@...e.cz>
To: "Darrick J. Wong" <djwong@...nel.org>
Cc: Jan Kara <jack@...e.cz>, linux-fsdevel@...r.kernel.org,
linux-ext4@...r.kernel.org, Christoph Hellwig <hch@...radead.org>,
Ted Tso <tytso@....edu>, Dave Chinner <david@...morbit.com>,
Matthew Wilcox <willy@...radead.org>, linux-mm@...ck.org,
linux-xfs@...r.kernel.org, linux-f2fs-devel@...ts.sourceforge.net,
linux-cifs@...r.kernel.org, ceph-devel@...r.kernel.org,
Christoph Hellwig <hch@....de>
Subject: Re: [PATCH 03/14] mm: Protect operations adding pages to page cache
with invalidate_lock
On Mon 12-07-21 18:25:14, Darrick J. Wong wrote:
> On Mon, Jul 12, 2021 at 06:55:54PM +0200, Jan Kara wrote:
> > @@ -2967,6 +2992,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> > pgoff_t max_off;
> > struct page *page;
> > vm_fault_t ret = 0;
> > + bool mapping_locked = false;
> >
> > max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
> > if (unlikely(offset >= max_off))
> > @@ -2988,15 +3014,30 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> > count_memcg_event_mm(vmf->vma->vm_mm, PGMAJFAULT);
> > ret = VM_FAULT_MAJOR;
> > fpin = do_sync_mmap_readahead(vmf);
> > + }
> > +
> > + if (!page) {
>
> Is it still necessary to re-evaluate !page here?
No, you are right, it is not necessary. I'll remove it.
> > retry_find:
> > + /*
> > + * See comment in filemap_create_page() why we need
> > + * invalidate_lock
> > + */
> > + if (!mapping_locked) {
> > + filemap_invalidate_lock_shared(mapping);
> > + mapping_locked = true;
> > + }
> > page = pagecache_get_page(mapping, offset,
> > FGP_CREAT|FGP_FOR_MMAP,
> > vmf->gfp_mask);
> > if (!page) {
> > if (fpin)
> > goto out_retry;
> > + filemap_invalidate_unlock_shared(mapping);
> > return VM_FAULT_OOM;
> > }
> > + } else if (unlikely(!PageUptodate(page))) {
> > + filemap_invalidate_lock_shared(mapping);
> > + mapping_locked = true;
> > }
> >
> > if (!lock_page_maybe_drop_mmap(vmf, page, &fpin))
> > @@ -3014,8 +3055,20 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> > * We have a locked page in the page cache, now we need to check
> > * that it's up-to-date. If not, it is going to be due to an error.
> > */
> > - if (unlikely(!PageUptodate(page)))
> > + if (unlikely(!PageUptodate(page))) {
> > + /*
> > + * The page was in cache and uptodate and now it is not.
> > + * Strange but possible since we didn't hold the page lock all
> > + * the time. Let's drop everything get the invalidate lock and
> > + * try again.
> > + */
> > + if (!mapping_locked) {
> > + unlock_page(page);
> > + put_page(page);
> > + goto retry_find;
> > + }
> > goto page_not_uptodate;
> > + }
> >
> > /*
> > * We've made it this far and we had to drop our mmap_lock, now is the
> > @@ -3026,6 +3079,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> > unlock_page(page);
> > goto out_retry;
> > }
> > + if (mapping_locked)
> > + filemap_invalidate_unlock_shared(mapping);
> >
> > /*
> > * Found the page and have a reference on it.
> > @@ -3056,6 +3111,7 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> >
> > if (!error || error == AOP_TRUNCATED_PAGE)
> > goto retry_find;
> > + filemap_invalidate_unlock_shared(mapping);
>
> Hm. I /think/ it's the case that mapping_locked==true always holds here
> because the new "The page was in cache and uptodate and now it is not."
> block above will take the invalidate_lock and retry pagecache_get_page,
> right?
Yes. The page_not_uptodate block can only be entered with mapping_locked ==
true. The only place that jumps to it is:
if (unlikely(!PageUptodate(page))) {
/*
* The page was in cache and uptodate and now it is not.
* Strange but possible since we didn't hold the page lock all
* the time. Let's drop everything get the invalidate lock and
* try again.
*/
if (!mapping_locked) {
unlock_page(page);
put_page(page);
goto retry_find;
}
goto page_not_uptodate;
}
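To make the invariant easier to check, here is a minimal userspace sketch of
that control flow. It is only a model, not the kernel code: struct
toy_mapping, struct toy_outcome and toy_fault() are hypothetical names, the
shared invalidate_lock is reduced to a holder counter, and the goes_stale
flag is an assumption standing in for a page that loses its uptodate state
before we lock it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not kernel API. */
struct toy_mapping {
	int shared_holders;	/* models invalidate_lock held in shared mode */
	bool page_cached;	/* a page exists in the "page cache" */
	bool page_uptodate;	/* what PageUptodate() would report */
};

struct toy_outcome {
	bool slow_path;			/* reached the page_not_uptodate step */
	bool locked_in_slow_path;	/* value of mapping_locked at that point */
};

struct toy_outcome toy_fault(struct toy_mapping *m, bool goes_stale)
{
	struct toy_outcome out = { false, false };
	bool mapping_locked = false;

	if (!m->page_cached) {
retry_find:
		if (!mapping_locked) {
			m->shared_holders++;	/* models lock_shared */
			mapping_locked = true;
		}
		m->page_cached = true;		/* models lookup with FGP_CREAT */
	} else if (!m->page_uptodate) {
		m->shared_holders++;
		mapping_locked = true;
	}

	/* Models the page losing its uptodate state before we lock it. */
	if (goes_stale) {
		m->page_uptodate = false;
		goes_stale = false;
	}

	if (!m->page_uptodate) {
		if (!mapping_locked)
			goto retry_find;	/* take the lock, look up again */
		/* page_not_uptodate: only reachable with the lock held */
		assert(mapping_locked);
		out.slow_path = true;
		out.locked_in_slow_path = mapping_locked;
		m->page_uptodate = true;	/* models a successful read */
	}

	if (mapping_locked)
		m->shared_holders--;		/* models unlock_shared */
	return out;
}
```

In this model the only way to reach the slow path without the lock is
through the retry, which takes the lock first, so the unconditional unlock
after page_not_uptodate is balanced.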
> >
> > return VM_FAULT_SIGBUS;
> >
> > @@ -3067,6 +3123,8 @@ vm_fault_t filemap_fault(struct vm_fault *vmf)
> > */
> > if (page)
> > put_page(page);
> > + if (mapping_locked)
> > + filemap_invalidate_unlock_shared(mapping);
>
> Hm. I think this looks ok, even though this patch now contains the
> subtlety that we've both hoisted the xfs mmaplock to page cache /and/
> reduced the scope of the invalidate_lock.
>
> As for fancy things like remap_range, I think they're still safe with
> this latest iteration because those functions grab the invalidate_lock
> in exclusive mode and invalidate the mappings before proceeding, which
> means that other programs will never find the lockless path (i.e. page
> locked, uptodate, and attached to the mapping) and will instead block on
> the invalidate lock until the remap operation completes. Is that
> right?
Correct. For operations such as hole punch, or for the destination of
remap_range, we take invalidate_lock exclusively and invalidate the page
cache in the affected range. No new pages can be created in that range until
invalidate_lock is dropped: the paths that create pages without holding
i_rwsem are read, readahead, and fault, and all of them take invalidate_lock
when they need to create a page.
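The same argument as a small single-threaded model, in case it helps. All
names here (struct toy_cache and the toy_* helpers) are hypothetical, the
lock is a pair of counters with try-lock semantics, and a failed try-lock
stands in for the point where the real code would sleep on invalidate_lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

/* Hypothetical model of a mapping; not kernel API. */
struct toy_cache {
	int readers;			/* shared holders of the lock */
	bool writer;			/* exclusive holder present */
	bool present[TOY_NPAGES];	/* which page indices are cached */
};

/* Lockless lookup: only ever sees pages that are already cached. */
bool toy_lookup(const struct toy_cache *c, size_t idx)
{
	return idx < TOY_NPAGES && c->present[idx];
}

/*
 * Page creation, as done by read, readahead and fault in the model.
 * Returns false where the real code would block on the lock.
 */
bool toy_create_page(struct toy_cache *c, size_t idx)
{
	if (idx >= TOY_NPAGES || c->writer)
		return false;
	c->readers++;			/* lock shared */
	c->present[idx] = true;
	c->readers--;			/* unlock shared */
	return true;
}

/* Hole punch start: lock exclusively, then drop pages in [start, end). */
bool toy_punch_begin(struct toy_cache *c, size_t start, size_t end)
{
	if (c->writer || c->readers > 0)
		return false;
	c->writer = true;
	for (size_t i = start; i < end && i < TOY_NPAGES; i++)
		c->present[i] = false;
	return true;
}

/* Hole punch end: drop the exclusive lock. */
void toy_punch_end(struct toy_cache *c)
{
	assert(c->writer);
	c->writer = false;
}
```

Between toy_punch_begin() and toy_punch_end(), a lookup in the punched
range misses and toy_create_page() refuses to run, so nothing in the model
can repopulate the range until the exclusive holder is done.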
There's also the case, which someone pointed out, that the *source* of
remap_range needs to be protected (but only from modifications through
mmap). This is achieved by taking invalidate_lock in the .page_mkwrite
handlers, so it is not affected by these changes to filemap_fault().
Honza
--
Jan Kara <jack@...e.com>
SUSE Labs, CR