Message-ID: <YaklihoYztAoKfxX@casper.infradead.org>
Date: Thu, 2 Dec 2021 19:59:06 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Jann Horn <jannh@...gle.com>, Jan Kara <jack@...e.cz>,
Kirill Shutemov <kirill@...temov.name>,
Oleg Nesterov <oleg@...hat.com>,
Christoph Hellwig <hch@....de>, Linux-MM <linux-mm@...ck.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
Mike Kravetz <mike.kravetz@...cle.com>
Subject: Re: [5.4 PATCH] mm/gup: Do not force a COW break on file-backed memory

On Thu, Dec 02, 2021 at 10:54:48AM -0800, Linus Torvalds wrote:
> On Wed, Dec 1, 2021 at 8:11 PM Matthew Wilcox <willy@...radead.org> wrote:
> >
> > The other patch we've been kicking around (and works) is:
> >
> >  static inline bool should_force_cow_break(struct vm_area_struct *vma, unsigned int flags)
> >  {
> > -       return is_cow_mapping(vma->vm_flags) && (flags & FOLL_GET);
> > +       return is_cow_mapping(vma->vm_flags) &&
> > +               (!(vma->vm_flags & VM_DENYWRITE)) && (flags & FOLL_GET);
> >  }
>
> That patch makes no sense to me.
>
> It may "work", but it doesn't actually do anything sensible or really
> fix the problem that I can tell.

Oh absolutely, it's semantically nonsense. The only reason it fixes the
problem is that VM_DENYWRITE VMAs are the only ones considered for the
RO_THP merging, so they're the only ones which we've seen causing a
problem.

> I suspect a real fix would be bigger and more invasive.

Darn. I was hoping you were going to say something like "The real
problem is follow_trans_huge_pmd() is complete garbage and it should
just do X, Y and Z". Or "When we force on FOLL_WRITE, we should also
force on FOLL_SPLIT_PMD".

> If the answer is not to backport all the other changes (and they were
> _really_ invasive), I think one answer may be to simply move the
> "should_force_cow_break()" down to below the point where you've looked
> up the page.
>
> Then you can actually look at "is this a file mapped page", and say
> "if so, that's ok, we can return it as-is".
>
> Otherwise, you do something like
>
>         foll_flags |= FOLL_WRITE;
>         free_page(page);
>         goto repeat;
>
> to repeat the loop (now with FOLL_WRITE).
>
> So the patch is bigger and more involved, because you would have done
> the page lookup (for reading) and now notice "Oh, I need it for
> writing instead" so you need to undo and re-do.
>
> But at least - unlike backporting everything else - it would be
> limited to that one __get_user_pages() function.
>
> Hmm?
>
> (And you'd need to handle that follow_hugetlb_page() case too, not
> just the follow_page_mask() one)

Thanks, I'll take a look.
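
For reference, a rough userspace model of the control flow suggested above:
look the page up for reading first, return it as-is if it is file-backed,
otherwise drop the reference and repeat the lookup with the write flag set.
This is only a sketch under stated assumptions, not kernel code. Every
sk_/SK_ name, the flag values and the struct layouts are made up for
illustration, and the model ignores the follow_hugetlb_page() path entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for FOLL_WRITE / FOLL_GET; values are arbitrary. */
#define SK_FOLL_WRITE 0x01u
#define SK_FOLL_GET   0x04u

/* Hypothetical page model: only what the decision needs. */
struct sk_page {
	bool file_backed;	/* models "is this a file mapped page" */
	int refcount;		/* references taken by lookups */
};

/* Hypothetical VMA model. */
struct sk_vma {
	bool cow_mapping;		/* models is_cow_mapping(vma->vm_flags) */
	struct sk_page *ro_page;	/* what a read lookup finds */
	struct sk_page *cow_page;	/* what a write lookup finds after COW */
};

/* Models the page lookup: pick a page by access type and take a reference. */
static struct sk_page *sk_follow_page(struct sk_vma *vma, unsigned int flags)
{
	struct sk_page *page =
		(flags & SK_FOLL_WRITE) ? vma->cow_page : vma->ro_page;

	page->refcount++;
	return page;
}

/* Models dropping the reference taken by the lookup. */
static void sk_put_page(struct sk_page *page)
{
	assert(page != NULL && page->refcount > 0);
	page->refcount--;
}

/* The check runs after the lookup, so it can inspect the page itself. */
static struct sk_page *sk_get_one_page(struct sk_vma *vma, unsigned int flags)
{
	unsigned int foll_flags = flags;
	struct sk_page *page;

repeat:
	page = sk_follow_page(vma, foll_flags);

	if (vma->cow_mapping && (foll_flags & SK_FOLL_GET) &&
	    !(foll_flags & SK_FOLL_WRITE) && !page->file_backed) {
		/* Not file-backed: undo the read lookup, redo it for write. */
		foll_flags |= SK_FOLL_WRITE;
		sk_put_page(page);
		goto repeat;
	}

	/* File-backed, or already looked up for write: return as-is. */
	return page;
}
```

The retry can happen at most once per call, because the second pass already
has the write flag set and so skips the check.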