Message-ID: <20210203223017.GK6468@xz-x1>
Date: Wed, 3 Feb 2021 17:30:17 -0500
From: Peter Xu <peterx@...hat.com>
To: Mike Kravetz <mike.kravetz@...cle.com>
Cc: linux-kernel@...r.kernel.org, linux-mm@...ck.org,
Wei Zhang <wzam@...zon.com>,
Matthew Wilcox <willy@...radead.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Jason Gunthorpe <jgg@...pe.ca>,
Gal Pressman <galpress@...zon.com>,
Christoph Hellwig <hch@....de>,
Andrea Arcangeli <aarcange@...hat.com>,
Jan Kara <jack@...e.cz>,
Kirill Shutemov <kirill@...temov.name>,
David Gibson <david@...son.dropbear.id.au>,
Mike Rapoport <rppt@...ux.vnet.ibm.com>,
Kirill Tkhai <ktkhai@...tuozzo.com>,
Jann Horn <jannh@...gle.com>,
Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH 4/4] hugetlb: Do early cow when page pinned on src mm
On Wed, Feb 03, 2021 at 02:04:30PM -0800, Mike Kravetz wrote:
> > @@ -3816,6 +3832,54 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
> > }
> > set_huge_swap_pte_at(dst, addr, dst_pte, entry, sz);
> > } else {
> > + entry = huge_ptep_get(src_pte);
> > + ptepage = pte_page(entry);
> > + get_page(ptepage);
> > +
> > + if (unlikely(page_needs_cow_for_dma(vma, ptepage))) {
> > + /* This is very possibly a pinned huge page */
> > + if (!prealloc) {
> > + /*
> > + * Preallocate the huge page without
> > + * tons of locks since we could sleep.
> > + * Note: we can't use any reservation
> > + * because the page will be exclusively
> > + * owned by the child later.
> > + */
> > + put_page(ptepage);
> > + spin_unlock(src_ptl);
> > + spin_unlock(dst_ptl);
> > + prealloc = alloc_huge_page(vma, addr, 0);
>
> One quick question:
>
> The comment says we can't use any reservation, and I agree. However, the
> alloc_huge_page call has 0 as the avoid_reserve argument. Shouldn't that
> be !0 to avoid reserves?

Good point. I obviously wanted to skip the reservation check, but the inverted
name successfully tricked me. :)

That said, I did check the reservation code, and it does not seem extremely
important here: when we fork and copy the vma, we have already dropped the vma
resv map:

	if (is_vm_hugetlb_page(tmp))
		reset_vma_resv_huge_pages(tmp);

Then in alloc_huge_page() we check vma_resv_map() in mostly every place where
we'd check avoid_reserve too (either in vma_needs_reservation(), or when
calculating deferred_reserve). So avoid_reserve seems to matter mostly when
vma_resv_map() exists.
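
To spell that argument out, here is a toy userspace model. It is only a sketch
of the reasoning above, not the kernel code: every toy_* name is made up for
illustration, and the real alloc_huge_page() does considerably more than this
single predicate. The assumption being modeled is that a reservation can only
be consumed when the caller does not ask to avoid it and the vma still has a
resv map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins, NOT kernel types: only the one field the argument needs. */
struct toy_resv_map {
	int unused;
};

struct toy_vma {
	struct toy_resv_map *resv_map;
};

/*
 * Hypothetical model of fork copying a hugetlb vma: the child gets a copy
 * of the parent's vma and then has its resv map dropped, which is the
 * effect attributed to reset_vma_resv_huge_pages() above.
 */
static inline struct toy_vma toy_fork_vma(const struct toy_vma *parent)
{
	struct toy_vma child;

	assert(parent != NULL);
	child = *parent;
	child.resv_map = NULL;
	return child;
}

/*
 * Hypothetical predicate: may an allocation for this vma consume a
 * reserved page?  Assumption: only if the caller did not set
 * avoid_reserve AND the vma still has a resv map.
 */
static inline bool toy_may_use_reserve(const struct toy_vma *vma,
				       bool avoid_reserve)
{
	if (avoid_reserve)
		return false;
	return vma->resv_map != NULL;
}
```

In this model a vma returned by toy_fork_vma() gets the same answer from
toy_may_use_reserve() for either value of avoid_reserve, while a parent vma
that still has its resv map does not. That is the sense in which the flag looks
mostly redundant on the fork path.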

But I completely agree I should pass in "1" here in v2.

Thanks,

--
Peter Xu