Message-ID: <CADrL8HVDyT6D-O=BoeHkA9oaRPLJR62Sxba8FdTjMaQYW-Ttfw@mail.gmail.com>
Date: Tue, 17 Jan 2023 13:38:24 -0800
From: James Houghton <jthoughton@...gle.com>
To: Peter Xu <peterx@...hat.com>
Cc: Mike Kravetz <mike.kravetz@...cle.com>,
Muchun Song <songmuchun@...edance.com>,
David Hildenbrand <david@...hat.com>,
David Rientjes <rientjes@...gle.com>,
Axel Rasmussen <axelrasmussen@...gle.com>,
Mina Almasry <almasrymina@...gle.com>,
"Zach O'Keefe" <zokeefe@...gle.com>,
Manish Mishra <manish.mishra@...anix.com>,
Naoya Horiguchi <naoya.horiguchi@....com>,
"Dr . David Alan Gilbert" <dgilbert@...hat.com>,
"Matthew Wilcox (Oracle)" <willy@...radead.org>,
Vlastimil Babka <vbabka@...e.cz>,
Baolin Wang <baolin.wang@...ux.alibaba.com>,
Miaohe Lin <linmiaohe@...wei.com>,
Yang Shi <shy828301@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 35/46] hugetlb: add MADV_COLLAPSE for hugetlb
> > + if (curr < end) {
> > + /* Don't hold the VMA lock for too long. */
> > + hugetlb_vma_unlock_write(vma);
> > + cond_resched();
> > + hugetlb_vma_lock_write(vma);
>
> The intention is good here, but IIUC this will cause the vma lock to be
> taken after the i_mmap_rwsem, which can cause circular deadlocks. To do
> this properly we'd also need to release the i_mmap_rwsem.
Sorry if you spent a long time debugging this! I sent a reply a week
ago about this too.
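
To make the ordering point concrete, here is a rough userspace analogy,
not the kernel code. Two pthread mutexes stand in for the hugetlb VMA lock
and i_mmap_rwsem, and relax_both_locks() is a made-up name. It assumes the
caller took the outer lock first and the inner lock second; a reschedule
point that respects that order drops the inner lock, then the outer one,
yields, and retakes them outer first.

```c
#include <pthread.h>
#include <sched.h>

/* Hypothetical stand-ins for the two kernel locks discussed above. */
static pthread_mutex_t outer_lock = PTHREAD_MUTEX_INITIALIZER; /* plays the hugetlb VMA lock */
static pthread_mutex_t inner_lock = PTHREAD_MUTEX_INITIALIZER; /* plays i_mmap_rwsem */

/*
 * Caller holds outer_lock, then inner_lock. Drop both in reverse order,
 * yield the CPU, and retake them in the original order so the inner lock
 * is never held while waiting for the outer one.
 * Returns 0 on success, nonzero if any pthread call failed.
 */
static int relax_both_locks(void)
{
	int err = 0;

	err |= pthread_mutex_unlock(&inner_lock);
	err |= pthread_mutex_unlock(&outer_lock);
	sched_yield();
	err |= pthread_mutex_lock(&outer_lock);
	err |= pthread_mutex_lock(&inner_lock);
	return err;
}
```

Dropping only the outer lock and retaking it while still holding the inner
one is the inversion described in the quoted text.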
>
> However, it may make the resched() logic overcomplicated. Meanwhile, for 2M
> huge pages I think this will be called for each 2M range, which can be too
> fine grained, so it looks like the "curr < end" check is a bit too aggressive.
>
> The other thing I noticed is that the long period of mmu notifier
> invalidation between start -> end will (in a real-life VM context) cause
> vcpu threads to spin.
>
> I _think_ it's because is_page_fault_stale() (during a vmexit following a
> kvm page fault) always reports true during the long procedure of
> MADV_COLLAPSE if it is called upon a large range, so even if we release
> both locks here it may not help tremendously with the VM migration use case
> because of the long-standing mmu notifier invalidation procedure.
Oh... indeed. Thanks for pointing that out.
>
> To summarize: I think a simpler starting version of hugetlb MADV_COLLAPSE
> can drop this "if" block, and let the userapp decide the step size of COLLAPSE?
I'll drop this resched logic. Thanks Peter.
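
For illustration, a userspace caller picking its own step size could look
roughly like the sketch below. It is only a sketch: collapse_in_steps() and
dry_run_advise() are made-up names, the fallback value 25 for MADV_COLLAPSE
is the asm-generic uapi value, and it assumes the hugetlb support from this
series is present and that the caller keeps addr and step aligned to the
huge page size. The advise callback has the same signature as madvise(2),
so madvise can be passed directly.

```c
#include <stddef.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25 /* asm-generic uapi value; older libc headers may lack it */
#endif

/* Same signature as madvise(2), so the real call can be plugged in. */
typedef int (*advise_fn)(void *addr, size_t len, int advice);

/*
 * Issue MADV_COLLAPSE over [addr, addr + len) in chunks of at most
 * `step` bytes. Returns the number of calls made, or -1 if step is 0
 * or any call fails.
 */
static long collapse_in_steps(void *addr, size_t len, size_t step,
			      advise_fn advise)
{
	char *cur = addr;
	size_t left = len;
	long calls = 0;

	if (step == 0)
		return -1;

	while (left > 0) {
		size_t chunk = left < step ? left : step;

		if (advise(cur, chunk, MADV_COLLAPSE) != 0)
			return -1;
		cur += chunk;
		left -= chunk;
		calls++;
	}
	return calls;
}

/* Hypothetical dry-run stub: only counts bytes, touches no memory. */
static size_t dry_run_bytes;

static int dry_run_advise(void *addr, size_t len, int advice)
{
	(void)addr;
	(void)advice;
	dry_run_bytes += len;
	return 0;
}
```

A real caller would pass madvise instead of dry_run_advise. A smaller step
keeps each invalidation window short at the cost of more syscalls.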