Message-ID: <ZcOQ-0pzA16AEbct@casper.infradead.org>
Date: Wed, 7 Feb 2024 14:17:31 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Will Deacon <will@...nel.org>
Cc: Nanyong Sun <sunnanyong@...wei.com>,
Catalin Marinas <catalin.marinas@....com>, muchun.song@...ux.dev,
akpm@...ux-foundation.org, anshuman.khandual@....com,
wangkefeng.wang@...wei.com, linux-arm-kernel@...ts.infradead.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize
On Wed, Feb 07, 2024 at 12:11:25PM +0000, Will Deacon wrote:
> On Wed, Feb 07, 2024 at 11:21:17AM +0000, Matthew Wilcox wrote:
> > The pte lock cannot be taken in irq context (which I think is what
> > you're asking?) While it is not possible to reason about all users of
> > struct page, we are somewhat relieved of that work by noting that this is
> > only for hugetlbfs, so we don't need to reason about slab, page tables,
> > netmem or zsmalloc.
>
> My concern is that an interrupt handler tries to access a 'struct page'
> which faults due to another core splitting a pmd mapping for the vmemmap.
> In this case, I think we'll end up trying to resolve the fault from irq
> context, which will try to take the spinlock.
Yes, this absolutely can happen (with this patch), and this patch should
be dropped for now.
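
To spell out the pattern being objected to (this is not the patch's
actual code; vmemmap_split_lock is a made-up name standing in for
whatever pte/pmd lock serialises the split):

#include <linux/spinlock.h>

static DEFINE_SPINLOCK(vmemmap_split_lock);

/*
 * Called from the fault path when an access to the vmemmap trips over
 * a half-done pmd split.  If the faulting access came from an
 * interrupt handler, this spin_lock() now runs in irq context; if the
 * interrupted CPU is the one doing the split and holding the lock, we
 * spin here forever.
 */
static void handle_vmemmap_fault(unsigned long addr)
{
	spin_lock(&vmemmap_split_lock);
	/* wait for / redo the pmd->pte remap covering 'addr' ... */
	spin_unlock(&vmemmap_split_lock);
}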
While this array of ~512 pages has been allocated to hugetlbfs, and one
might think that there could therefore be no remaining references to
them, another CPU can still hold a pointer to one of these struct pages
(eg while attempting a speculative page cache reference or
get_user_pages_fast()).  That means it will try to call

	atomic_add_unless(&page->_refcount, 1, 0);
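
Roughly, the pattern on that other CPU looks like this (just a sketch,
not the actual page cache / GUP-fast code):

#include <linux/mm.h>

/*
 * The page pointer was found without holding any lock, so the page may
 * already have been freed and reused.  get_page_unless_zero() is the
 * atomic_add_unless(&page->_refcount, 1, 0) above, ie a read-modify-
 * write on the struct page itself.
 */
static bool try_grab_speculative(struct page *page)
{
	if (!get_page_unless_zero(page))
		return false;	/* refcount already hit zero; caller retries */

	/*
	 * The caller must now recheck that the page is still the one it
	 * looked up (mapping/index, pte still present, ...) and
	 * put_page() if not.
	 */
	return true;
}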
Actually, I wonder if this isn't a problem on x86 too? Do we need to
explicitly go through an RCU grace period before freeing the pages
for use by somebody else?
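
ie something like the below before the old vmemmap pages are handed
back to the allocator (sketch only; free_vmemmap_pages() is a stand-in
name, not the function in the tree):

#include <linux/rcupdate.h>
#include <linux/mm.h>

/* Stand-in for wherever the now-unused vmemmap pages get freed. */
static void free_vmemmap_pages(struct list_head *pages)
{
	struct page *page, *next;

	/*
	 * Speculative page-cache lookups and GUP-fast run under
	 * rcu_read_lock() (or with interrupts disabled, which also
	 * blocks synchronize_rcu() these days).  Waiting for a grace
	 * period here means none of them can still be dereferencing a
	 * struct page that lived in these pages by the time they are
	 * reused.
	 */
	synchronize_rcu();

	list_for_each_entry_safe(page, next, pages, lru) {
		list_del(&page->lru);
		__free_page(page);
	}
}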