Date: Wed, 7 Feb 2024 14:17:31 +0000
From: Matthew Wilcox <willy@...radead.org>
To: Will Deacon <will@...nel.org>
Cc: Nanyong Sun <sunnanyong@...wei.com>,
	Catalin Marinas <catalin.marinas@....com>, muchun.song@...ux.dev,
	akpm@...ux-foundation.org, anshuman.khandual@....com,
	wangkefeng.wang@...wei.com, linux-arm-kernel@...ts.infradead.org,
	linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: [PATCH v3 0/3] A Solution to Re-enable hugetlb vmemmap optimize

On Wed, Feb 07, 2024 at 12:11:25PM +0000, Will Deacon wrote:
> On Wed, Feb 07, 2024 at 11:21:17AM +0000, Matthew Wilcox wrote:
> > The pte lock cannot be taken in irq context (which I think is what
> > you're asking?)  While it is not possible to reason about all users of
> > struct page, we are somewhat relieved of that work by noting that this is
> > only for hugetlbfs, so we don't need to reason about slab, page tables,
> > netmem or zsmalloc.
> 
> My concern is that an interrupt handler tries to access a 'struct page'
> which faults due to another core splitting a pmd mapping for the vmemmap.
> In this case, I think we'll end up trying to resolve the fault from irq
> context, which will try to take the spinlock.

Yes, this absolutely can happen (with this patch), and this patch should
be dropped for now.

While this array of ~512 pages has been allocated to hugetlbfs, and one
would think that there would be no way that there could still be
references to them, another CPU can have a pointer to this struct page
(e.g. attempting a speculative page cache reference or
get_user_pages_fast()).  That means it will try to call
atomic_add_unless(&page->_refcount, 1, 0);
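
To make the semantics of that call concrete, here is a minimal userspace
sketch using C11 atomics.  It is not the kernel implementation; struct
fake_page, add_unless(), try_get_speculative() and put_ref() are
hypothetical names made up for illustration, and only the refcount is
modelled.  The point is that even an attempt which fails (refcount is 0)
still has to load from the struct page, so the memory behind it must be
mapped at that moment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct fake_page {
	atomic_int refcount;
};

/* Add 'a' to *v unless *v == u.  Returns true if the add happened. */
static inline bool add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);	/* this load touches the page struct */

	while (old != u) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

/* Speculative get: only succeeds if somebody already holds a reference. */
static inline bool try_get_speculative(struct fake_page *page)
{
	return add_unless(&page->refcount, 1, 0);
}

/* Drop a reference previously obtained. */
static inline void put_ref(struct fake_page *page)
{
	int old = atomic_fetch_sub(&page->refcount, 1);

	assert(old > 0);	/* dropping a reference we never held is a bug */
}
```

A caller holding a stale pointer would call try_get_speculative() and
back off on false, but the access to the refcount has already been made
by then.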

Actually, I wonder if this isn't a problem on x86 too?  Do we need to
explicitly go through an RCU grace period before freeing the pages
for use by somebody else?

