Message-ID: <Z6_d1MvorGFpxdU1@MacBook-Air-5.local>
Date: Sat, 15 Feb 2025 09:20:36 +0900
From: "Harry (Hyeonggon) Yoo" <42.hyeyoo@...il.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@...el.com>,
	linux-kernel@...r.kernel.org, osalvador@...e.de, byungchul@...com,
	dave.hansen@...ux.intel.com, luto@...nel.org, peterz@...radead.org,
	akpm@...ux-foundation.org, max.byungchul.park@...com,
	max.byungchul.park@...il.com
Subject: Re: [RFC 1/1] x86/vmemmap: Add missing update of PML4 table / PML5
 table entry

On Fri, Feb 14, 2025 at 11:57:50AM -0800, Dave Hansen wrote:
> On 2/14/25 11:51, Gwan-gyeong Mun wrote:
> > when performing vmemmap populate, if the entry of the PML4 table/PML5 table
> > pointing to the target virtual address has never been updated, a page fault
> > occurs when the memset(start) called from the vmemmap_use_new_sub_pmd()
> > execution flow.
> 
> "Page fault" meaning oops? Or something that we manage to handle and
> return from without oopsing?

It means an oops, because the kernel accesses a part of vmemmap that is not
(yet) populated in the current process's page table.

This oops was observed after increasing the size of struct page (as part of
developing a debug feature), but the real cause is that page table entries are
installed only in init_mm's page table and sync'd later. In the meantime, the
process that triggered the hot-plug accesses the new portion of vmemmap.

If the process does not directly use init_mm's page table (as swapper does),
this oops can occur (e.g., I was able to trigger it with
`sudo modprobe hmm_test` after increasing the size of struct page).
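
To illustrate the ordering problem described above, here is a minimal
user-space sketch in C. It is a toy model and not kernel code: the names
toy_mm, toy_init_mm, toy_mm_create(), toy_populate_init_mm(),
toy_access_ok() and toy_sync_entry() are all hypothetical, and a top-level
table is reduced to a small array in which a nonzero entry means
"populated". It only assumes what is stated above: new entries are installed
in the reference (init_mm) table first and copied to other tables later.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PTRS_PER_PGD 8   /* toy size; real x86-64 tables have 512 entries */

/* Toy top-level page table: a nonzero entry means "populated". */
struct toy_mm {
    unsigned long pgd[TOY_PTRS_PER_PGD];
};

/* Stand-in for init_mm, the reference kernel page table. */
static struct toy_mm toy_init_mm;

/* A new process starts with a copy of the entries known at creation time. */
static void toy_mm_create(struct toy_mm *mm)
{
    memcpy(mm->pgd, toy_init_mm.pgd, sizeof(mm->pgd));
}

/* Hot-plug path: the new entry is installed in the reference table only. */
static void toy_populate_init_mm(size_t idx, unsigned long entry)
{
    toy_init_mm.pgd[idx] = entry;
}

/* An access through 'mm' succeeds only if mm's own entry is populated. */
static bool toy_access_ok(const struct toy_mm *mm, size_t idx)
{
    return mm->pgd[idx] != 0;
}

/* Later sync step: propagate one top-level entry into a process table. */
static void toy_sync_entry(struct toy_mm *mm, size_t idx)
{
    mm->pgd[idx] = toy_init_mm.pgd[idx];
}
```

If a process table is created with toy_mm_create(), and an entry is then
added with toy_populate_init_mm(), toy_access_ok() succeeds for toy_init_mm
but fails for the process table until toy_sync_entry() has run. That window
corresponds to the memset() on the new vmemmap range mentioned in the patch
description.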

> > This fixes the problem of using the virtual address without updating the
> > entry in the PML4 table or PML5 table. But this is a temporary solution to
> > prevent page fault problems, and it requires improvement of the routine
> > that updates the missing entry in the PML4 table or PML5 table.
> 
> Can we please skip past the band-aid and go to the real fix?

Yes, of course it'd be best to skip the temporary fix.
The intention was to report and discuss the problem, with this fix as a
starting point.

-- 
Harry
