Message-ID: <36aabfe5-b862-404b-8175-ebe5dab59427@csgroup.eu>
Date: Fri, 5 Sep 2025 11:43:12 +0200
From: Christophe Leroy <christophe.leroy@...roup.eu>
To: Andrew Donnellan <ajd@...ux.ibm.com>,
 Michael Ellerman <mpe@...erman.id.au>, Nicholas Piggin <npiggin@...il.com>,
 Madhavan Srinivasan <maddy@...ux.ibm.com>
Cc: linux-kernel@...r.kernel.org, linuxppc-dev@...ts.ozlabs.org,
 Erhard Furtner <erhard_f@...lbox.org>
Subject: Re: [PATCH] powerpc/32: Remove PAGE_KERNEL_TEXT to fix startup
 failure



On 05/09/2025 at 08:57, Andrew Donnellan wrote:
> On Thu, 2025-09-04 at 18:33 +0200, Christophe Leroy wrote:
>> PAGE_KERNEL_TEXT is an old macro that is used to tell kernel whether
>> kernel text has to be mapped read-only or read-write based on build
>> time options.
>>
>> But nowadays, with functionalities like jump_labels, static links,
>> etc ... more or less all kernels need to be read-write at some
>> point, and some combinations of configs failed to work due to
>> inaccurate setting of PAGE_KERNEL_TEXT. On the other hand, today
>> we have CONFIG_STRICT_KERNEL_RWX which implements a more controlled
>> access to kernel modifications.
>>
>> Instead of trying to keep PAGE_KERNEL_TEXT accurate with all
>> possible options that may imply kernel text modification, always
>> set kernel text read-write at startup and rely on
>> CONFIG_STRICT_KERNEL_RWX to provide accurate protection.
>>
>> Reported-by: Erhard Furtner <erhard_f@...lbox.org>
>> Closes: https://lore.kernel.org/all/342b4120-911c-4723-82ec-d8c9b03a8aef@mailbox.org/
>> Signed-off-by: Christophe Leroy <christophe.leroy@...roup.eu>
> 
> The original issue that Erhard and I were investigating was why the latest
> version of the PowerPC page table check series[0] was failing on his G4, when
> built as part of a config with many other debugging options enabled.
> 
> With further instrumentation, it turns out that this was due to a failed
> instruction patch while setting up a jump label for the
> page_table_check_disabled static key, which was being checked in
> page_table_check_pte_clear(), which was in turn inlined ultimately into
> debug_vm_pgtable().
> 
> This patch seems to fix the problem, so:
> 
> Tested-by: Andrew Donnellan <ajd@...ux.ibm.com>
> 
> But I'm still curious about why I only see the issue when:
> 
>    (a) CONFIG_KFENCE=y (even when disabled using kfence.sample_interval=0) -
> noting that changing CONFIG_KFENCE doesn't change the definition of
> PAGE_KERNEL_TEXT; and
> 
>    (b) when the jump label ends up in a __init function (removing __init from
> debug_vm_pgtable() and its associated functions, or changing the code in such a
> way that the static key check doesn't get inlined, resolves the issue, and
> similarly for test_static_call_init() when CONFIG_STATIC_CALL_SELFTEST=y).
> 
> I don't understand the mm code well enough to make sense of this.

That makes sense. When CONFIG_KFENCE is selected, only text and rodata 
are mapped with BATs. Everything else including inittext is mapped with 
pages. When CONFIG_KFENCE and CONFIG_DEBUG_PAGEALLOC are not selected, 
we map as much as possible with BATs.

And as you can see below, BATs are mapped with PAGE_KERNEL_X not with 
PAGE_KERNEL_TEXT.

Everything happens in the code below:

static unsigned long __init __mmu_mapin_ram(unsigned long base, unsigned long top)
{
	int idx;

	while ((idx = find_free_bat()) != -1 && base != top) {
		unsigned int size = bat_block_size(base, top);

		if (size < 128 << 10)
			break;
		setbat(idx, PAGE_OFFSET + base, base, size, PAGE_KERNEL_X);
		base += size;
	}

	return base;
}

unsigned long __init mmu_mapin_ram(unsigned long base, unsigned long top)
{
	unsigned long done;
	unsigned long border = (unsigned long)__srwx_boundary - PAGE_OFFSET;
	unsigned long size;

	size = roundup_pow_of_two((unsigned long)_einittext - PAGE_OFFSET);
	setibat(0, PAGE_OFFSET, 0, size, PAGE_KERNEL_X);

	if (debug_pagealloc_enabled_or_kfence()) {
		pr_debug_once("Read-Write memory mapped without BATs\n");
		if (base >= border)
			return base;
		if (top >= border)
			top = border;
	}

	if (!strict_kernel_rwx_enabled() || base >= border || top <= border)
		return __mmu_mapin_ram(base, top);

	done = __mmu_mapin_ram(base, border);
	if (done != border)
		return done;

	return __mmu_mapin_ram(border, top);
}
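
To make the effect of the debug_pagealloc_enabled_or_kfence() branch easier to see, here is a minimal userspace sketch of the range clamping only. It is a simplified model, not kernel code: model_bat_end() and model_is_bat_mapped() are hypothetical names, and the model assumes there are always enough free BATs and ignores the block-size constraints of bat_block_size() and the 128k minimum. Under those assumptions, it shows that an address above the border (such as inittext) is covered by BATs only when neither KFENCE nor DEBUG_PAGEALLOC is in play.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of mmu_mapin_ram(): returns the end of the range
 * that would be covered by BATs. Assumes unlimited BATs and no block
 * size constraints, so __mmu_mapin_ram(base, top) always reaches top.
 */
static unsigned long model_bat_end(unsigned long base, unsigned long top,
				   unsigned long border, bool kfence_or_pagealloc)
{
	assert(base <= top);

	if (kfence_or_pagealloc) {
		/* Nothing below the border left to map: no BATs at all. */
		if (base >= border)
			return base;
		/* Stop at the border: the rest is mapped with pages. */
		if (top >= border)
			top = border;
	}

	return top;
}

/* True when addr falls inside the range covered by BATs in this model. */
static bool model_is_bat_mapped(unsigned long addr, unsigned long base,
				unsigned long top, unsigned long border,
				bool kfence_or_pagealloc)
{
	return addr >= base &&
	       addr < model_bat_end(base, top, border, kfence_or_pagealloc);
}
```

For example, with a made-up border at 0x00a00000 and RAM from 0 to 0x10000000, model_is_bat_mapped(0x00b00000, ...) is false when the flag is set and true when it is clear. That would be consistent with the jump label in an __init function failing to patch only with CONFIG_KFENCE=y, since in that case the address is mapped with pages rather than with a PAGE_KERNEL_X BAT.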


> 
> [0] https://lore.kernel.org/all/20250813062614.51759-1-ajd@linux.ibm.com/
> 

