Date: Mon, 3 Jun 2024 21:12:52 +0000
From: Wei Yang <richard.weiyang@...il.com>
To: Dave Hansen <dave.hansen@...el.com>
Cc: Wei Yang <richard.weiyang@...il.com>, tglx@...utronix.de,
	mingo@...hat.com, bp@...en8.de, dave.hansen@...ux.intel.com,
	x86@...nel.org, linux-kernel@...r.kernel.org,
	"Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
	Ingo Molnar <mingo@...nel.org>, Steve Wahl <steve.wahl@....com>
Subject: Re: [Patch v3] x86/head/64: remove redundant check on
 level2_kernel_pgt's _PAGE_PRESENT bit

On Mon, Jun 03, 2024 at 11:50:06AM -0700, Dave Hansen wrote:
>On 5/23/24 05:35, Wei Yang wrote:
>> --- a/arch/x86/kernel/head64.c
>> +++ b/arch/x86/kernel/head64.c
>> @@ -260,8 +260,7 @@ unsigned long __head __startup_64(unsigned long physaddr,
>>  
>>  	/* fixup pages that are part of the kernel image */
>>  	for (; i <= pmd_index((unsigned long)_end); i++)
>> -		if (pmd[i] & _PAGE_PRESENT)
>> -			pmd[i] += load_delta;
>> +		pmd[i] += load_delta;
>
>So, I think this is correct.  But, man, I wish folks would go through
>the git history and make it clear that they understand _how_ the code
>got the way it is.
>

Dave

Thanks for your comment.

My first version listed the historical changes, but Thomas thought they were
not relevant, so I removed those descriptions.

https://lkml.org/lkml/2024/3/23/350

>I suspect that the original _PAGE_PRESENT check wasn't even necessary if
>cleanup_highmap() really did fix things up.  But this commit:
>
>	2aa85f246c18 ("x86/boot/64: Make level2_kernel_pgt pages invalid
>		       outside kernel area")
>
>tweaked things to actively clear out PMDs that weren't populated in
>Kirill's original loop.  It didn't touch the _PAGE_PRESENT check.  But
>it certainly did imply that the PMD doesn't have any holes in it and
>there's nothing in the middle that needs _PAGE_PRESENT cleared.
>

As I mentioned in my first version, the original code was introduced by

	commit 1ab60e0f72f7 ("[PATCH] x86-64: Relocatable Kernel Support")

The check on _PAGE_PRESENT was needed because, at that time, level2_kernel_pgt
was defined as:

NEXT_PAGE(level2_kernel_pgt)
	/* 40MB kernel mapping. The kernel code cannot be bigger than that.
	   When you change this change KERNEL_TEXT_SIZE in page.h too. */
	/* (2^48-(2*1024*1024*1024)-((2^39)*511)-((2^30)*510)) = 0 */
	PMDS(0x0000000000000000, __PAGE_KERNEL_LARGE_EXEC|_PAGE_GLOBAL,
		KERNEL_TEXT_SIZE/PMD_SIZE)
	/* Module mapping starts here */
	.fill	(PTRS_PER_PMD - (KERNEL_TEXT_SIZE/PMD_SIZE)),8,0

Now it looks like this:

SYM_DATA_START_PAGE_ALIGNED(level2_kernel_pgt)
	/*
	 * Kernel high mapping.
	 *
	 * The kernel code+data+bss must be located below KERNEL_IMAGE_SIZE in
	 * virtual address space, which is 1 GiB if RANDOMIZE_BASE is enabled,
	 * 512 MiB otherwise.
	 *
	 * (NOTE: after that starts the module area, see MODULES_VADDR.)
	 *
	 * This table is eventually used by the kernel during normal runtime.
	 * Care must be taken to clear out undesired bits later, like _PAGE_RW
	 * or _PAGE_GLOBAL in some cases.
	 */
	PMDS(0, __PAGE_KERNEL_LARGE_EXEC, KERNEL_IMAGE_SIZE/PMD_SIZE)
SYM_DATA_END(level2_kernel_pgt)

The difference is that in the original version, not every entry of
level2_kernel_pgt had _PAGE_PRESENT set: the entries past
KERNEL_TEXT_SIZE/PMD_SIZE were zero-filled. I didn't dig into which commit
expanded level2_kernel_pgt to cover the full table, but I think the check has
been redundant since that point.
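
To illustrate the argument, here is a rough user-space model, not kernel code.
The MODEL_* constants, fill_table() and loops_agree() are hypothetical names
made up for this sketch; fill_table() only loosely imitates what PMDS() emits.
It compares the loop with the check against the loop without it, once for a
fully populated table and once for a table with a zero-filled tail.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_PRESENT 0x1ULL /* stands in for _PAGE_PRESENT */
#define MODEL_PTRS_PER_PMD 512    /* stands in for PTRS_PER_PMD */
#define MODEL_PMD_SHIFT    21     /* 2 MiB large pages */

/* Old loop: relocate only entries that have the present bit set. */
static void fixup_checked(uint64_t *pmd, size_t first, size_t last,
			  uint64_t load_delta)
{
	for (size_t i = first; i <= last; i++)
		if (pmd[i] & MODEL_PAGE_PRESENT)
			pmd[i] += load_delta;
}

/* New loop: relocate every entry in the range unconditionally. */
static void fixup_unchecked(uint64_t *pmd, size_t first, size_t last,
			    uint64_t load_delta)
{
	for (size_t i = first; i <= last; i++)
		pmd[i] += load_delta;
}

/* 'populated' present entries followed by zeros, like PMDS() plus .fill. */
static void fill_table(uint64_t *pmd, size_t populated)
{
	for (size_t i = 0; i < MODEL_PTRS_PER_PMD; i++)
		pmd[i] = i < populated ?
			((uint64_t)i << MODEL_PMD_SHIFT) | MODEL_PAGE_PRESENT : 0;
}

/* Returns 1 if both loops leave the table in the same state. */
static int loops_agree(size_t populated, size_t first, size_t last,
		       uint64_t load_delta)
{
	uint64_t a[MODEL_PTRS_PER_PMD], b[MODEL_PTRS_PER_PMD];

	fill_table(a, populated);
	fill_table(b, populated);
	fixup_checked(a, first, last, load_delta);
	fixup_unchecked(b, first, last, load_delta);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

In this model, loops_agree(512, 4, 20, delta) holds for any delta, while a
table with only 20 populated entries and a range of 4..40 makes the two loops
differ for a nonzero delta. It says nothing about whether something modified
the table before the loop runs.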

>> level2_kernel_pgt compiled with _PAGE_PRESENT set. The check is
>> redundant
>
>This isn't super reassuring.  It also depends on nothing having munged
>the page tables up to this point.  The code is also a bit cruel in that
>it manipulates two different sets of PMDs with the same 'pmd' variable.
>
>Also, is this comment still accurate after '2aa85f246c18'?
>
>>          * Fixup the kernel text+data virtual addresses. Note that
>>          * we might write invalid pmds, when the kernel is relocated
>>          * cleanup_highmap() fixes this up along with the mappings
>>          * beyond _end.

It sounds like this comment is no longer necessary. Would you prefer that I
remove it in the next version of this patch?

-- 
Wei Yang
Help you, Help me
