Message-ID: <20260122130839.GFaXIhV9SPFzS-RDnj@fat_crate.local>
Date: Thu, 22 Jan 2026 14:08:39 +0100
From: Borislav Petkov <bp@...en8.de>
To: Ard Biesheuvel <ardb@...nel.org>
Cc: linux-kernel@...r.kernel.org, x86@...nel.org,
	Thomas Gleixner <tglx@...utronix.de>,
	Ingo Molnar <mingo@...hat.com>,
	Dave Hansen <dave.hansen@...ux.intel.com>,
	"H. Peter Anvin" <hpa@...or.com>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Peter Zijlstra <peterz@...radead.org>, Kees Cook <kees@...nel.org>,
	Uros Bizjak <ubizjak@...il.com>, Brian Gerst <brgerst@...il.com>,
	linux-hardening@...r.kernel.org
Subject: Re: [RFC/RFT PATCH 01/19] x86/idt: Move idt_table to __ro_after_init
 section

On Thu, Jan 08, 2026 at 09:25:28AM +0000, Ard Biesheuvel wrote:
> Currently, idt_table is allocated as page-aligned .bss, and remapped
> read-only after init. This breaks a 2 MiB large page into 4k page
> mappings, which defeats some of the effort done at boot to map the
> kernel image using large pages, for improved TLB efficiency.
> 
> Mark this allocation as __ro_after_init instead, so it will be made
> read-only automatically after boot, without breaking up large page
> mappings.
> 
> This also fixes a latent bug on i386, where the size of idt_table is
> less than a page, and so remapping it read-only could potentially affect
> other read-write variables too, if those are not page-aligned as well.
> 
> Signed-off-by: Ard Biesheuvel <ardb@...nel.org>
> ---
>  arch/x86/kernel/idt.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
> index f445bec516a0..d6da25d7964f 100644
> --- a/arch/x86/kernel/idt.c
> +++ b/arch/x86/kernel/idt.c
> @@ -170,7 +170,7 @@ static const __initconst struct idt_data apic_idts[] = {
>  };
>  
>  /* Must be page-aligned because the real IDT is used in the cpu entry area */
> -static gate_desc idt_table[IDT_ENTRIES] __page_aligned_bss;
> +static gate_desc idt_table[IDT_ENTRIES] __aligned(PAGE_SIZE) __ro_after_init;
>  
>  static struct desc_ptr idt_descr __ro_after_init = {
>  	.size		= IDT_TABLE_SIZE - 1,
> @@ -308,9 +308,6 @@ void __init idt_setup_apic_and_irq_gates(void)
>  	idt_map_in_cea();
>  	load_idt(&idt_descr);
>  
> -	/* Make the IDT table read only */
> -	set_memory_ro((unsigned long)&idt_table, 1);
> -
>  	idt_setup_done = true;
>  }

Good idea, except my guest shows me something else:

before:

[    0.186281] IDT table: 0xffffffff89c7f000

0xffffffff89c00000-0xffffffff89c7f000         508K     RW                 GLB NX pte
0xffffffff89c7f000-0xffffffff89c80000           4K     ro                 GLB NX pte
0xffffffff89c80000-0xffffffff89e00000        1536K     RW                 GLB NX pte
0xffffffff89e00000-0xffffffff8be00000          32M     RW         PSE     GLB NX pmd

This is clearly a single, 4K RO pageframe right in the middle of a splintered
2M page.

after:

[    0.180635] IDT table: 0xffffffff822cf000

0xffffffff81e00000-0xffffffff82200000           4M     ro         PSE     GLB NX pmd
0xffffffff82200000-0xffffffff8236f000        1468K     ro                 GLB NX pte
0xffffffff8236f000-0xffffffff82400000         580K     RW                 GLB NX pte
0xffffffff82400000-0xffffffff89800000         116M     RW         PSE     GLB NX pmd

But with your patch applied, it looks like the 2M mapping still gets broken
up, because the remaining piece of that 2M page is RW.
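
To make the arithmetic behind that dump explicit, here is a minimal userspace
sketch (not kernel code) that checks whether a permission boundary falls on
a 2M line and how the enclosing 2M region gets divided. The names PMD_SZ,
pmd_floor(), boundary_splits_pmd(), kib_below() and kib_above() are made up
for this illustration and are not kernel helpers:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 2 MiB: the size of one x86-64 PMD-level large page (illustrative name). */
#define PMD_SZ (UINT64_C(1) << 21)

static_assert((PMD_SZ & (PMD_SZ - 1)) == 0, "PMD_SZ must be a power of two");

/* Start of the 2 MiB-aligned region that contains addr. */
static uint64_t pmd_floor(uint64_t addr)
{
	return addr & ~(PMD_SZ - 1);
}

/*
 * A change of permissions at 'boundary' forces the enclosing 2 MiB
 * region to be mapped with 4K PTEs unless the boundary itself sits
 * on a 2 MiB line.
 */
static bool boundary_splits_pmd(uint64_t boundary)
{
	return (boundary & (PMD_SZ - 1)) != 0;
}

/* KiB of the enclosing 2 MiB region that lies below 'boundary'. */
static uint64_t kib_below(uint64_t boundary)
{
	return (boundary - pmd_floor(boundary)) >> 10;
}

/* KiB of the enclosing 2 MiB region that lies at or above 'boundary'. */
static uint64_t kib_above(uint64_t boundary)
{
	return (PMD_SZ >> 10) - kib_below(boundary);
}
```

Fed the ro/RW boundary 0xffffffff8236f000 from the dump above, this gives
1468K below and 580K above, which are exactly the two pte ranges shown.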

If I do this:

static gate_desc idt_table[IDT_ENTRIES] __aligned(PMD_SIZE) __ro_after_init;

it still doesn't help:

[    0.197808] IDT table: 0xffffffff82800000

0xffffffff81e00000-0xffffffff82800000          10M     ro         PSE     GLB NX pmd
0xffffffff82800000-0xffffffff828a0000         640K     ro                 GLB NX pte
0xffffffff828a0000-0xffffffff82a00000        1408K     RW                 GLB NX pte
0xffffffff82a00000-0xffffffff89e00000         116M     RW         PSE     GLB NX pmd

because that trailing piece of the 2M page is still RW.

And who knows what else I am breaking when doing this:

[    2.368601] ------------[ cut here ]------------
[    2.389816] [CRTC:35:crtc-0] vblank wait timed out
[    2.396676] WARNING: drivers/gpu/drm/drm_atomic_helper.c:1920 at drm_atomic_helper_wait_for_vblanks.part.0+0x1ba/0x1e0, CPU#1: kworker/1:0/57
[    2.406715] Modules linked in:
[    2.408462] CPU: 1 UID: 0 PID: 57 Comm: kworker/1:0 Not tainted 6.19.0-rc6+ #4 PREEMPT(full)
...

I don't know. Sacrificing a whole 2M page for the idt_table, just so that it
doesn't get splintered: I'm not sure it is worth it.
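
For a rough sense of that cost, here is a small userspace sketch, assuming
the x86-64 layout of 256 IDT entries with 16-byte gate descriptors, and
reading "sacrificing a 2M page" as padding the table out to a full large
page. The macro and function names are hypothetical and only serve the
illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed x86-64 values: 256 vectors, 16 bytes per gate descriptor. */
#define IDT_ENTRIES_X86   UINT64_C(256)
#define GATE_DESC_BYTES   UINT64_C(16)
/* 2 MiB large page (illustrative name). */
#define LARGE_PAGE_BYTES  (UINT64_C(1) << 21)

static_assert((LARGE_PAGE_BYTES & (LARGE_PAGE_BYTES - 1)) == 0,
	      "large page size must be a power of two");

/* Size of the whole IDT in bytes under the assumptions above. */
static uint64_t idt_bytes(void)
{
	return IDT_ENTRIES_X86 * GATE_DESC_BYTES;
}

/* KiB of padding needed to round 'object_bytes' up to a multiple of 'unit'. */
static uint64_t padding_kib(uint64_t object_bytes, uint64_t unit)
{
	uint64_t rounded = (object_bytes + unit - 1) / unit * unit;

	return (rounded - object_bytes) >> 10;
}
```

Under those assumptions the table itself is 4K, so giving it a 2M page of
its own would leave 2044K of padding.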

Hmmm.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
