Message-ID: <CAKFNMokFvcMdAfsvRy6JVpWGnr6BtqUOwH7nmyS=1K51HD1vYQ@mail.gmail.com>
Date: Tue, 27 Jan 2026 05:41:56 +0900
From: Ryusuke Konishi <konishi.ryusuke@...il.com>
To: Andrew Cooper <andrew.cooper3@...rix.com>
Cc: Borislav Petkov <bp@...en8.de>, Andrew Morton <akpm@...ux-foundation.org>,
Marco Elver <elver@...gle.com>, LKML <linux-kernel@...r.kernel.org>,
Alexander Potapenko <glider@...gle.com>, Dmitry Vyukov <dvyukov@...gle.com>,
Thomas Gleixner <tglx@...utronix.de>, Ingo Molnar <mingo@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, X86 ML <x86@...nel.org>,
"H. Peter Anvin" <hpa@...or.com>, Jann Horn <jannh@...gle.com>, kasan-dev@...glegroups.com
Subject: Re: [REGRESSION] x86_32 boot hang in 6.19-rc7 caused by b505f1944535
("x86/kfence: avoid writing L1TF-vulnerable PTEs")
On Tue, Jan 27, 2026 at 5:22 AM Andrew Cooper wrote:
>
> On 26/01/2026 7:54 pm, Borislav Petkov wrote:
> > On Tue, Jan 27, 2026 at 04:07:04AM +0900, Ryusuke Konishi wrote:
> >> Hi All,
> >>
> >> I am reporting a boot regression in v6.19-rc7 on an x86_32
> >> environment. The kernel hangs immediately after "Booting the kernel"
> >> and does not produce any early console output.
> >>
> >> A git bisect identified the following commit as the first bad commit:
> >> b505f1944535 ("x86/kfence: avoid writing L1TF-vulnerable PTEs")
> > I can confirm the same - my 32-bit laptop experiences the same. The guest
> > splat looks like this:
> >
> > [ 0.173437] rcu: srcu_init: Setting srcu_struct sizes based on contention.
> > [ 0.175172] ------------[ cut here ]------------
> > [ 0.176066] kernel BUG at arch/x86/mm/physaddr.c:70!
> > [ 0.177037] Oops: invalid opcode: 0000 [#1] SMP
> > [ 0.177914] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.19.0-rc7+ #1 PREEMPT(full)
> > [ 0.179509] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> > [ 0.181363] EIP: __phys_addr+0x78/0x90
> > [ 0.182089] Code: 89 c8 5b 5d c3 2e 8d 74 26 00 0f 0b 8d b6 00 00 00 00 89 45 f8 e8 08 a4 1d 00 84 c0 8b 55 f8 74 b0 0f 0b 8d b4 26 00 00 00 00 <0f> 0b 8d b6 00 00 00 00 0f 0b 66 90 8d 74 26 00 2e 8d b4 26 00 00
> > [ 0.185723] EAX: ce383000 EBX: 00031c7c ECX: 31c7c000 EDX: 034ec000
> > [ 0.186972] ESI: c1ed3eec EDI: f21fd101 EBP: c2055f78 ESP: c2055f70
> > [ 0.188182] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00210086
> > [ 0.189503] CR0: 80050033 CR2: ffd98000 CR3: 029cf000 CR4: 00000090
> > [ 0.191045] Call Trace:
> > [ 0.191518] kfence_init+0x3a/0x94
> > [ 0.192177] start_kernel+0x4ea/0x62c
> > [ 0.192894] i386_start_kernel+0x65/0x68
> > [ 0.193653] startup_32_smp+0x151/0x154
> > [ 0.194397] Modules linked in:
> > [ 0.194987] ---[ end trace 0000000000000000 ]---
> > [ 0.195879] EIP: __phys_addr+0x78/0x90
> > [ 0.196610] Code: 89 c8 5b 5d c3 2e 8d 74 26 00 0f 0b 8d b6 00 00 00 00 89 45 f8 e8 08 a4 1d 00 84 c0 8b 55 f8 74 b0 0f 0b 8d b4 26 00 00 00 00 <0f> 0b 8d b6 00 00 00 00 0f 0b 66 90 8d 74 26 00 2e 8d b4 26 00 00
> > [ 0.200231] EAX: ce383000 EBX: 00031c7c ECX: 31c7c000 EDX: 034ec000
> > [ 0.201452] ESI: c1ed3eec EDI: f21fd101 EBP: c2055f78 ESP: c2055f70
> > [ 0.202693] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00210086
> > [ 0.204011] CR0: 80050033 CR2: ffd98000 CR3: 029cf000 CR4: 00000090
> > [ 0.205235] Kernel panic - not syncing: Attempted to kill the idle task!
> > [ 0.206897] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
>
> Ok, we're hitting a BUG, not a TLB flushing problem. That's:
>
> BUG_ON(slow_virt_to_phys((void *)x) != phys_addr);
>
> so it's obviously to do with the inverted pte. pgtable-2level.h has
>
> /* No inverted PFNs on 2 level page tables */
>
> and that was definitely an oversight on my behalf. Sorry.
>
> Does this help?
>
> diff --git a/arch/x86/include/asm/kfence.h b/arch/x86/include/asm/kfence.h
> index acf9ffa1a171..310e0193d731 100644
> --- a/arch/x86/include/asm/kfence.h
> +++ b/arch/x86/include/asm/kfence.h
> @@ -42,7 +42,7 @@ static inline bool kfence_protect_page(unsigned long addr, bool protect)
>  {
>  	unsigned int level;
>  	pte_t *pte = lookup_address(addr, &level);
> -	pteval_t val;
> +	pteval_t val, new;
>  
>  	if (WARN_ON(!pte || level != PG_LEVEL_4K))
>  		return false;
> @@ -61,7 +61,8 @@ static inline bool kfence_protect_page(unsigned long addr, bool protect)
>  	 * L1TF-vulnerable PTE (not present, without the high address bits
>  	 * set).
>  	 */
> -	set_pte(pte, __pte(~val));
> +	new = val ^ _PAGE_PRESENT;
> +	set_pte(pte, __pte(flip_protnone_guard(val, new, PTE_PFN_MASK)));
>  
>  	/*
>  	 * If the page was protected (non-present) and we're making it
>
>
>
> Only compile tested. flip_protnone_guard() seems to be the right helper;
> it is a nop on 2-level paging.
>
> ~Andrew

Yes, with this patch applied, the kernel boots again.

Leaving the discussion of the fix itself aside, I am just sharing the
test result for now.

Regards,
Ryusuke Konishi