
Message-ID: <b9e89c85-807c-4a8b-aa5c-8c1325e75445@kernel.org>
Date: Wed, 4 Feb 2026 18:12:39 +0100
From: "David Hildenbrand (arm)" <david@...nel.org>
To: 是参差 <shicenci@...il.com>,
 "linux-mm@...ck.org" <linux-mm@...ck.org>
Cc: "linmiaohe@...wei.com" <linmiaohe@...wei.com>,
 "akpm@...ux-foundation.org" <akpm@...ux-foundation.org>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
 Zi Yan <ziy@...dia.com>, Matthew Wilcox <willy@...radead.org>
Subject: Re: WARNING in memory_failure() at include/linux/huge_mm.h:635
 triggered

On 2/4/26 13:49, 是参差 wrote:
> Hi,
> I’m reporting a reproducible WARNING triggered in the hwpoison / memory_failure path when injecting a hardware-poison event via madvise(MADV_HWPOISON).
> 
> The warning is triggered by a syzkaller C reproducer that:
> maps a file-backed region with MAP_FIXED, touches related VMAs, and then
> calls madvise() with MADV_HWPOISON over a large range.
> The kernel reports a VM_WARN_ON_ONCE_FOLIO(1) from memory_failure() and points to include/linux/huge_mm.h:635, suggesting an unexpected folio/page state encountered while handling a poisoned compound/huge folio.
> 
> The target page appears to be a compound head page (order:3) already marked hwpoison. memory_failure() seems to reach a branch that unconditionally warns (VM_WARN_ON_ONCE_FOLIO(1) at include/linux/huge_mm.h:635), which usually indicates an “unreachable”/unexpected folio type or state transition in the huge/compound folio handling logic during hwpoison processing.
> 
> This looks like a kernel-side invariant violation rather than a pure userspace misuse, since the warning is emitted from an unconditional VM_WARN_ON_ONCE_FOLIO(1) site.
> 
> Reproducer:
> C reproducer: https://pastebin.com/raw/UxennX2B
> console output: https://pastebin.com/raw/wrhKRwZY
> kernel config: https://pastebin.com/raw/dP93yBLn
> 
> Kernel:
> 
> HEAD commit: 63804fed149a6750ffd28610c5c1c98cce6bd377
> 
>   git tree: torvalds/linux
> 
> kernel version: 6.19.0-rc7  (QEMU Ubuntu 24.10)

@Zi Yan, this is weird.

We run into the VM_WARN_ON_ONCE_FOLIO(1, folio); in min_order_for_split(),
which is only active with !CONFIG_TRANSPARENT_HUGEPAGE.

But how do we get a large folio in that case? folio_test_large(folio) succeeded.

I think we rule out hugetlb earlier in that function.


Looking into the full console output, this is an order-3 folio (fully mapped).

How do we end up with a large folio here? The only way I am aware of is something allocating
an order-3 compound page (not a folio) and mapping it into the page tables. Yes, that
is nasty and can still happen; I am not sure yet, though, whether that is really what the
reproducer triggers.


[  451.810860] Injecting memory failure for pfn 0xfe28 at process virtual address 0x200000000000
[  451.812878] page: refcount:10 mapcount:1 mapping:0000000000000000 index:0xffff88800fe2e600 pfn:0xfe28
[  451.814740] head: order:3 mapcount:8 entire_mapcount:0 nr_pages_mapped:8 pincount:0
[  451.816263] flags: 0x200044(referenced|head|hwpoison|zone=0)
[  451.817414] raw: 0000000000200044 0000000000000000 dead000000000122 0000000000000000
[  451.818924] raw: ffff88800fe2e600 0000000000000000 0000000a00000000 0000000000000000
[  451.820422] head: 0000000000200044 0000000000000000 dead000000000122 0000000000000000
[  451.821835] head: ffff88800fe2e600 0000000000000000 0000000a00000000 0000000000000000
[  451.823276] head: 0000000000000003 ffffea00003f8a01 0000000800000007 00000000ffffffff
[  451.824701] head: ffff88800fe29e00 0000000000000000 0000000000000000 0000000000000008
[  451.826113] page dumped because: VM_WARN_ON_ONCE_FOLIO(1)
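
As a sanity check on the dump above, the following is a minimal sketch that recomputes a few numbers from it. The page size, the vmemmap base, and the struct page size are assumptions (typical x86-64 defaults without memory-layout randomization), and the helper names are made up for illustration; none of this is kernel code.

```python
# Assumptions (not stated in the report): 4 KiB base pages, the default
# x86-64 vmemmap base, and a 64-byte struct page.
PAGE_SIZE = 4096
VMEMMAP_BASE = 0xFFFFEA0000000000
STRUCT_PAGE_SIZE = 64


def folio_pages(order):
    """Number of base pages covered by a compound page of the given order."""
    return 1 << order


def pfn_to_struct_page(pfn):
    """Hypothetical helper: struct page address for a pfn under the assumptions above."""
    return VMEMMAP_BASE + pfn * STRUCT_PAGE_SIZE


order = 3      # "head: order:3" in the dump
pfn = 0xFE28   # "pfn:0xfe28" in the dump

nr_pages = folio_pages(order)
print(nr_pages, nr_pages * PAGE_SIZE)   # pages and bytes covered by the folio
print(hex(pfn_to_struct_page(pfn)))     # compare with RBP/R13/R14 in the register dump
print(pfn % nr_pages == 0)              # is the pfn aligned to the folio size?
```

Under these assumptions the folio covers 8 pages, which matches nr_pages_mapped:8 (fully mapped), and the computed struct page address matches the ffffea00003f8a00 value in RBP/R13/R14 of the quoted register dump, so the poisoned pfn is the head page itself.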

> 
> 
> head: 0000000000000003 ffffea00003f8a01 0000000800000007 00000000ffffffff
> head: ffff88800fe29e00 0000000000000000 0000000000000000 0000000000000008
> page dumped because: VM_WARN_ON_ONCE_FOLIO(1)
> ------------[ cut here ]------------
> WARNING: include/linux/huge_mm.h:635 at min_order_for_split include/linux/huge_mm.h:635 [inline], CPU#0: syz.3.7564/25556
> WARNING: include/linux/huge_mm.h:635 at min_order_for_split include/linux/huge_mm.h:633 [inline], CPU#0: syz.3.7564/25556
> WARNING: include/linux/huge_mm.h:635 at memory_failure+0x22e8/0x2950 mm/memory-failure.c:2434, CPU#0: syz.3.7564/25556
> CPU: 0 UID: 0 PID: 25556 Comm: syz.3.7564 Not tainted 6.19.0-rc7 #1 VOLUNTARY
> Hardware name: QEMU Ubuntu 24.10 PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> RIP: 0010:min_order_for_split include/linux/huge_mm.h:635 [inline]
> RIP: 0010:min_order_for_split include/linux/huge_mm.h:633 [inline]
> RIP: 0010:memory_failure+0x22e8/0x2950 mm/memory-failure.c:2434
> Code: ff 84 db 0f 85 f1 f5 ff ff e9 36 fe ff ff e8 3f 55 ce ff 48 c7 c6 e0 f7 ee 85 4c 89 f7 e8 90 b6 ed ff c6 05 13 35 04 06 01 90 <0f> 0b 90 e9 aa ee ff ff e8 eb 15 fe ff e9 65 e1 ff ff e8 a1 15 fe
> RSP: 0018:ffff888000b7fa00 EFLAGS: 00010216
> RAX: 00000000000066f1 RBX: 0000000000000000 RCX: ffffc90004fbb000
> RDX: 0000000000080000 RSI: ffff888029a3c500 RDI: 0000000000000002
> RBP: ffffea00003f8a00 R08: fffffbfff0ddc501 R09: ffffffff819105e0
> R10: 0000000000000001 R11: ffff888000b7f7e7 R12: 000000000000fe28
> R13: ffffea00003f8a00 R14: ffffea00003f8a00 R15: ffffea00003f8a08
> FS:  00007f3d36d7f6c0(0000) GS:0000000000000000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f3d384a3828 CR3: 00000000589a9000 CR4: 0000000000350ef0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
> Call Trace:
>   <TASK>
>   madvise_inject_error mm/madvise.c:1489 [inline]
>   madvise_do_behavior.part.0+0x137/0x3c0 mm/madvise.c:1927
>   madvise_do_behavior+0x41d/0x5d0 mm/madvise.c:979
>   do_madvise+0x134/0x1b0 mm/madvise.c:2030
>   __do_sys_madvise mm/madvise.c:2039 [inline]
>   __se_sys_madvise mm/madvise.c:2037 [inline]
>   __x64_sys_madvise+0xa8/0x110 mm/madvise.c:2037
>   do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>   do_syscall_64+0xa9/0x320 arch/x86/entry/syscall_64.c:94
>   entry_SYSCALL_64_after_hwframe+0x4b/0x53
> RIP: 0033:0x7f3d3831ebe9
> Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
> RSP: 002b:00007f3d36d7f038 EFLAGS: 00000246 ORIG_RAX: 000000000000001c
> RAX: ffffffffffffffda RBX: 00007f3d38555fa0 RCX: 00007f3d3831ebe9
> RDX: 0000000000000064 RSI: 0000000000600000 RDI: 0000200000000000
> RBP: 00007f3d383a1e19 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007f3d38556038 R14: 00007f3d38555fa0 R15: 00007ffe8f52e9d8
>   </TASK>
> ---[ end trace 0000000000000000 ]---
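
For reference, the userspace registers in the trace can be decoded into the madvise() call that was issued. This is a minimal sketch: the register values are copied from the trace, the two constants are the x86-64 syscall number for madvise and the UAPI value of MADV_HWPOISON, and decode_madvise() is a hypothetical helper written only for this illustration.

```python
SYS_MADVISE_X86_64 = 28   # x86-64 syscall number of madvise (0x1c)
MADV_HWPOISON = 100       # advice value from the Linux UAPI headers (0x64)

# Userspace register values copied from the trace above.
regs = {
    "orig_rax": 0x1C,          # syscall number
    "rdi": 0x200000000000,     # first argument: addr
    "rsi": 0x600000,           # second argument: length
    "rdx": 0x64,               # third argument: advice
}


def decode_madvise(regs):
    """Hypothetical helper: map x86-64 syscall registers to madvise() arguments."""
    if regs["orig_rax"] != SYS_MADVISE_X86_64:
        raise ValueError("not a madvise() call")
    return {"addr": regs["rdi"], "length": regs["rsi"], "advice": regs["rdx"]}


call = decode_madvise(regs)
print(hex(call["addr"]))                 # start of the range
print(call["length"] // (1 << 20))       # length in MiB
print(call["advice"] == MADV_HWPOISON)   # was this MADV_HWPOISON?
```

So the call is madvise(0x200000000000, 0x600000, MADV_HWPOISON): a 6 MiB range starting at the same virtual address that the "Injecting memory failure" line reports.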


-- 
Cheers,

David