Message-ID: <202541612720-Z_-deOZTOztMXHBh-arkamar@atlas.cz>
Date: Wed, 16 Apr 2025 14:07:20 +0200
From: Petr Vaněk <arkamar@...as.cz>
To: linux-kernel@...r.kernel.org
Cc: Kevin Brodsky <kevin.brodsky@....com>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
x86@...nel.org, xen-devel@...ts.xenproject.org,
linux-arch@...r.kernel.org
Subject: Regression from a9b3c355c2e6 ("asm-generic: pgalloc: provide generic
__pgd_{alloc,free}") with CONFIG_DEBUG_VM_PGFLAGS=y and Xen
Hi all,

I have discovered a regression introduced in commit a9b3c355c2e6
("asm-generic: pgalloc: provide generic __pgd_{alloc,free}") [1,2] in
kernel version 6.14. The problem occurs when the x86 kernel is
configured with CONFIG_DEBUG_VM_PGFLAGS=y and is run as a PV Dom0 in Xen
4.19.1. During startup, the kernel panics with the error log below.

The commit changed the PGD allocation path. In the new implementation,
_pgd_alloc allocates memory with __pgd_alloc, which indirectly calls

    alloc_pages_noprof(gfp | __GFP_COMP, order);

This is in contrast to the old behavior, where __get_free_pages was
used, which indirectly called

    alloc_pages_noprof(gfp_mask & ~__GFP_HIGHMEM, order);

The key difference is that the new allocator can return a compound page.
When xen_pin_page is later called on such a page, it calls the
TestSetPagePinned function, which internally uses the PF_NO_COMPOUND
macro. This macro enforces VM_BUG_ON_PGFLAGS if PageCompound is true,
triggering the panic when CONFIG_DEBUG_VM_PGFLAGS is enabled.

I am reporting this issue without a patch as I am not sure which part of
the code should be adapted to resolve the regression.

Let me know if I forgot to mention something important.

Cheers,
Petr

[1] a9b3c355c2e6 ("asm-generic: pgalloc: provide generic __pgd_{alloc,free}")
[2] https://lkml.kernel.org/r/20250103184415.2744423-6-kevin.brodsky@arm.com

[ 0.396244] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
[ 0.398164] software IO TLB: area num 2.
[ 0.449383] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[ 0.452043] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888003450000 pfn:0x344e
[ 0.454715] head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[ 0.456908] flags: 0x10000000000040(head|node=0|zone=1)
[ 0.458390] raw: 0010000000000040 ffffffff82850ed0 ffffffff82850ed0 0000000000000000
[ 0.460621] raw: ffff888003450000 ffff888003454000 00000001ffffffff 0000000000000000
[ 0.462807] head: 0010000000000040 ffffffff82850ed0 ffffffff82850ed0 0000000000000000
[ 0.464928] head: ffff888003450000 ffff888003454000 00000001ffffffff 0000000000000000
[ 0.467106] head: 0010000000000001 ffffea00000d1381 ffffffffffffffff 0000000000000000
[ 0.469263] head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
[ 0.471430] page dumped because: VM_BUG_ON_PAGE(1 && PageCompound(page))
[ 0.473338] ------------[ cut here ]------------
[ 0.474764] kernel BUG at include/linux/page-flags.h:527!
[ 0.476473] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 0.478294] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.13.0-rc6-00187-ga9b3c355c2e6 #41
[ 0.480971] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-20240910_120124-localhost 04/01/2014
[ 0.484218] RIP: e030:xen_pin_page+0x5e/0x180
[ 0.485580] Code: f0 48 0f ba 2e 0a 73 24 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 a1 75 e1 00 48 c7 c6 68 70 58 82 48 89 df e8 c2 da 30 00 <0f> 0b 49 bd 00 00 00 00 00 16 00 00 31 ff 41 89 d4 48 b8 00 00 00
[ 0.491414] RSP: e02b:ffffffff82803d18 EFLAGS: 00010046
[ 0.493064] RAX: 000000000000003c RBX: ffffea00000d1380 RCX: 0000000000000000
[ 0.495294] RDX: 0000000000000000 RSI: ffffffff82803b68 RDI: 00000000ffffffff
[ 0.497513] RBP: ffffffff82803d48 R08: 00000000ffffdfff R09: ffffffff82925148
[ 0.499723] R10: ffffffff828751a0 R11: ffffffff82803a88 R12: ffff88800344e000
[ 0.501940] R13: ffff88808344e000 R14: ffff88800344e000 R15: 0000000000000100
[ 0.504180] FS: 0000000000000000(0000) GS:ffff88803aa00000(0000) knlGS:0000000000000000
[ 0.506699] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.508485] CR2: ffffc90000a00000 CR3: 0000000002842000 CR4: 0000000000000660
[ 0.510648] Call Trace:
[ 0.511420] <TASK>
[ 0.512071] ? show_regs.part.0+0x1d/0x30
[ 0.513315] ? __die+0x52/0x90
[ 0.514282] ? die+0x2a/0x50
[ 0.515189] ? do_trap+0x10e/0x120
[ 0.516255] ? do_error_trap+0x6e/0xa0
[ 0.517453] ? xen_pin_page+0x5e/0x180
[ 0.518611] ? exc_invalid_op+0x52/0x70
[ 0.519805] ? xen_pin_page+0x5e/0x180
[ 0.520936] ? asm_exc_invalid_op+0x1b/0x20
[ 0.522245] ? xen_pin_page+0x5e/0x180
[ 0.523424] ? xen_pin_page+0x5e/0x180
[ 0.524596] __xen_pgd_walk+0x2a0/0x2d0
[ 0.525816] ? __pfx_xen_pin_page+0x10/0x10
[ 0.527116] __xen_pgd_pin+0x4d/0x180
[ 0.528288] xen_enter_mmap+0x25/0x40
[ 0.529431] poking_init+0x53/0x130
[ 0.530558] start_kernel+0x4a7/0x6f0
[ 0.531726] x86_64_start_reservations+0x29/0x30
[ 0.533197] xen_start_kernel+0x6cf/0x6e0
[ 0.534466] startup_xen+0x1f/0x20
[ 0.535497] </TASK>
[ 0.536193] ---[ end trace 0000000000000000 ]---
[ 0.537676] RIP: e030:xen_pin_page+0x5e/0x180
[ 0.539049] Code: f0 48 0f ba 2e 0a 73 24 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 a1 75 e1 00 48 c7 c6 68 70 58 82 48 89 df e8 c2 da 30 00 <0f> 0b 49 bd 00 00 00 00 00 16 00 00 31 ff 41 89 d4 48 b8 00 00 00
[ 0.544954] RSP: e02b:ffffffff82803d18 EFLAGS: 00010046
[ 0.546553] RAX: 000000000000003c RBX: ffffea00000d1380 RCX: 0000000000000000
[ 0.548821] RDX: 0000000000000000 RSI: ffffffff82803b68 RDI: 00000000ffffffff
[ 0.551022] RBP: ffffffff82803d48 R08: 00000000ffffdfff R09: ffffffff82925148
[ 0.553233] R10: ffffffff828751a0 R11: ffffffff82803a88 R12: ffff88800344e000
[ 0.555420] R13: ffff88808344e000 R14: ffff88800344e000 R15: 0000000000000100
[ 0.557365] FS: 0000000000000000(0000) GS:ffff88803aa00000(0000) knlGS:0000000000000000
[ 0.559528] CS: e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 0.561050] CR2: ffffc90000a00000 CR3: 0000000002842000 CR4: 0000000000000660
[ 0.562970] Kernel panic - not syncing: Attempted to kill the idle task!
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
QEMU: Terminated