Message-ID: <202541612720-Z_-deOZTOztMXHBh-arkamar@atlas.cz>
Date: Wed, 16 Apr 2025 14:07:20 +0200
From: Petr Vaněk <arkamar@...as.cz>
To: linux-kernel@...r.kernel.org
Cc: Kevin Brodsky <kevin.brodsky@....com>,
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	x86@...nel.org, xen-devel@...ts.xenproject.org,
	linux-arch@...r.kernel.org
Subject: Regression from a9b3c355c2e6 ("asm-generic: pgalloc: provide generic
 __pgd_{alloc,free}") with CONFIG_DEBUG_VM_PGFLAGS=y and Xen

Hi all,

I have discovered a regression introduced in commit a9b3c355c2e6
("asm-generic: pgalloc: provide generic __pgd_{alloc,free}") [1,2] in
kernel version 6.14. The problem occurs when the x86 kernel is
configured with CONFIG_DEBUG_VM_PGFLAGS=y and is run as a PV Dom0 in Xen
4.19.1. During startup, the kernel panics with the error log below.

The commit changed the PGD allocation path.  In the new implementation,
_pgd_alloc allocates memory with __pgd_alloc, which indirectly calls

  alloc_pages_noprof(gfp | __GFP_COMP, order);

This is in contrast to the old behavior, where __get_free_pages was
used, which indirectly called

  alloc_pages_noprof(gfp_mask & ~__GFP_HIGHMEM, order);

The key difference is that the new allocator can return a compound page.
When xen_pin_page is later called on such a page, it calls the
TestSetPagePinned function, which internally uses the PF_NO_COMPOUND
macro. This macro enforces VM_BUG_ON_PGFLAGS if PageCompound is true,
triggering the panic when CONFIG_DEBUG_VM_PGFLAGS is enabled.
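
To make the mechanism easier to follow, here is a minimal user-space
sketch of the interaction described above. It is not kernel code: the
model_* names, the MODEL_GFP_* values and struct model_page are
hypothetical stand-ins, and the rule that a compound page only results
from an order > 0 allocation with the compound flag set is my
understanding of the page allocator, stated here as an assumption.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel GFP bits; values are illustrative. */
#define MODEL_GFP_HIGHMEM 0x02u
#define MODEL_GFP_COMP    0x40u

/* Hypothetical, heavily reduced model of struct page. */
struct model_page {
	bool compound;	/* models PageCompound(page) */
	bool pinned;	/* models the PG_pinned flag */
};

/* Old path (__get_free_pages): strips highmem, never requests a compound page. */
static unsigned int old_path_gfp(unsigned int gfp)
{
	return gfp & ~MODEL_GFP_HIGHMEM;
}

/* New path (__pgd_alloc): the compound flag is ORed in. */
static unsigned int new_path_gfp(unsigned int gfp)
{
	return gfp | MODEL_GFP_COMP;
}

/* Assumption: only an order > 0 allocation with the compound flag is compound. */
static struct model_page model_alloc(unsigned int gfp, unsigned int order)
{
	struct model_page page = {
		.compound = (gfp & MODEL_GFP_COMP) && order > 0,
		.pinned = false,
	};
	return page;
}

/*
 * Models TestSetPagePinned under the PF_NO_COMPOUND policy.
 * Returns -1 where the kernel would hit VM_BUG_ON_PGFLAGS, otherwise the
 * previous pinned state (0 or 1). With the debug option off, the check is
 * skipped and the flag is simply set.
 */
static int model_test_set_pinned(struct model_page *page, bool debug_vm_pgflags)
{
	if (debug_vm_pgflags && page->compound)
		return -1;

	int was_pinned = page->pinned;
	page->pinned = true;
	return was_pinned;
}
```

In this model, an order-1 page from new_path_gfp() makes
model_test_set_pinned() report the BUG case when debug_vm_pgflags is
true, while the same request through old_path_gfp() does not. This
matches the "head: order:1" and "VM_BUG_ON_PAGE(1 && PageCompound(page))"
lines in the log below.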

I am reporting this issue without a patch as I am not sure which part of
the code should be adapted to resolve the regression.

Let me know if I forgot to mention something important.

Cheers,
Petr

[1] a9b3c355c2e6 ("asm-generic: pgalloc: provide generic __pgd_{alloc,free}")
[2] https://lkml.kernel.org/r/20250103184415.2744423-6-kevin.brodsky@arm.com

[    0.396244] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
[    0.398164] software IO TLB: area num 2.
[    0.449383] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
[    0.452043] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888003450000 pfn:0x344e
[    0.454715] head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
[    0.456908] flags: 0x10000000000040(head|node=0|zone=1)
[    0.458390] raw: 0010000000000040 ffffffff82850ed0 ffffffff82850ed0 0000000000000000
[    0.460621] raw: ffff888003450000 ffff888003454000 00000001ffffffff 0000000000000000
[    0.462807] head: 0010000000000040 ffffffff82850ed0 ffffffff82850ed0 0000000000000000
[    0.464928] head: ffff888003450000 ffff888003454000 00000001ffffffff 0000000000000000
[    0.467106] head: 0010000000000001 ffffea00000d1381 ffffffffffffffff 0000000000000000
[    0.469263] head: 0000000000000002 0000000000000000 00000000ffffffff 0000000000000000
[    0.471430] page dumped because: VM_BUG_ON_PAGE(1 && PageCompound(page))
[    0.473338] ------------[ cut here ]------------
[    0.474764] kernel BUG at include/linux/page-flags.h:527!
[    0.476473] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[    0.478294] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.13.0-rc6-00187-ga9b3c355c2e6 #41
[    0.480971] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-20240910_120124-localhost 04/01/2014
[    0.484218] RIP: e030:xen_pin_page+0x5e/0x180
[    0.485580] Code: f0 48 0f ba 2e 0a 73 24 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 a1 75 e1 00 48 c7 c6 68 70 58 82 48 89 df e8 c2 da 30 00 <0f> 0b 49 bd 00 00 00 00 00 16 00 00 31 ff 41 89 d4 48 b8 00 00 00
[    0.491414] RSP: e02b:ffffffff82803d18 EFLAGS: 00010046
[    0.493064] RAX: 000000000000003c RBX: ffffea00000d1380 RCX: 0000000000000000
[    0.495294] RDX: 0000000000000000 RSI: ffffffff82803b68 RDI: 00000000ffffffff
[    0.497513] RBP: ffffffff82803d48 R08: 00000000ffffdfff R09: ffffffff82925148
[    0.499723] R10: ffffffff828751a0 R11: ffffffff82803a88 R12: ffff88800344e000
[    0.501940] R13: ffff88808344e000 R14: ffff88800344e000 R15: 0000000000000100
[    0.504180] FS:  0000000000000000(0000) GS:ffff88803aa00000(0000) knlGS:0000000000000000
[    0.506699] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.508485] CR2: ffffc90000a00000 CR3: 0000000002842000 CR4: 0000000000000660
[    0.510648] Call Trace:
[    0.511420]  <TASK>
[    0.512071]  ? show_regs.part.0+0x1d/0x30
[    0.513315]  ? __die+0x52/0x90
[    0.514282]  ? die+0x2a/0x50
[    0.515189]  ? do_trap+0x10e/0x120
[    0.516255]  ? do_error_trap+0x6e/0xa0
[    0.517453]  ? xen_pin_page+0x5e/0x180
[    0.518611]  ? exc_invalid_op+0x52/0x70
[    0.519805]  ? xen_pin_page+0x5e/0x180
[    0.520936]  ? asm_exc_invalid_op+0x1b/0x20
[    0.522245]  ? xen_pin_page+0x5e/0x180
[    0.523424]  ? xen_pin_page+0x5e/0x180
[    0.524596]  __xen_pgd_walk+0x2a0/0x2d0
[    0.525816]  ? __pfx_xen_pin_page+0x10/0x10
[    0.527116]  __xen_pgd_pin+0x4d/0x180
[    0.528288]  xen_enter_mmap+0x25/0x40
[    0.529431]  poking_init+0x53/0x130
[    0.530558]  start_kernel+0x4a7/0x6f0
[    0.531726]  x86_64_start_reservations+0x29/0x30
[    0.533197]  xen_start_kernel+0x6cf/0x6e0
[    0.534466]  startup_xen+0x1f/0x20
[    0.535497]  </TASK>
[    0.536193] ---[ end trace 0000000000000000 ]---
[    0.537676] RIP: e030:xen_pin_page+0x5e/0x180
[    0.539049] Code: f0 48 0f ba 2e 0a 73 24 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d e9 a1 75 e1 00 48 c7 c6 68 70 58 82 48 89 df e8 c2 da 30 00 <0f> 0b 49 bd 00 00 00 00 00 16 00 00 31 ff 41 89 d4 48 b8 00 00 00
[    0.544954] RSP: e02b:ffffffff82803d18 EFLAGS: 00010046
[    0.546553] RAX: 000000000000003c RBX: ffffea00000d1380 RCX: 0000000000000000
[    0.548821] RDX: 0000000000000000 RSI: ffffffff82803b68 RDI: 00000000ffffffff
[    0.551022] RBP: ffffffff82803d48 R08: 00000000ffffdfff R09: ffffffff82925148
[    0.553233] R10: ffffffff828751a0 R11: ffffffff82803a88 R12: ffff88800344e000
[    0.555420] R13: ffff88808344e000 R14: ffff88800344e000 R15: 0000000000000100
[    0.557365] FS:  0000000000000000(0000) GS:ffff88803aa00000(0000) knlGS:0000000000000000
[    0.559528] CS:  e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.561050] CR2: ffffc90000a00000 CR3: 0000000002842000 CR4: 0000000000000660
[    0.562970] Kernel panic - not syncing: Attempted to kill the idle task!
(XEN) Hardware Dom0 crashed: rebooting machine in 5 seconds.
QEMU: Terminated

