lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <ZYZoLp9NVCZTchLF@MiWiFi-R3L-srv>
Date: Sat, 23 Dec 2023 12:55:10 +0800
From: Baoquan He <bhe@...hat.com>
To: airlied@...hat.com
Cc: dri-devel@...ts.freedesktop.org, dakr@...hat.com,
	kexec@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: OOM in kdump kernel caused by commit b5bad8c16b9b

Hi David,

Recently, Redhat CKI reported a kdump kernel bootup failure caused by
OOM. After bisect, it only happened after commit b5bad8c16b9b
("nouveau/gsp: move to 535.113.01"). Reverting the commit can avoid the
OOM, kdump kernel can boot up successfully. 

>From debugging, we can see that about extra 100M memory will be costed
when commit b5bad8c16b9b applied on the hpe machine with 2G memory.
Do you know if there's room to improve that to reduce the extra memory
cost?

I have opened a fedora bug to track this OOM, and copy the bug
description here for reference in case someone may not access the bug
easily.

Bug 2253165 - kdump kernel failed to boot up because a big memory chunk is reserved
https://bugzilla.redhat.com/show_bug.cgi?id=2253165
------------------------------------------------------------
CKI reported a failure on beaker machine hp-z210-01.ml3.eng.bos.redhat.com, please see below CKI reports:
https://datawarehouse.cki-project.org/kcidb/tests/10508330

In that failure, crashkernel=256M and succeeded to reserve in 1st kernel. However, in
kdump kernel it failed to boot up when it started to run init process. I set crashkernel=320M to make kdump kernel boot up successfully and vmcore dumping succeeded too.

After adding "rd.memdebug=4 memblock=debug" to kdump kernel cmdline, it appears to have a big chunk of reserved memory in memblock of about 122M. I don't know where it comes from. I doubt firmware stole that chunk from system memory to cause the kdump kernel having oom.


[Tue Dec  5 22:32:38 2023] DMI: Hewlett-Packard HP Z210 Workstation/1587h, BIOS J51 v01.20 09/16/2011
[Tue Dec  5 22:32:38 2023] tsc: Fast TSC calibration using PIT
[Tue Dec  5 22:32:38 2023] tsc: Detected 3092.940 MHz processor
[Tue Dec  5 22:32:38 2023] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[Tue Dec  5 22:32:38 2023] e820: remove [mem 0x000a0000-0x000fffff] usable
[Tue Dec  5 22:32:38 2023] last_pfn = 0x61000 max_arch_pfn = 0x400000000
[Tue Dec  5 22:32:38 2023] MTRR map: 4 entries (3 fixed + 1 variable; max 23), built from 10 variable MTRRs
[Tue Dec  5 22:32:38 2023] x86/PAT: Configuration [0-7]: WB  WC  UC- UC  WB  WP  UC- WT
[Tue Dec  5 22:32:38 2023] x2apic: enabled by BIOS, switching to x2apic ops
[Tue Dec  5 22:32:38 2023] found SMP MP-table at [mem 0x000f4b80-0x000f4b8f]
[Tue Dec  5 22:32:38 2023] memblock_reserve: [0x00000000000f4b80-0x00000000000f4b8f] smp_scan_config+0xca/0x150
[Tue Dec  5 22:32:38 2023] memblock_reserve: [0x00000000000f4b90-0x00000000000f4e4b] smp_scan_config+0x13a/0x150
[Tue Dec  5 22:32:38 2023] memblock_reserve: [0x000000005f600000-0x000000005f610fff] setup_arch+0xd84/0xf10
[Tue Dec  5 22:32:38 2023] memblock_add: [0x0000000000001000-0x000000000008f7ff] e820__memblock_setup+0x73/0xb0
[Tue Dec  5 22:32:38 2023] memblock_add: [0x000000004d0e00b0-0x0000000060ff81cf] e820__memblock_setup+0x73/0xb0
[Tue Dec  5 22:32:38 2023] memblock_add: [0x0000000060ff81d0-0x0000000060ff81ff] e820__memblock_setup+0x73/0xb0
[Tue Dec  5 22:32:38 2023] memblock_add: [0x0000000060ff8200-0x0000000060ffffff] e820__memblock_setup+0x73/0xb0
[Tue Dec  5 22:32:38 2023] MEMBLOCK configuration:
[Tue Dec  5 22:32:38 2023]  memory size = 0x0000000013fae750 reserved size = 0x0000000007b7cc50
[Tue Dec  5 22:32:38 2023]  memory.cnt  = 0x2
[Tue Dec  5 22:32:38 2023]  memory[0x0] [0x0000000000001000-0x000000000008efff], 0x000000000008e000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  memory[0x1] [0x000000004d0e1000-0x0000000060ffffff], 0x0000000013f1f000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved.cnt  = 0x5
[Tue Dec  5 22:32:38 2023]  reserved[0x0]       [0x0000000000000000-0x000000000000ffff], 0x0000000000010000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x1]       [0x000000000008f400-0x00000000000fffff], 0x0000000000070c00 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x2]       [0x0000000057b16000-0x000000005f610fff], 0x0000000007afb000 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x3]       [0x0000000060ff81d0-0x0000000060ff821f], 0x0000000000000050 bytes flags: 0x0
[Tue Dec  5 22:32:38 2023]  reserved[0x4]       [0x0000000060ffe000-0x0000000060ffefff], 0x0000000000001000 bytes flags: 0x0
----------------------------------------------------

Thanks
Baoquan


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ