Message-ID: <FA5F6719-8824-4B04-803E-82990E65E627@akamai.com>
Date: Wed, 15 May 2024 17:32:27 +0000
From: "Chaney, Ben" <bchaney@...mai.com>
To: "ardb@...nel.org" <ardb@...nel.org>,
        "gregkh@...uxfoundation.org" <gregkh@...uxfoundation.org>,
        "linux-efi@...r.kernel.org" <linux-efi@...r.kernel.org>,
        "stable@...r.kernel.org" <stable@...r.kernel.org>
CC: "bp@...en8.de" <bp@...en8.de>,
        "dave.hansen@...ux.intel.com" <dave.hansen@...ux.intel.com>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "mingo@...hat.com" <mingo@...hat.com>,
        "tglx@...utronix.de" <tglx@...utronix.de>,
        "Tottenham, Max" <mtottenh@...mai.com>,
        "Hunt, Joshua" <johunt@...mai.com>,
        "Galaxy, Michael" <mgalaxy@...mai.com>
Subject: Regression in 6.1.81: Missing memory in pmem device

Hello,
	I encountered an issue when upgrading from 6.1.77 to 6.1.89. The upgrade broke emulated persistent memory: a significant amount of memory is missing from a pmem device:

fdisk -l /dev/pmem*
Disk /dev/pmem0: 355.9 GiB, 382117871616 bytes, 746323968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk /dev/pmem1: 25.38 GiB, 27246198784 bytes, 53215232 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

	The memmap parameter that created these pmem devices is "memmap=364416M!28672M,367488M!419840M", which should give /dev/pmem1 about 358.9 GiB (367488 MiB) rather than the 25.38 GiB shown above. Both the amount of missing memory and the device it is missing from vary from reboot to reboot. Some memory is missing on almost every boot, but not on all of them. Notably, the memory that is missing from these devices is not reclaimed by the system for general use. The system in question has 768GB of memory split evenly across two NUMA nodes.
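
For reference, the expected sizes can be checked against the fdisk output with a few lines of Python. This is only a sketch: the helper names (parse_size, parse_memmap) are made up for illustration, it handles only decimal sizes with K/M/G suffixes in the nn!ss (size!start) form used above, and it assumes pmem0 and pmem1 correspond to the first and second memmap entries in that order.

```python
import re

# Binary unit multipliers for the K/M/G suffixes.
UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

def parse_size(text):
    """Convert a decimal size such as '364416M' to bytes (hypothetical helper)."""
    m = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if m is None:
        raise ValueError(f"unsupported size: {text!r}")
    return int(m.group(1)) * UNITS[m.group(2)]

def parse_memmap(value):
    """Split 'size!start,size!start' into a list of (start, size) in bytes."""
    regions = []
    for entry in value.split(","):
        size, start = entry.split("!")
        regions.append((parse_size(start), parse_size(size)))
    return regions

regions = parse_memmap("364416M!28672M,367488M!419840M")

# Byte counts reported by fdisk for /dev/pmem0 and /dev/pmem1 above.
# Assumption: pmem0 is the first memmap entry and pmem1 the second.
observed = [382117871616, 27246198784]

missing = []
for index, ((start, size), seen) in enumerate(zip(regions, observed)):
    missing.append(size - seen)
    print(f"pmem{index}: expected {size} bytes, observed {seen}, "
          f"missing {(size - seen) / (1 << 30):.2f} GiB")
```

Under those assumptions pmem0 matches its memmap entry exactly, while pmem1 is short by 341504 MiB (333.5 GiB) on this particular boot.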

	When the error occurs, the following error messages also appear in dmesg:

[    5.318317] nd_pmem namespace1.0: [mem 0x5c2042c000-0x5ff7ffffff flags 0x200] misaligned, unable to map
[    5.335073] nd_pmem: probe of namespace1.0 failed with error -95
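
	The range in that message can be checked for alignment as follows (error -95 is EOPNOTSUPP on Linux). This is a rough sketch only: the 2 MiB alignment is my assumption about what nd_pmem requires on this system, not something I have confirmed, and the function name is made up for illustration.

```python
# Assumed alignment requirement (2 MiB); this value is an assumption, not
# taken from the dmesg output.
ASSUMED_ALIGN = 2 << 20

def misalignment(start, end_inclusive, align=ASSUMED_ALIGN):
    """Return how far the start and the exclusive end are past an aligned boundary."""
    return start % align, (end_inclusive + 1) % align

# Range reported in the "misaligned, unable to map" message.
range_start, range_end = 0x5c2042c000, 0x5ff7ffffff

start_off, end_off = misalignment(range_start, range_end)
end_mib = (range_end + 1) >> 20  # exclusive end of the range, in MiB

print(f"start is {start_off:#x} bytes past a 2 MiB boundary")
print(f"end is {end_off:#x} bytes past a 2 MiB boundary")
print(f"range ends at {end_mib} MiB")
```

	With these numbers the start of the range is 0x2c000 bytes (176 KiB) past a 2 MiB boundary, while the end is aligned. The range ends at 393088 MiB, which equals 28672M + 364416M, the end of the first memmap entry.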

	Bisection implicates 2dfaeac3f38e4e550d215204eedd97a061fdc118 as the patch that first caused the issue. I believe the cause is that the EFI stub randomizes the location of the decompressed kernel without accounting for the memory map, so it clobbers some of the memory that has been reserved for pmem.

Thank you,
	Ben Chaney



