Message-ID: <20240830095636.572947-1-pspacek@isc.org>
Date: Fri, 30 Aug 2024 11:56:36 +0200
From: Petr Spacek <pspacek@....org>
To: linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Cc: Petr Spacek <pspacek@....org>
Subject: [PATCH RFC] mm: mmap: Change DEFAULT_MAX_MAP_COUNT to INT_MAX

From: Petr Spacek <pspacek@....org>

Raise default sysctl vm.max_map_count to INT_MAX, which effectively
disables the limit for all sane purposes. The sysctl is kept around in
case there is some use-case for this limit.

The old default value of vm.max_map_count=65530 provided compatibility
with the ELF format predating the year 2000 and with binutils predating
2010. At the same time, the old default caused issues with applications
deployed in 2024.

State since 2012: Linux 3.2.0 correctly generates a coredump from a
process with 100 000 mmapped files. GDB 7.4.1 and binutils 2.22 work
fine with this coredump and can actually read data from the mmapped
addresses.

Signed-off-by: Petr Spacek <pspacek@....org>
---

Downstream distributions started to override the default a while ago.
Individual distributions are summarized at the end of this message:
https://lists.archlinux.org/archives/list/arch-dev-public@lists.archlinux.org/thread/5GU7ZUFI25T2IRXIQ62YYERQKIPE3U6E/

Please note it's not only games running in emulators that hit this
default limit. Larger instances of server applications also suffer from
it. A couple of examples are here:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2057792/comments/24

SAP documentation (behind a paywall) also mentions this limit:
https://service.sap.com/sap/support/notes/2002167

And finally, it is also an issue for the BIND DNS server compiled against
jemalloc, which is what brought me here.
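
For anyone who wants to check how close a process is to the limit, a
rough approach is to count the lines in /proc/<pid>/maps and compare
that with /proc/sys/vm/max_map_count. The following is only a sketch
and not part of the patch: the helper names count_map_lines() and
read_long() are hypothetical, and the line count only approximates the
kernel's internal per-mm map count.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Count newline-terminated lines in a maps-style file such as
 * /proc/self/maps. Returns -1 if the file cannot be opened.
 * Hypothetical helper for illustration only.
 */
static long count_map_lines(const char *path)
{
	FILE *f;
	long n = 0;
	int c;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	while ((c = getc(f)) != EOF)
		if (c == '\n')
			n++;
	fclose(f);
	return n;
}

/*
 * Read a single decimal integer from a file such as
 * /proc/sys/vm/max_map_count. Returns -1 on any error.
 * Hypothetical helper for illustration only.
 */
static long read_long(const char *path)
{
	FILE *f;
	long v = -1;

	assert(path != NULL);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (fscanf(f, "%ld", &v) != 1)
		v = -1;
	fclose(f);
	return v;
}
```

Calling count_map_lines("/proc/self/maps") and
read_long("/proc/sys/vm/max_map_count") from the same process gives the
two numbers to compare.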

The System V gABI draft dated 2000-07-17 already extended the ELF numbering:
https://www.sco.com/developers/gabi/2000-07-17/ch4.sheader.html

binutils support is in commit ecd12bc14d85421fcf992cda5af1d534cc8736e0
dated 2010-01-19. IIUC this goes a bit beyond what is described in the
gABI document and extends ELF's e_phnum.

Linux coredumper support is in commit
8d9032bbe4671dc481261ccd4e161cd96e54b118 dated 2010-03-06.

As mentioned above, this has all worked for the last 12 years, and the
conservative limit seems to do more harm than good.

 include/linux/mm.h | 21 +++++++++------------
 1 file changed, 9 insertions(+), 12 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6549d0979..3e1ed3b80 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -178,22 +178,19 @@ static inline void __mm_zero_struct_page(struct page *page)
 
 /*
  * Default maximum number of active map areas, this limits the number of vmas
- * per mm struct. Users can overwrite this number by sysctl but there is a
- * problem.
+ * per mm struct. Users can overwrite this number by sysctl. Historically
+ * this limit was a compatibility measure for ELF format predating year 2000.
  *
  * When a program's coredump is generated as ELF format, a section is created
- * per a vma. In ELF, the number of sections is represented in unsigned short.
- * This means the number of sections should be smaller than 65535 at coredump.
- * Because the kernel adds some informative sections to a image of program at
- * generating coredump, we need some margin. The number of extra sections is
- * 1-3 now and depends on arch. We use "5" as safe margin, here.
+ * per a vma. In ELF before year 2000, the number of sections was represented
+ * as unsigned short e_shnum. This means the number of sections should be
+ * smaller than 65535 at coredump.
  *
- * ELF extended numbering allows more than 65535 sections, so 16-bit bound is
- * not a hard limit any more. Although some userspace tools can be surprised by
- * that.
+ * ELF extended numbering was added into System V gABI spec around 2000.
+ * It allows more than 65535 sections, so 16-bit bound is not a hard limit any
+ * more.
  */
-#define MAPCOUNT_ELF_CORE_MARGIN	(5)
-#define DEFAULT_MAX_MAP_COUNT	(USHRT_MAX - MAPCOUNT_ELF_CORE_MARGIN)
+#define DEFAULT_MAX_MAP_COUNT	INT_MAX
 
 extern int sysctl_max_map_count;
 

base-commit: d5d547aa7b51467b15d9caa86b116f8c2507c72a
-- 
2.46.0

