lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <qa7b7pvrycejnn6pjytxysu57xckhexupjrzefmk4j5hlaxka3@ayeg2vzpfe3r>
Date: Thu, 4 Sep 2025 00:46:34 +0100
From: Pedro Falcato <pfalcato@...e.de>
To: Kalesh Singh <kaleshsingh@...gle.com>
Cc: akpm@...ux-foundation.org, minchan@...nel.org, 
	lorenzo.stoakes@...cle.com, kernel-team@...roid.com, android-mm@...gle.com, 
	David Hildenbrand <david@...hat.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>, 
	Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>, 
	Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>, Jann Horn <jannh@...gle.com>, 
	linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: centralize and fix max map count limit checking

On Wed, Sep 03, 2025 at 04:24:35PM -0700, Kalesh Singh wrote:
> The check against the max map count (sysctl_max_map_count) was
> open-coded in several places. This led to inconsistent enforcement
> and subtle bugs where the limit could be exceeded.
> 
> For example, some paths would check map_count > sysctl_max_map_count
> before allocating a new VMA and incrementing the count, allowing the
> process to reach sysctl_max_map_count + 1:
> 
>     int do_brk_flags(...)
>     {
>         if (mm->map_count > sysctl_max_map_count)
>             return -ENOMEM;
> 
>         /* We can get here with mm->map_count == sysctl_max_map_count */
> 
>         vma = vm_area_alloc(mm);
>         ...
>         mm->map_count++   /* We've now exceeded the threshold. */
>     }

I think this should be fixed separately, and sent for stable.

> 
> To fix this and unify the logic, introduce a new function,
> exceeds_max_map_count(), to consolidate the check. All open-coded
> checks are replaced with calls to this new function, ensuring the
> limit is applied uniformly and correctly.

Thanks! In general I like the idea.

> 
> To improve encapsulation, sysctl_max_map_count is now static to
> mm/mmap.c. The new helper also adds a rate-limited warning to make
> debugging applications that exhaust their VMA limit easier.
> 
> Cc: Andrew Morton <akpm@...ux-foundation.org>
> Cc: Minchan Kim <minchan@...nel.org>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> Signed-off-by: Kalesh Singh <kaleshsingh@...gle.com>
> ---
>  include/linux/mm.h | 11 ++++++++++-
>  mm/mmap.c          | 15 ++++++++++++++-
>  mm/mremap.c        |  7 ++++---
>  mm/nommu.c         |  2 +-
>  mm/util.c          |  1 -
>  mm/vma.c           |  6 +++---
>  6 files changed, 32 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 1ae97a0b8ec7..d4e64e6a9814 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -192,7 +192,16 @@ static inline void __mm_zero_struct_page(struct page *page)
>  #define MAPCOUNT_ELF_CORE_MARGIN	(5)
>  #define DEFAULT_MAX_MAP_COUNT	(USHRT_MAX - MAPCOUNT_ELF_CORE_MARGIN)
>  
> -extern int sysctl_max_map_count;
> +/**
> + * exceeds_max_map_count - check if a VMA operation would exceed max_map_count
> + * @mm: The memory descriptor for the process.
> + * @new_vmas: The number of new VMAs the operation will create.
> + *
> + * Returns true if the operation would cause the number of VMAs to exceed
> + * the sysctl_max_map_count limit, false otherwise. A rate-limited warning
> + * is logged if the limit is exceeded.
> + */
> +extern bool exceeds_max_map_count(struct mm_struct *mm, unsigned int new_vmas);

No new "extern" in func declarations please.

>  
>  extern unsigned long sysctl_user_reserve_kbytes;
>  extern unsigned long sysctl_admin_reserve_kbytes;
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 7306253cc3b5..693a0105e6a5 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -374,7 +374,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
>  		return -EOVERFLOW;
>  
>  	/* Too many mappings? */
> -	if (mm->map_count > sysctl_max_map_count)
> +	if (exceeds_max_map_count(mm, 0))
>  		return -ENOMEM;

If the brk example is incorrect, isn't this also wrong? /me is confused
>  
>  	/*
> @@ -1504,6 +1504,19 @@ struct vm_area_struct *_install_special_mapping(
>  int sysctl_legacy_va_layout;
>  #endif
>  
> +static int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
> +
> +bool exceeds_max_map_count(struct mm_struct *mm, unsigned int new_vmas)
> +{
> +	if (unlikely(mm->map_count + new_vmas > sysctl_max_map_count)) {
> +		pr_warn_ratelimited("%s (%d): Map count limit %u exceeded\n",
> +				    current->comm, current->pid,
> +				    sysctl_max_map_count);

I'm not entirely sold on the map count warn, even if it's rate limited. It
sounds like something you can hit in nasty edge cases and nevertheless flood
your dmesg (more frustrating if you can't fix the damn program).

In any case, if we are to make the checks more strict, we should also add
asserts around the place. Though there's a little case in munmap() we can indeed
go over temporarily, on purpose.

-- 
Pedro

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ