Message-ID: <CAC_TJvc-V9=59p0HG8tmUqQHmp5Sn3E8BV3gHsPj+RHCANd0HA@mail.gmail.com>
Date: Thu, 4 Sep 2025 09:24:43 -0700
From: Kalesh Singh <kaleshsingh@...gle.com>
To: David Hildenbrand <david@...hat.com>
Cc: akpm@...ux-foundation.org, minchan@...nel.org, lorenzo.stoakes@...cle.com,
kernel-team@...roid.com, android-mm@...gle.com,
"Liam R. Howlett" <Liam.Howlett@...cle.com>, Vlastimil Babka <vbabka@...e.cz>,
Mike Rapoport <rppt@...nel.org>, Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>,
Jann Horn <jannh@...gle.com>, Pedro Falcato <pfalcato@...e.de>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: centralize and fix max map count limit checking
On Thu, Sep 4, 2025 at 3:14 AM David Hildenbrand <david@...hat.com> wrote:
>
> On 04.09.25 01:24, Kalesh Singh wrote:
> > The check against the max map count (sysctl_max_map_count) was
> > open-coded in several places. This led to inconsistent enforcement
> > and subtle bugs where the limit could be exceeded.
> >
> > For example, some paths would check map_count > sysctl_max_map_count
> > before allocating a new VMA and incrementing the count, allowing the
> > process to reach sysctl_max_map_count + 1:
> >
> > int do_brk_flags(...)
> > {
> > 	if (mm->map_count > sysctl_max_map_count)
> > 		return -ENOMEM;
> >
> > 	/* We can get here with mm->map_count == sysctl_max_map_count */
> >
> > 	vma = vm_area_alloc(mm);
> > 	...
> > 	mm->map_count++; /* We've now exceeded the threshold. */
> > }
> >
> > To fix this and unify the logic, introduce a new function,
> > exceeds_max_map_count(), to consolidate the check. All open-coded
> > checks are replaced with calls to this new function, ensuring the
> > limit is applied uniformly and correctly.
> >
> > To improve encapsulation, sysctl_max_map_count is now static to
> > mm/mmap.c. The new helper also adds a rate-limited warning to make
> > debugging applications that exhaust their VMA limit easier.
> >
> > Cc: Andrew Morton <akpm@...ux-foundation.org>
> > Cc: Minchan Kim <minchan@...nel.org>
> > Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>
> > Signed-off-by: Kalesh Singh <kaleshsingh@...gle.com>
> > ---
> > include/linux/mm.h | 11 ++++++++++-
> > mm/mmap.c | 15 ++++++++++++++-
> > mm/mremap.c | 7 ++++---
> > mm/nommu.c | 2 +-
> > mm/util.c | 1 -
> > mm/vma.c | 6 +++---
> > 6 files changed, 32 insertions(+), 10 deletions(-)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 1ae97a0b8ec7..d4e64e6a9814 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -192,7 +192,16 @@ static inline void __mm_zero_struct_page(struct page *page)
> > #define MAPCOUNT_ELF_CORE_MARGIN (5)
> > #define DEFAULT_MAX_MAP_COUNT (USHRT_MAX - MAPCOUNT_ELF_CORE_MARGIN)
> >
> > -extern int sysctl_max_map_count;
> > +/**
> > + * exceeds_max_map_count - check if a VMA operation would exceed max_map_count
> > + * @mm: The memory descriptor for the process.
> > + * @new_vmas: The number of new VMAs the operation will create.
>
> It's not always a "will" right? At least I remember that this was the
> worst case scenario in some ("may split").
>
> "The number of new VMAs the operation may create in the worst case."
>
Hi David,
You are correct. Cases like mremap account for the worst case (a 3-way
split on both src and dest). I'll update the description.
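
To illustrate the worst-case semantics, here is a rough user-space sketch.
It is not the code from the patch: struct mm_model, old_precheck_rejects()
and exceeds_max_map_count_model() are hypothetical stand-ins, and the body
of the helper is only an assumption about the shape of the check. The limit
uses the default from the quoted header (USHRT_MAX - 5).

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical stand-in for the part of mm_struct that matters here. */
struct mm_model {
	int map_count;
};

/* Mirrors DEFAULT_MAX_MAP_COUNT from the quoted header: USHRT_MAX - 5. */
static int max_map_count_model = USHRT_MAX - 5;

/*
 * The old open-coded pre-check: it only rejects once the count is already
 * above the limit, so a caller sitting exactly at the limit is let through
 * and can then add one more VMA.
 */
static bool old_precheck_rejects(const struct mm_model *mm)
{
	return mm->map_count > max_map_count_model;
}

/*
 * Assumed shape of a centralized check. @new_vmas is the number of VMAs the
 * operation may create in the worst case, so an operation that ends up not
 * splitting anything is still judged by its upper bound.
 */
static bool exceeds_max_map_count_model(const struct mm_model *mm, int new_vmas)
{
	return mm->map_count + new_vmas > max_map_count_model;
}
```

With map_count equal to the limit, old_precheck_rejects() returns false
while exceeds_max_map_count_model(mm, 1) returns true, which is the
off-by-one from the commit message. A caller that may split passes its own
worst-case count as new_vmas.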
Thanks,
Kalesh
>
> --
> Cheers
>
> David / dhildenb
>