Message-ID: <7eh332fqbhxak2afcwt6mwzaxu7s3dj2tx4hrtt7ivo3oxovcg@avz6uniwdzpi>
Date: Thu, 4 Sep 2025 16:24:56 +0100
From: Pedro Falcato <pfalcato@...e.de>
To: Kalesh Singh <kaleshsingh@...gle.com>
Cc: akpm@...ux-foundation.org, minchan@...nel.org,
lorenzo.stoakes@...cle.com, kernel-team@...roid.com, android-mm@...gle.com,
David Hildenbrand <david@...hat.com>, "Liam R. Howlett" <Liam.Howlett@...cle.com>,
Vlastimil Babka <vbabka@...e.cz>, Mike Rapoport <rppt@...nel.org>,
Suren Baghdasaryan <surenb@...gle.com>, Michal Hocko <mhocko@...e.com>, Jann Horn <jannh@...gle.com>,
linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: centralize and fix max map count limit checking

On Wed, Sep 03, 2025 at 08:01:50PM -0700, Kalesh Singh wrote:
> On Wed, Sep 3, 2025 at 4:46 PM Pedro Falcato <pfalcato@...e.de> wrote:
> >
<snip>
> > >
> > > 	/* Too many mappings? */
> > > -	if (mm->map_count > sysctl_max_map_count)
> > > +	if (exceeds_max_map_count(mm, 0))
> > > 		return -ENOMEM;
> >
> > If the brk example is incorrect, isn't this also wrong? /me is confused
>
> Ahh you are right, this will also go over by 1 once we return from
> mmap_region(). I'll batch this with the do_brk_flags() fix.
>
> > >
> > > /*
> > > @@ -1504,6 +1504,19 @@ struct vm_area_struct *_install_special_mapping(
> > > int sysctl_legacy_va_layout;
> > > #endif
> > >
> > > +static int sysctl_max_map_count __read_mostly = DEFAULT_MAX_MAP_COUNT;
> > > +
> > > +bool exceeds_max_map_count(struct mm_struct *mm, unsigned int new_vmas)
> > > +{
> > > +	if (unlikely(mm->map_count + new_vmas > sysctl_max_map_count)) {
> > > +		pr_warn_ratelimited("%s (%d): Map count limit %u exceeded\n",
> > > +				    current->comm, current->pid,
> > > +				    sysctl_max_map_count);
> >
> > I'm not entirely sold on the map count warn, even if it's rate limited. It
> > sounds like something you can hit in nasty edge cases and nevertheless flood
> > your dmesg (more frustrating if you can't fix the damn program).
>
> I don't feel strongly about this, I can drop it in the next revision.
FWIW, I don't feel strongly about it either, and I would not mind if there's a
way to shut it up (cmdline, or even sysctl knob?). Let's see if anyone has a
stronger opinion.
>
> >
> > In any case, if we are to make the checks more strict, we should also add
> > asserts around the place. Though there's a little case in munmap() we can indeed
> > go over temporarily, on purpose.
>
> To confirm, do you mean we should WARN_ON() checks where map count is
> being incremented?
Yes, _possibly_ gated off by CONFIG_DEBUG_VM.
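
To make the idea concrete, here is a rough userspace sketch of what I mean, not kernel code. Everything in it is made up for illustration: struct mm_model, MODEL_DEBUG_VM (standing in for CONFIG_DEBUG_VM), model_warn_on() and model_map_count_inc(). It only shows the shape of a check that fires when an increment lands above the limit and compiles away when the debug switch is off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_DEBUG_VM; build with -DMODEL_DEBUG_VM=0
 * to compile the check out. */
#ifndef MODEL_DEBUG_VM
#define MODEL_DEBUG_VM 1
#endif

/* Userspace model of the state discussed in this thread (not struct mm_struct). */
struct mm_model {
	int map_count;      /* number of mappings currently installed */
	int max_map_count;  /* models sysctl_max_map_count */
	int warnings;       /* how many times the debug check fired */
};

/* Models a WARN_ON()-style check: records the violation, does not abort. */
static bool model_warn_on(struct mm_model *mm, bool cond)
{
#if MODEL_DEBUG_VM
	if (cond)
		mm->warnings++;
	return cond;
#else
	(void)mm;
	(void)cond;
	return false;
#endif
}

/* Increment map_count and flag any increment that lands above the limit. */
static void model_map_count_inc(struct mm_model *mm)
{
	assert(mm != NULL);
	mm->map_count++;
	model_warn_on(mm, mm->map_count > mm->max_map_count);
}
```

With the limit set to 2 and map_count at 1, the first increment is silent and the second one bumps the warnings counter. The munmap() case mentioned further down would trip a check like this, so it would need its own unchecked path.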
>
> > Though there's a little case in munmap() we can indeed
> > go over temporarily, on purpose.
>
> For the 3 way split we need 1 additional VMA after munmap completed as
> one of the 3 gets unmapped. The check is done in the caller beforehand
> as __split_vma() explicitly doesn't check map_count. Though if we add
> asserts we'll need a variant of vma_complete() or the like that
> doesn't enforce the threshold.
Right, it might get a little hairy, which is partly why I'm not super into
the idea. But definitely worth considering as a way to help prevent these
sorts of problems in the future.
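
For anyone following along, here is a small userspace model of the two cases discussed above. It is only a sketch under assumptions: struct mm_sketch and the model_*() helpers are invented names, the comparison is copied from the quoted exceeds_max_map_count() without the warning, and the munmap() path is modelled purely from the description in this thread (caller checks for a net gain of one, the two splits themselves are unchecked).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; not struct mm_struct. */
struct mm_sketch {
	unsigned int map_count;      /* mappings currently installed */
	unsigned int max_map_count;  /* models sysctl_max_map_count */
};

/* Same comparison as the quoted exceeds_max_map_count(), minus the warning. */
static bool model_exceeds(const struct mm_sketch *mm, unsigned int new_vmas)
{
	assert(mm != NULL);
	return mm->map_count + new_vmas > mm->max_map_count;
}

/* Try to add one mapping, passing new_vmas to the check beforehand. */
static bool model_add_mapping(struct mm_sketch *mm, unsigned int new_vmas)
{
	if (model_exceeds(mm, new_vmas))
		return false;
	mm->map_count++;
	return true;
}

/*
 * Punch a hole in the middle of one mapping. Returns the peak map_count
 * reached during the operation, or 0 if the up-front check refuses it.
 */
static unsigned int model_unmap_middle(struct mm_sketch *mm)
{
	unsigned int peak;

	if (model_exceeds(mm, 1))   /* caller-side check: net gain is one */
		return 0;
	mm->map_count += 2;         /* two splits, deliberately unchecked */
	peak = mm->map_count;
	mm->map_count -= 1;         /* the middle piece is unmapped */
	return peak;
}
```

Starting at the limit, model_add_mapping(mm, 0) succeeds and leaves map_count one above the limit, while model_add_mapping(mm, 1) refuses. Starting one below the limit, model_unmap_middle() peaks one above the limit and settles exactly at it, which is the temporary overshoot an assert would have to tolerate.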
--
Pedro