Message-ID: <mftebk5inxamd52k46frhq2llopoiiewsgdkrjbamg4yukyhqw@vf4jzz6lmgcu>
Date: Fri, 30 Aug 2024 16:04:05 +0100
From: Pedro Falcato <pedro.falcato@...il.com>
To: Petr Špaček <pspacek@....org>
Cc: Lorenzo Stoakes <lorenzo.stoakes@...cle.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Vlastimil Babka <vbabka@...e.cz>,
Liam Howlett <liam.howlett@...cle.com>
Subject: Re: [PATCH RFC] mm: mmap: Change DEFAULT_MAX_MAP_COUNT to INT_MAX
On Fri, Aug 30, 2024 at 04:28:33PM GMT, Petr Špaček wrote:
> Now I understand your concern. From the docs and code comments I've seen, it
> was not clear that the limit serves _another_ purpose beyond being a mere
> compatibility shim for old ELF tools.
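>
> IIRC the current definition in include/linux/mm.h is (quoting recent
> kernels, so worth double-checking the exact text):
>
>   #define MAPCOUNT_ELF_CORE_MARGIN (5)
>   #define DEFAULT_MAX_MAP_COUNT    (USHRT_MAX - MAPCOUNT_ELF_CORE_MARGIN)
>
> i.e. the default works out to 65535 - 5 = 65530.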
>
> > It is a NACK, but it's a NACK because of the limit being so high.
> >
> > With Steam I believe it is a product of how it performs allocations, and
> > unfortunately this causes it to allocate quite a few more mappings than
> > you would expect.
>
> FTR, select non-game applications:
>
> ElasticSearch and OpenSearch insist on at least 262144.
> DNS server BIND 9.18.28 linked to jemalloc 5.2.1 was observed using around
> 700000 mappings.
> OpenJDK GC sometimes complains about values < 737280.
> The SAP docs I was able to access use 1000000.
> MariaDB is tested by their QA with 1048576.
> The Fedora, Ubuntu, NixOS, and Arch distros went with 1048576.
>
> Is it worth sending a patch with the default raised to 1048576?
>
>
> > With jemalloc that seems strange, perhaps buggy behaviour?
>
> Good question. In the case of the BIND DNS server, jemalloc handles mmap()
> and we keep statistics on the bytes requested from malloc().
>
> When we hit the max_map_count limit,
> (sum of not-yet-freed malloc(size)) / (vm.max_map_count)
> gives an average mmapped block size of ~100 kB.
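>
> (Illustrative arithmetic only: e.g. ~70 GB of live allocations spread over
> ~700000 mappings comes out to ~100 kB per mapping.)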
>
> Is ~100 kB way too low, i.e. does it indicate a bug? It does not seem
> terrible to me - the application handles packets of ~100-1500 B at a rate
> somewhere between 10-200 k packets per second, so it is expected to make
> lots of small, short-lived allocations.
>
> A complicating factor is that the process itself does not see the current
> counter value (unless BPF is involved), so it is hard to monitor this until
> the limit is hit.
Can you get us a dump of /proc/<pid>/maps? It'd be interesting to see exactly
how you're hitting this.
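
FWIW, if you want a rough view of the counter from userspace without BPF,
counting lines in /proc/self/maps approximates the per-process mapping count,
since each line is one VMA - the thing charged against vm.max_map_count. A
minimal, untested sketch:

#include <stdio.h>

/*
 * Approximate the current per-process mapping count by counting lines in
 * /proc/self/maps; each line corresponds to one VMA. Mappings can change
 * while we read, so treat the result as an estimate suitable only for
 * occasional monitoring - walking the file is not free either.
 */
static long count_self_mappings(void)
{
	FILE *f = fopen("/proc/self/maps", "r");
	long count = 0;
	int c;

	if (!f)
		return -1;
	while ((c = fgetc(f)) != EOF)
		if (c == '\n')
			count++;
	fclose(f);
	return count;
}

int main(void)
{
	printf("~%ld mappings\n", count_self_mappings());
	return 0;
}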
--
Pedro