Message-ID: <20190412200629.GA24377@tower.DHCP.thefacebook.com>
Date: Fri, 12 Apr 2019 20:06:34 +0000
From: Roman Gushchin <guro@...com>
To: Johannes Weiner <hannes@...xchg.org>
CC: Andrew Morton <akpm@...ux-foundation.org>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH] mm: fix false-positive OVERCOMMIT_GUESS failures
On Fri, Apr 12, 2019 at 03:14:18PM -0400, Johannes Weiner wrote:
> With the default overcommit==guess we occasionally run into mmap
> rejections despite plenty of memory that would get dropped under
> pressure but just isn't accounted reclaimable. One example of this is
> dying cgroups pinned by some page cache. A previous case was auxiliary
> path name memory associated with dentries; we have since annotated
> those allocations to avoid overcommit failures (see d79f7aa496fc ("mm:
> treat indirectly reclaimable memory as free in overcommit logic")).
>
> But trying to classify all allocated memory reliably as reclaimable
> and unreclaimable is a bit of a fool's errand. There could be a myriad
> of dependencies that constantly change with kernel versions.
>
> It becomes even more questionable of an effort when considering how
> this estimate of available memory is used: it's not compared to the
> system-wide allocated virtual memory in any way. It's not even
> compared to the allocating process's address space. It's compared to
> the single allocation request at hand!
>
> So we have an elaborate left-hand side of the equation that tries to
> assess the exact breathing room the system has available down to a
> page - and then compare it to an isolated allocation request with no
> additional context. We could fail an allocation of N bytes, but for
> two allocations of N/2 bytes we'd do this elaborate dance twice in a
> row and then still let N bytes of virtual memory through. This doesn't
> make a whole lot of sense.
>
> Let's take a step back and look at the actual goal of the
> heuristic. From the documentation:
>
>     Heuristic overcommit handling. Obvious overcommits of address
>     space are refused. Used for a typical system. It ensures a
>     seriously wild allocation fails while allowing overcommit to
>     reduce swap usage. root is allowed to allocate slightly more
>     memory in this mode. This is the default.
>
> If all we want to do is catch clearly bogus allocation requests
> irrespective of the general virtual memory situation, the physical
> memory counter-part doesn't need to be that complicated, either.
>
> When in GUESS mode, catch wild allocations by comparing their request
> size to total amount of ram and swap in the system.
>
> Signed-off-by: Johannes Weiner <hannes@...xchg.org>
My 2c here: any kind of percpu counters and percpu data is accounted
as unreclaimable and can alter the calculation significantly.
This is a particular problem on hosts that have been idle for some time:
without any memory pressure, kernel caches come to occupy most of the memory,
so that a subsequent attempt to start a workload fails.
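To illustrate the difference, here is a minimal userspace sketch, not the
kernel code. The struct and both function names are hypothetical, and the
"estimate" variant is a heavily simplified stand-in for the old accounting
(it ignores reserves and the split between page cache and slab). It only
shows why a host whose memory sits in caches that are not accounted as
reclaimable can reject a request that a plain ram + swap comparison accepts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of memory counters, all in pages. */
struct mem_snapshot {
	uint64_t total_ram;    /* all RAM in the system */
	uint64_t total_swap;   /* all configured swap */
	uint64_t free;         /* currently free pages */
	uint64_t reclaimable;  /* pages *accounted* as reclaimable */
};

/*
 * Simplified stand-in for the old GUESS logic: a request passes only if
 * it fits into free + accounted-reclaimable memory + swap. Memory that
 * would be dropped under pressure but is not accounted as reclaimable
 * counts against the request.
 */
static inline bool guess_by_estimate(const struct mem_snapshot *m,
				     uint64_t pages)
{
	return pages <= m->free + m->reclaimable + m->total_swap;
}

/*
 * Sketch of the proposed GUESS logic: reject only requests larger than
 * total RAM plus swap, regardless of how memory is currently used.
 */
static inline bool guess_by_total(const struct mem_snapshot *m,
				  uint64_t pages)
{
	return pages <= m->total_ram + m->total_swap;
}
```

For an idle host with total_ram = 1000000, free = 100000, reclaimable =
200000 and no swap, a request of 500000 pages fails the estimate-based
check but passes the total-based one, while a request of 2000000 pages
is rejected by both.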
With great pleasure:
Acked-by: Roman Gushchin <guro@...com>
Thanks!