linux-kernel - [PATCH RESEND] mm: don't raise MEMCG_OOM event due to failed high-order allocation


Message-ID: <20180917230846.31027-1-guro@fb.com>
Date:   Mon, 17 Sep 2018 23:10:59 +0000
From:   Roman Gushchin <guro@...com>
To:     "linux-mm@...ck.org" <linux-mm@...ck.org>
CC:     "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Kernel Team <Kernel-team@...com>, Roman Gushchin <guro@...com>,
        Johannes Weiner <hannes@...xchg.org>,
        Michal Hocko <mhocko@...nel.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>
Subject: [PATCH RESEND] mm: don't raise MEMCG_OOM event due to failed
 high-order allocation

The memcg OOM killer is never invoked due to a failed high-order
allocation; however, the MEMCG_OOM event can still be raised.

As shown below, this can happen under conditions that are very
far from a real OOM, e.g. when there is plenty of clean pagecache
and memory pressure is low.

There is no sense in raising an OOM event in such a case,
as it might confuse a user and lead to wrong and excessive actions.

Let's look at the charging path in try_charge(). If the memory usage
is close to memory.max, which is absolutely natural for most memory
cgroups, we try to reclaim some pages. Even if we were able to reclaim
enough memory for the allocation, the following check can fail due to
a race with another concurrent allocation:

    if (mem_cgroup_margin(mem_over_limit) >= nr_pages)
        goto retry;

For regular pages the following condition will save us from triggering
the OOM:

    if (nr_reclaimed && nr_pages <= (1 << PAGE_ALLOC_COSTLY_ORDER))
        goto retry;

But for high-order allocations this condition will intentionally fail.
The reason is that we will likely fall back to regular pages anyway,
so it is fine, and even preferable, to return ENOMEM.
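
To make the two checks quoted above concrete, here is a minimal
user-space sketch. It is not kernel code: the helper names
margin_allows_retry() and retry_after_reclaim() are hypothetical and
model only the two conditions, and PAGE_ALLOC_COSTLY_ORDER is defined
locally as 3, which is assumed to match include/linux/mmzone.h.

```c
#include <stdbool.h>

/* Assumed to match the kernel's definition in include/linux/mmzone.h. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/*
 * Hypothetical helper modelling the first check: retry if the cgroup
 * still has room for the whole charge. In the kernel the margin comes
 * from mem_cgroup_margin(); here it is simply passed in.
 */
static bool margin_allows_retry(unsigned long margin, unsigned int nr_pages)
{
	return margin >= nr_pages;
}

/*
 * Hypothetical helper modelling the second check: retry only if reclaim
 * made progress and the charge is not larger than a costly-order block
 * (at most 1 << 3 = 8 pages).
 */
static bool retry_after_reclaim(unsigned long nr_reclaimed,
				unsigned int nr_pages)
{
	return nr_reclaimed && nr_pages <= (1U << PAGE_ALLOC_COSTLY_ORDER);
}
```

With these definitions, retry_after_reclaim(32, 1) is true for a
single-page charge, while retry_after_reclaim(32, 512) is false for a
512-page charge (e.g. a 2 MB huge page with 4 KB base pages), even
though the same amount of memory was reclaimed. If a concurrent
allocation has also consumed the margin, the second case falls through
to the OOM handling path.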

In this case the idea of raising MEMCG_OOM looks dubious.

Fix this by moving the raising of MEMCG_OOM into mem_cgroup_oom(),
after the allocation order check, so that the event won't be raised
for high-order allocations. This change doesn't affect the allocation
and charging of regular pages.
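
The effect on the event counter can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: the
struct memcg_model, the enum model_status and both functions are
hypothetical stand-ins, the counter stands for the MEMCG_OOM event, and
everything mem_cgroup_oom() does besides the order check is omitted.

```c
/* Assumed to match the kernel's definition in include/linux/mmzone.h. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Hypothetical stand-in for a memory cgroup: only counts OOM events. */
struct memcg_model {
	unsigned long oom_events;
};

/* Hypothetical, simplified outcome of the OOM handling step. */
enum model_status {
	MODEL_OOM_HANDLED,	/* OOM handling would proceed */
	MODEL_OOM_SKIPPED,	/* costly order: OOM handling is skipped */
};

/* Before the change: the event is raised ahead of the order check. */
static enum model_status charge_fail_before(struct memcg_model *m, int order)
{
	m->oom_events++;
	if (order > PAGE_ALLOC_COSTLY_ORDER)
		return MODEL_OOM_SKIPPED;
	return MODEL_OOM_HANDLED;
}

/* After the change: the event is raised only past the order check. */
static enum model_status charge_fail_after(struct memcg_model *m, int order)
{
	if (order > PAGE_ALLOC_COSTLY_ORDER)
		return MODEL_OOM_SKIPPED;
	m->oom_events++;
	return MODEL_OOM_HANDLED;
}
```

For an order-9 failure, charge_fail_before() increments the counter
although OOM handling is skipped, whereas charge_fail_after() leaves it
untouched. For order 0 both variants increment the counter once, which
matches the statement that regular pages are unaffected.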

Signed-off-by: Roman Gushchin <guro@...com>
Acked-by: David Rientjes <rientjes@...gle.com>
Cc: Johannes Weiner <hannes@...xchg.org>
Cc: Michal Hocko <mhocko@...nel.org>
Cc: Vladimir Davydov <vdavydov.dev@...il.com>
---
 mm/memcontrol.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index fcec9b39e2a3..103ca3c31c04 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1669,6 +1669,8 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
 	if (order > PAGE_ALLOC_COSTLY_ORDER)
 		return OOM_SKIPPED;

+	memcg_memory_event(memcg, MEMCG_OOM);
+
 	/*
 	 * We are in the middle of the charge context here, so we
 	 * don't want to block when potentially sitting on a callstack
@@ -2250,8 +2252,6 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	if (fatal_signal_pending(current))
 		goto force;

-	memcg_memory_event(mem_over_limit, MEMCG_OOM);
-
 	/*
 	 * keep retrying as long as the memcg oom killer is able to make
 	 * a forward progress or bypass the charge if the oom killer
-- 
2.17.1