Message-Id: <200910122244.19666.borntraeger@de.ibm.com>
Date: Mon, 12 Oct 2009 22:44:19 +0200
From: Christian Borntraeger <borntraeger@...ibm.com>
To: Wu Fengguang <fengguang.wu@...el.com>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
Andrew Morton <akpm@...ux-foundation.org>,
Elladan <elladan@...imo.com>, Nick Piggin <npiggin@...e.de>,
Andi Kleen <andi@...stfloor.org>,
Christoph Lameter <cl@...ux-foundation.org>,
Rik van Riel <riel@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Johannes Weiner <hannes@...xchg.org>,
Minchan Kim <minchan.kim@...il.com>
Subject: oomkiller over-ambitious after "vmscan: make mapped executable pages the first class citizen" (bisected)
I have seen some OOM-killer action on my s390x system when using large amounts
of anonymous memory:
[cborntra@...lp34 ~]$ cat memeat.c
#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
int main()
{
	char *start;
	char *a;

	start = mmap(NULL, 4300000000UL,
		     PROT_READ | PROT_WRITE,
		     MAP_SHARED | MAP_ANONYMOUS, -1 , 0);
	if (start == MAP_FAILED) {
		printf("cannot map guest memory\n");
		exit (1);
	}
	for (a = start; a < start + 4300000000UL; a += 4096)
		*a='a';
	exit(0);
}
[cborntra@...lp34 ~]$ ./memeat
Connection to t63lp34 closed.
I attached the dmesg with the oom messages.
As you can see, we are failing several order-0 allocations with gfpmask=0x201da.
The application uses slightly more memory than is available as RAM. The thing
is that there is plenty of swap space to fulfill the (non-atomic) request:
[cborntra@...lp34 ~]$ free
             total       used       free     shared    buffers     cached
Mem:       4166560     127148    4039412          0       2256      19752
-/+ buffers/cache:     105140    4061420
Swap:      9615904       8328    9607576
Since older kernels never triggered the OOM killer here, I was able to bisect
down to the first commit that shows this behaviour:
commit 8cab4754d24a0f2e05920170c845bd84472814c6
Author: Wu Fengguang <fengguang.wu@...el.com>
vmscan: make mapped executable pages the first class citizen
In fact, applying this patch makes the problem go away:
--- linux-2.6.orig/mm/vmscan.c
+++ linux-2.6/mm/vmscan.c
@@ -1345,22 +1345,8 @@ static void shrink_active_list(unsigned
/* page_referenced clears PageReferenced */
if (page_mapping_inuse(page) &&
- page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
+ page_referenced(page, 0, sc->mem_cgroup, &vm_flags))
nr_rotated++;
- /*
- * Identify referenced, file-backed active pages and
- * give them one more trip around the active list. So
- * that executable code get better chances to stay in
- * memory under moderate memory pressure. Anon pages
- * are not likely to be evicted by use-once streaming
- * IO, plus JVM can create lots of anon VM_EXEC pages,
- * so we ignore them here.
- */
- if ((vm_flags & VM_EXEC) && !PageAnon(page)) {
- list_add(&page->lru, &l_active);
- continue;
- }
- }
ClearPageActive(page); /* we are de-activating */
list_add(&page->lru, &l_inactive);
The interesting part is that s390x in its default configuration has no
no-execute feature, resulting in the following mapping:
c0000000-1c04cd000 rwxs 00000000 00:04 18517 /dev/zero (deleted)
As you can see, this area looks file-mapped (/dev/zero) and executable. On the
other hand, the !PageAnon clause should cover this case. I am lost.
Does anybody on the CC list (taken from the original patch) have an idea what
the problem is and how to fix this properly?
Christian
View attachment "dmesg.txt" of type "text/plain" (20462 bytes)