Message-ID: <20150125043608.GB6109@wfg-t540p.sh.intel.com>
Date: Sat, 24 Jan 2015 20:36:08 -0800
From: Fengguang Wu <fengguang.wu@...el.com>
To: "Kirill A. Shutemov" <kirill.shutemov@...ux.intel.com>
Cc: LKP <lkp@...org>, linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: [mm] WARNING: CPU: 1 PID: 681 at mm/mmap.c:2858 exit_mmap()
Greetings,

The 0day kernel testing robot got the dmesg below, and the first bad commit is:
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git master
commit f7a7b53a90f7a489c4e435d1300db121f6b42776
Author: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
AuthorDate: Fri Jan 23 10:11:34 2015 +1100
Commit: Stephen Rothwell <sfr@...b.auug.org.au>
CommitDate: Fri Jan 23 10:11:34 2015 +1100
mm: account pmd page tables to the process
Dave noticed that an unprivileged process can allocate a significant amount
of memory -- >500 MiB on x86_64 -- and stay unnoticed by the oom-killer and
the memory cgroup. The trick is to allocate a lot of PMD page tables. The
Linux kernel doesn't account PMD tables to the process, only PTE tables.

The test case below uses a few tricks to allocate a lot of PMD page tables
while keeping VmRSS and VmPTE low. oom_score for the process will be 0.
	#include <errno.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <sys/mman.h>
	#include <sys/prctl.h>

	#define PUD_SIZE (1UL << 30)
	#define PMD_SIZE (1UL << 21)

	#define NR_PUD 130000

	int main(void)
	{
		char *addr = NULL;
		unsigned long i;

		/* arg2 = 1 sets the THP-disable flag; arg3..arg5 must be 0 */
		prctl(PR_SET_THP_DISABLE, 1, 0, 0, 0);
		for (i = 0; i < NR_PUD ; i++) {
			/* map one PUD-sized region; the address is only a hint */
			addr = mmap(addr + PUD_SIZE, PUD_SIZE, PROT_WRITE|PROT_READ,
					MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
			if (addr == MAP_FAILED) {
				perror("mmap");
				break;
			}
			/* touch the region so that page tables get allocated */
			*addr = 'x';
			/* unmap the touched chunk, then map it again in place */
			munmap(addr, PMD_SIZE);
			addr = mmap(addr, PMD_SIZE, PROT_WRITE|PROT_READ,
					MAP_ANONYMOUS|MAP_PRIVATE|MAP_FIXED, -1, 0);
			if (addr == MAP_FAILED)
				perror("re-mmap"), exit(1);
		}
		/* one 4096-byte PMD table per iteration, reported in KiB */
		printf("PID %d consumed %lu KiB in PMD page tables\n",
				getpid(), i * 4096 >> 10);
		return pause();
	}
The patch addresses the issue by accounting PMD tables to the process the
same way we account PTE tables.

The main places where PMD tables are accounted are __pmd_alloc() and
free_pmd_range(). But there are a few corner cases:

 - HugeTLB can share PMD page tables. The patch handles this by accounting
   the table to all processes that share it.

 - x86 PAE pre-allocates a few PMD tables on fork.

 - Architectures with FIRST_USER_ADDRESS > 0. We need to adjust the sanity
   check on exit(2).

Accounting only happens on configurations where the PMD page table level
is present (PMD is not folded). As with nr_ptes, we use a per-mm counter.
The counter value is used to calculate the baseline for the badness score
by the oom-killer.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@...ux.intel.com>
Reported-by: Dave Hansen <dave.hansen@...ux.intel.com>
Cc: Hugh Dickins <hughd@...gle.com>
Reviewed-by: Cyrill Gorcunov <gorcunov@...nvz.org>
Cc: Pavel Emelyanov <xemul@...nvz.org>
Cc: David Rientjes <rientjes@...gle.com>
Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
+-----------------------------------+------------+------------+---------------+
| | fe888c1f62 | f7a7b53a90 | next-20150123 |
+-----------------------------------+------------+------------+---------------+
| boot_successes | 1364 | 142 | 25 |
| boot_failures | 5 | 227 | 19 |
| BUG:kernel_test_crashed | 5 | | |
| WARNING:at_mm/mmap.c:#exit_mmap() | 0 | 227 | 19 |
| backtrace:do_execve | 0 | 227 | 19 |
| backtrace:SyS_execve | 0 | 227 | 19 |
| backtrace:do_group_exit | 0 | 227 | 19 |
| backtrace:SyS_exit_group | 0 | 227 | 19 |
| backtrace:do_execveat_common | 0 | 3 | |
| backtrace:do_exit | 0 | 5 | |
+-----------------------------------+------------+------------+---------------+
[ 17.687075] Freeing unused kernel memory: 1716K (c190d000 - c1aba000)
[ 17.808897] random: init urandom read with 5 bits of entropy available
[ 17.828360] ------------[ cut here ]------------
[ 17.828989] WARNING: CPU: 1 PID: 681 at mm/mmap.c:2858 exit_mmap+0x197/0x1ad()
[ 17.830086] Modules linked in:
[ 17.830549] CPU: 1 PID: 681 Comm: init Not tainted 3.19.0-rc5-gf7a7b53 #19
[ 17.831339] 00000001 00000000 00000001 d388bd4c c14341a1 00000000 00000001 c16ebf08
[ 17.832421] d388bd68 c1056987 00000b2a c1150db8 00000001 00000001 00000000 d388bd78
[ 17.833488] c1056a11 00000009 00000000 d388bdd0 c1150db8 d3858380 ffffffff ffffffff
[ 17.841323] Call Trace:
[ 17.844215] [<c14341a1>] dump_stack+0x78/0xa8
[ 17.844700] [<c1056987>] warn_slowpath_common+0xb7/0xce
[ 17.847797] [<c1150db8>] ? exit_mmap+0x197/0x1ad
[ 17.850955] [<c1056a11>] warn_slowpath_null+0x14/0x18
[ 17.854131] [<c1150db8>] exit_mmap+0x197/0x1ad
[ 17.854629] [<c10537ff>] mmput+0x52/0xef
[ 17.857584] [<c1175602>] flush_old_exec+0x923/0x99d
[ 17.860806] [<c11aea1e>] load_elf_binary+0x430/0x11af
[ 17.861378] [<c108559f>] ? local_clock+0x2f/0x39
[ 17.865327] [<c109817f>] ? lock_release_holdtime+0x60/0x6d
[ 17.866002] [<c1174159>] search_binary_handler+0x9c/0x20f
[ 17.866588] [<c11ac7e5>] load_script+0x339/0x355
[ 17.874149] [<c108550c>] ? sched_clock_cpu+0x188/0x1a3
[ 17.874718] [<c108559f>] ? local_clock+0x2f/0x39
[ 17.878580] [<c109817f>] ? lock_release_holdtime+0x60/0x6d
[ 17.879355] [<c109c1bf>] ? do_raw_read_unlock+0x28/0x53
[ 17.879997] [<c1174159>] search_binary_handler+0x9c/0x20f
[ 17.887644] [<c1176054>] do_execveat_common+0x6d6/0x954
[ 17.890904] [<c11762eb>] do_execve+0x19/0x1b
[ 17.891389] [<c1176586>] SyS_execve+0x21/0x25
[ 17.895168] [<c143be92>] syscall_call+0x7/0x7
[ 17.895653] ---[ end trace 6a7094e9a1d04ce0 ]---
[ 17.909585] ------------[ cut here ]------------
git bisect start de3d2c5b941c632685ab58613f981bf14a42676f ec6f34e5b552fb0a52e6aae1a5afbbb1605cc6cc --
git bisect good 505c8f8b41aaae2239941fc1c25bc8d4aa9188a6 # 08:42 369+ 1 Merge remote-tracking branch 'kbuild/for-next'
git bisect good 5cdfab738b22d402bc764e9f5f93824ff5f3800f # 08:46 369+ 0 Merge remote-tracking branch 'audit/next'
git bisect good 551aa38a4d27c7e71791ded0ee4a746abe954f9b # 08:53 369+ 0 Merge remote-tracking branch 'usb-gadget/next'
git bisect good bf26a22140410ca8fee8de8d74d9b69eeac450d1 # 08:58 369+ 3 Merge remote-tracking branch 'pwm/for-next'
git bisect good 522698e0cdb31f34ef897d463ddbe4d289a83b16 # 09:05 369+ 1 Merge remote-tracking branch 'y2038/y2038'
git bisect good 879b01ab025b80f0350b3181f2eb86f1a3deadc2 # 09:10 369+ 0 Merge remote-tracking branch 'livepatching/for-next'
git bisect bad d347062b744695e0490a53c199fac1a184870d29 # 09:10 0- 156 Merge branch 'akpm-current/current'
git bisect bad f7a7b53a90f7a489c4e435d1300db121f6b42776 # 09:34 0- 5 mm: account pmd page tables to the process
git bisect good 905d130bf8d5622c4dfa1667414993bb214d3a1e # 10:50 369+ 1 x86: drop _PAGE_FILE and pte_file()-related helpers
git bisect good daba3b6a1f18fc36eb6fe15eca008c3e658a8f72 # 11:39 369+ 1 mm: numa: add paranoid check around pte_protnone_numa
git bisect good 077ccc6a5a442a0460aba99085a6b84578a01faf # 12:21 369+ 2 memcg: add BUILD_BUG_ON() for string tables
git bisect good 76c365c2fe9bc89844dee698b7d3382faa9afc75 # 12:31 369+ 1 oom, PM: make OOM detection in the freezer path raceless
git bisect good 10c7667f091d0ab62b13d31f33bef469dc6683b4 # 13:27 369+ 2 fs: shrinker: always scan at least one object of each type
git bisect good 8aac135aaf196fd1a0b8f9c08d3514b64cefc4b3 # 13:47 369+ 1 mm: make FIRST_USER_ADDRESS unsigned long on all archs
git bisect good fe888c1f6277ea1b0d18dda12fff1dac4617905a # 14:05 369+ 1 arm: define __PAGETABLE_PMD_FOLDED for !LPAE
# first bad commit: [f7a7b53a90f7a489c4e435d1300db121f6b42776] mm: account pmd page tables to the process
git bisect good fe888c1f6277ea1b0d18dda12fff1dac4617905a # 14:26 1000+ 5 arm: define __PAGETABLE_PMD_FOLDED for !LPAE
# extra tests with DEBUG_INFO
git bisect good f7a7b53a90f7a489c4e435d1300db121f6b42776 # 14:46 1000+ 0 mm: account pmd page tables to the process
# extra tests on HEAD of next/master
git bisect bad de3d2c5b941c632685ab58613f981bf14a42676f # 14:46 0- 19 Add linux-next specific files for 20150123
# extra tests on tree/branch next/master
git bisect bad de3d2c5b941c632685ab58613f981bf14a42676f # 14:46 0- 19 Add linux-next specific files for 20150123
# extra tests on tree/branch linus/master
git bisect good c4e00f1d31c4c83d15162782491689229bd92527 # 16:42 1000+ 3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs
# extra tests on tree/branch next/master
git bisect bad de3d2c5b941c632685ab58613f981bf14a42676f # 16:43 0- 19 Add linux-next specific files for 20150123
This script may reproduce the error.
----------------------------------------------------------------------------
#!/bin/bash

# usage: pass the path to the kernel image as the first argument
kernel=$1
initrd=quantal-core-i386.cgz

wget --no-clobber "https://github.com/fengguang/reproduce-kernel-bug/raw/master/initrd/$initrd"

kvm=(
	qemu-system-x86_64
	-cpu kvm64
	-enable-kvm
	-kernel "$kernel"
	-initrd "$initrd"
	-m 320
	-smp 2
	-net nic,vlan=1,model=e1000
	-net user,vlan=1
	-boot order=nc
	-no-reboot
	-watchdog i6300esb
	-rtc base=localtime
	-serial stdio
	-display none
	-monitor null
)

append=(
	hung_task_panic=1
	earlyprintk=ttyS0,115200
	debug
	apic=debug
	sysrq_always_enabled
	rcupdate.rcu_cpu_stall_timeout=100
	panic=-1
	softlockup_panic=1
	nmi_watchdog=panic
	oops=panic
	load_ramdisk=2
	prompt_ramdisk=0
	console=ttyS0,115200
	console=tty0
	vga=normal
	root=/dev/ram0
	rw
	drbd.minor_count=8
)

"${kvm[@]}" --append "${append[*]}"
----------------------------------------------------------------------------
Thanks,
Fengguang
View attachment "dmesg-quantal-client9-17:20150124102932:i386-randconfig-x1-01141042:3.19.0-rc5-gf7a7b53:19" of type "text/plain" (399277 bytes)
View attachment "config-3.19.0-rc5-gf7a7b53" of type "text/plain" (74918 bytes)
_______________________________________________
LKP mailing list
LKP@...ux.intel.com