Message-id: <1444660139-30125-1-git-send-email-pintu.k@samsung.com>
Date: Mon, 12 Oct 2015 19:58:59 +0530
From: Pintu Kumar <pintu.k@...sung.com>
To: akpm@...ux-foundation.org, minchan@...nel.org, dave@...olabs.net,
pintu.k@...sung.com, mhocko@...e.cz, koct9i@...il.com,
rientjes@...gle.com, hannes@...xchg.org,
penguin-kernel@...ove.sakura.ne.jp, bywxiaobai@....com,
mgorman@...e.de, vbabka@...e.cz, js1304@...il.com,
kirill.shutemov@...ux.intel.com, alexander.h.duyck@...hat.com,
sasha.levin@...cle.com, cl@...ux.com, fengguang.wu@...el.com,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Cc: cpgs@...sung.com, pintu_agarwal@...oo.com, pintu.ping@...il.com,
vishnu.ps@...sung.com, rohit.kr@...sung.com,
c.rajkumar@...sung.com, sreenathd@...sung.com
Subject: [RESEND PATCH 1/1] mm: vmstat: Add OOM victims count in vmstat counter
This patch adds a counter of OOM kill victims to /proc/vmstat.
Currently, we depend on the kernel logs to find out that a kernel OOM
has occurred. But a kernel OOM can go unnoticed by the developer, as it
can silently kill some background applications/services.
On some small embedded systems, the OOM may be captured in the logs but
later overwritten, because the kernel log is a ring buffer.
Thus this interface lets the user quickly check whether any OOM kill
has happened in the past, i.e. whether the system has ever entered the
OOM kill stage so far.
It can be beneficial in the following cases:
1. The user can monitor kernel OOM kills without looking into the
kernel logs.
2. It can help in tuning the watermark level in the system.
3. It can help in tuning the low memory killer behavior in user space.
4. It can be helpful on a logless system or if klogd logging
(/var/log/messages) is disabled.
A snapshot of the result of a 3-day overnight test is shown below:
System: ARM Cortex A7, 1GB RAM, 8GB EMMC
Linux: 3.10.xx
Category: reference smart phone device
Loglevel: 7
Conditions: Fully loaded, BT/WiFi/GPS ON
Tests: auto launching of ~30+ apps using test scripts, in a loop for
3 days.
At the end of the tests, check:
$ cat /proc/vmstat
nr_oom_victims 6
As shown, there were 6 OOM kill victims.
An OOM is bad for any system, so this counter can help in quickly
tuning the OOM behavior of the system without depending on the logs.
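For monitoring from user space, the check above could also be done
programmatically. The following is only a minimal sketch, assuming the
usual one "name value" pair per line layout of /proc/vmstat. The helper
names vmstat_lookup() and vmstat_read() are hypothetical and not part of
this patch, and the nr_oom_victims line is only present on a kernel with
this patch applied.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: scan a stream laid out
 * like /proc/vmstat (one "name value" pair per line) for one counter.
 * Returns 0 and stores the value in *out if the counter is found,
 * -1 if no line with that name exists.
 */
int vmstat_lookup(FILE *fp, const char *name, unsigned long long *out)
{
	char line[128];
	char key[64];
	unsigned long long val;

	assert(fp != NULL && name != NULL && out != NULL);

	while (fgets(line, sizeof(line), fp)) {
		/* Skip lines that do not parse as "name value". */
		if (sscanf(line, "%63s %llu", key, &val) != 2)
			continue;
		if (strcmp(key, name) == 0) {
			*out = val;
			return 0;
		}
	}
	return -1;
}

/*
 * Hypothetical convenience wrapper: open the file at path, look up
 * one counter, and close the file again. Returns -1 if the file
 * cannot be opened or the counter is missing.
 */
int vmstat_read(const char *path, const char *name, unsigned long long *out)
{
	FILE *fp = fopen(path, "r");
	int ret;

	if (!fp)
		return -1;
	ret = vmstat_lookup(fp, name, out);
	fclose(fp);
	return ret;
}
```

A monitoring tool could call vmstat_read("/proc/vmstat",
"nr_oom_victims", &n) periodically and compare successive values. A
return value of -1 would indicate that the counter is not exported by
the running kernel.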
Signed-off-by: Pintu Kumar <pintu.k@...sung.com>
---
V2: Removed oom_stall, Suggested By: Michal Hocko <mhocko@...nel.org>
Renamed oom_kill_count to nr_oom_victims,
Suggested By: Michal Hocko <mhocko@...nel.org>
Suggested By: Anshuman Khandual <khandual@...ux.vnet.ibm.com>
include/linux/vm_event_item.h | 1 +
mm/oom_kill.c | 2 ++
mm/page_alloc.c | 1 -
mm/vmstat.c | 1 +
4 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index 2b1cef8..dd2600d 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -57,6 +57,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
#ifdef CONFIG_HUGETLB_PAGE
HTLB_BUDDY_PGALLOC, HTLB_BUDDY_PGALLOC_FAIL,
#endif
+ NR_OOM_VICTIMS,
UNEVICTABLE_PGCULLED, /* culled to noreclaim list */
UNEVICTABLE_PGSCANNED, /* scanned for reclaimability */
UNEVICTABLE_PGRESCUED, /* rescued from noreclaim list */
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 03b612b..802b8a1 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -570,6 +570,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
* space under its control.
*/
do_send_sig_info(SIGKILL, SEND_SIG_FORCED, victim, true);
+ count_vm_event(NR_OOM_VICTIMS);
mark_oom_victim(victim);
pr_err("Killed process %d (%s) total-vm:%lukB, anon-rss:%lukB, file-rss:%lukB\n",
task_pid_nr(victim), victim->comm, K(victim->mm->total_vm),
@@ -600,6 +601,7 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p,
task_pid_nr(p), p->comm);
task_unlock(p);
do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true);
+ count_vm_event(NR_OOM_VICTIMS);
}
rcu_read_unlock();
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 9bcfd70..fafb09d 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2761,7 +2761,6 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
schedule_timeout_uninterruptible(1);
return NULL;
}
-
/*
* Go through the zonelist yet one more time, keep very high watermark
* here, this is only to catch a parallel oom killing, we must fail if
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 1fd0886..8503a2e 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -808,6 +808,7 @@ const char * const vmstat_text[] = {
"htlb_buddy_alloc_success",
"htlb_buddy_alloc_fail",
#endif
+ "nr_oom_victims",
"unevictable_pgs_culled",
"unevictable_pgs_scanned",
"unevictable_pgs_rescued",
--
1.7.9.5