Date: Fri, 09 Nov 2012 18:23:07 +0800
From: Sha Zhengju <handai.szj@...il.com>
To: Michal Hocko <mhocko@...e.cz>
CC: Sha Zhengju <handai.szj@...il.com>, linux-mm@...ck.org,
    cgroups@...r.kernel.org, kamezawa.hiroyu@...fujitsu.com,
    akpm@...ux-foundation.org, rientjes@...gle.com,
    linux-kernel@...r.kernel.org
Subject: Re: [PATCH V3] memcg, oom: provide more precise dump info while
    memcg oom happening

On 11/09/2012 12:25 AM, Michal Hocko wrote:
> On Thu 08-11-12 23:52:47, Sha Zhengju wrote:
> [...]
>> (2) After change
>> [  269.225628] mal invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=0
>> [  269.225633] mal cpuset=/ mems_allowed=0-1
>> [  269.225636] Pid: 4616, comm: mal Not tainted 3.6.0+ #25
>> [  269.225637] Call Trace:
>> [  269.225647]  [<ffffffff8111b9c4>] dump_header+0x84/0xd0
>> [  269.225650]  [<ffffffff8111c691>] oom_kill_process+0x331/0x350
>> [  269.225710]  .......(call trace)
>> [  269.225713]  [<ffffffff81517325>] page_fault+0x25/0x30
>> [  269.225716] Task in /1/2 killed as a result of limit of /1
>> [  269.225718] memory: usage 511732kB, limit 512000kB, failcnt 5071
>> [  269.225720] memory+swap: usage 563200kB, limit 563200kB, failcnt 57
>> [  269.225721] kmem: usage 0kB, limit 9007199254740991kB, failcnt 0
>> [  269.225722] Memory cgroup stats:cache:8KB rss:511724KB mapped_file:4KB swap:51468KB inactive_anon:265864KB active_anon:245832KB inactive_file:0KB active_file:0KB unevictable:0KB
>> [  269.225741] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
>> [  269.225757] [ 4554]     0  4554    16626      473      17       25             0 bash
>> [  269.225759] [ 4611]     0  4611   103328    90231     208    12260             0 mal
>> [  269.225762] [ 4616]     0  4616   103328    32799      88     7562             0 mal
>> [  269.225764] Memory cgroup out of memory: Kill process 4611 (mal) score 699 or sacrifice child
>> [  269.225766] Killed process 4611 (mal) total-vm:413312kB, anon-rss:360632kB, file-rss:292kB
>>
>> This version provides more pointed info for memcg in the "Memory cgroup
>> stats" section.
>
> Looks much better!
>
>> Change log:
>> v3 <--- v2:
>> 1. fix towards hierarchy
>> 2. undo the rework of dump_tasks
>> v2 <--- v1:
>> 1. some modifications towards hierarchy
>> 2. rework dump_tasks
>> 3. rebased on Michal's mm tree since-3.6
>>
>> Signed-off-by: Sha Zhengju <handai.szj@...bao.com>
>> ---
>>  mm/memcontrol.c |   41 +++++++++++++++++++++++++++++++----------
>>  mm/oom_kill.c   |    6 ++++--
>>  2 files changed, 35 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 0eab7d5..17317fa 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
> [...]
>> @@ -1501,8 +1509,8 @@ static void move_unlock_mem_cgroup(struct mem_cgroup *memcg,
>>  	spin_unlock_irqrestore(&memcg->move_lock, *flags);
>>  }
>>
>> +#define K(x) ((x) << (PAGE_SHIFT-10))
>>  /**
>> - * mem_cgroup_print_oom_info: Called from OOM with tasklist_lock held in read mode.
>
> No need to remove this, just fix it:
>  * mem_cgroup_print_oom_info: Print OOM information relevant to memory controller.
>
>>   * @memcg: The memory cgroup that went over limit
>>   * @p: Task that is going to be killed
>>   *
>> @@ -1520,8 +1528,10 @@ void mem_cgroup_print_oom_info(struct mem_cgroup *memcg, struct task_struct *p)
>>   */
>>  	static char memcg_name[PATH_MAX];
>>  	int ret;
>> +	struct mem_cgroup *mi;
>> +	unsigned int i;
>>
>> -	if (!memcg || !p)
>> +	if (!p)
>>  		return;
>>
>>  	rcu_read_lock();
>> @@ -1569,6 +1579,25 @@ done:
>>  		res_counter_read_u64(&memcg->kmem, RES_USAGE) >> 10,
>>  		res_counter_read_u64(&memcg->kmem, RES_LIMIT) >> 10,
>>  		res_counter_read_u64(&memcg->kmem, RES_FAILCNT));
>> +
>> +	printk(KERN_INFO "Memory cgroup stats:");
>
> "Memory cgroup hierarchy stats" is probably a better fit with the
> current implementation.
>
>> +	for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
>> +		long long val = 0;
>> +		if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
>> +			continue;
>> +		for_each_mem_cgroup_tree(mi, memcg)
>> +			val += mem_cgroup_read_stat(mi, i);
>> +		printk(KERN_CONT "%s:%lldKB ", mem_cgroup_stat_names[i], K(val));
>> +	}
>> +
>> +	for (i = 0; i < NR_LRU_LISTS; i++) {
>> +		unsigned long long val = 0;
>> +
>> +		for_each_mem_cgroup_tree(mi, memcg)
>> +			val += mem_cgroup_nr_lru_pages(mi, BIT(i));
>> +		printk(KERN_CONT "%s:%lluKB ", mem_cgroup_lru_names[i], K(val));
>> +	}
>> +	printk(KERN_CONT "\n");
>
> This is nice and simple. I am just thinking whether it is enough. Say
> that you have a deeper hierarchy and there is a safety limit at its
> root:
>
>         A (limit)
>        /|\
>       B C D
>           |\
>           E F
>
> and we trigger an OOM on A's limit. Now we know that something blew
> up, but not what it was. Wouldn't it be better to swap the for and
> for_each_mem_cgroup_tree loops? Then we would see the whole hierarchy
> and could potentially point at the group which doesn't behave:
>
> Memory cgroup stats for A/: ...
> Memory cgroup stats for A/B/: ...
> Memory cgroup stats for A/C/: ...
> Memory cgroup stats for A/D/: ...
> Memory cgroup stats for A/D/E/: ...
> Memory cgroup stats for A/D/F/: ...
>
> Would it still fit in with your use case?
> [...]

We haven't used such complicated hierarchies yet, but it sounds like a
good suggestion. :)

From our experience, hierarchy is a little complex to use, and the three
cgroups involved in a memcg OOM can all be different: the memcg of the
invoker, the memcg of the killed task, and the memcg that went over its
limit. Suppose a process in B triggers the OOM and a victim in root A is
selected to be killed; we may also want to know the memcg stats local to
cgroup A itself (excluding B, C and D). So besides the hierarchy info,
would it be acceptable to also print the local stats of the root node, as
I did in the V1 version (https://lkml.org/lkml/2012/7/30/179)?

Another thing I'm hesitant about is NUMA stats; the output is beginning
to get longer and longer...

Thanks,
Sha
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
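[For concreteness, a minimal sketch of the swapped-loop layout discussed
above: the hierarchy walk becomes the outer loop, so every group under the
OOMing root gets its own labelled stats line. Helper names are taken from
the patch in this thread; the function name, the cgroup_path() call, and
the error handling are illustrative assumptions, not a final
implementation.]

/*
 * Hypothetical per-group variant of the stats dump: walk the hierarchy
 * in the outer loop and print one line per memory cgroup instead of
 * summing every counter over the whole subtree.
 */
static void mem_cgroup_print_oom_stats(struct mem_cgroup *memcg)
{
	static char memcg_name[PATH_MAX];
	struct mem_cgroup *mi;
	unsigned int i;

	for_each_mem_cgroup_tree(mi, memcg) {
		/* Label the line with this group's path, e.g. "A/D/E". */
		rcu_read_lock();
		if (cgroup_path(mi->css.cgroup, memcg_name, PATH_MAX) < 0)
			memcg_name[0] = '\0';
		rcu_read_unlock();
		printk(KERN_INFO "Memory cgroup stats for %s: ", memcg_name);

		/* Local counters only: no summing across the subtree. */
		for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
			long long val;

			if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
				continue;
			val = mem_cgroup_read_stat(mi, i);
			printk(KERN_CONT "%s:%lldKB ",
			       mem_cgroup_stat_names[i], K(val));
		}

		for (i = 0; i < NR_LRU_LISTS; i++) {
			unsigned long long val =
				mem_cgroup_nr_lru_pages(mi, BIT(i));

			printk(KERN_CONT "%s:%lluKB ",
			       mem_cgroup_lru_names[i], K(val));
		}
		printk(KERN_CONT "\n");
	}
}

[This shape produces the "Memory cgroup stats for A/D/E/: ..." output
Michal sketches, and since the walk visits the root memcg first, its
opening line arguably also covers Sha's request for A's local stats.]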