lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [day] [month] [year] [list]
Message-ID: <20120730134914.GE12680@tiehlicka.suse.cz>
Date:	Mon, 30 Jul 2012 15:49:14 +0200
From:	Michal Hocko <mhocko@...e.cz>
To:	akpm@...ux-foundation.org
Cc:	mm-commits@...r.kernel.org, handai.szj@...bao.com,
	gthelen@...gle.com, hannes@...xchg.org,
	kamezawa.hiroyu@...fujitsu.com, rientjes@...gle.com,
	LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org
Subject: Re: + memcg-oom-provide-more-info-while-memcg-oom-happening.patch
 added to -mm tree

Hi and sorry for the late reply.

On Tue 24-07-12 13:33:03, Andrew Morton wrote:
> 
> The patch titled
>      Subject: memcg, oom: provide more info while memcg oom happening
> has been added to the -mm tree.  Its filename is
>      memcg-oom-provide-more-info-while-memcg-oom-happening.patch
> 
> Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
> 
> *** Remember to use Documentation/SubmitChecklist when testing your code ***
> 
> The -mm tree is included into linux-next and is updated
> there every 3-4 working days
> 
> ------------------------------------------------------
> From: Sha Zhengju <handai.szj@...bao.com>
> Subject: memcg, oom: provide more info while memcg oom happening
> 
> When an memcg oom is happening the current memcg related dump information
> is limited for debugging.  Provide more detailed memcg page statistics
> together with the total one while hierarchy is enabled.

I do agree that we need to print something more useful for memcg-oom
(who cares about the global state when the problem is per-memcg) but
this doesn't seem to be the best way to address it because it just adds
more to the OOM output and it doesn't prevent the global state being
printed out.
Please note that memcg oom can happen much more often than the global
oom so we should print only the important information.
Some more questions/notes below.

In short, I think the patch should be dropped.

> Signed-off-by: Sha Zhengju <handai.szj@...bao.com>
> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
> Cc: Michal Hocko <mhocko@...e.cz>
> Cc: Greg Thelen <gthelen@...gle.com>
> Cc: Johannes Weiner <hannes@...xchg.org>
> Cc: David Rientjes <rientjes@...gle.com>
> Signed-off-by: Andrew Morton <akpm@...ux-foundation.org>
> ---
> 
>  mm/memcontrol.c |   71 ++++++++++++++++++++++++++++++++++++++++------
>  1 file changed, 63 insertions(+), 8 deletions(-)
> 
> diff -puN mm/memcontrol.c~memcg-oom-provide-more-info-while-memcg-oom-happening mm/memcontrol.c
> --- a/mm/memcontrol.c~memcg-oom-provide-more-info-while-memcg-oom-happening
> +++ a/mm/memcontrol.c
> @@ -113,6 +113,14 @@ static const char * const mem_cgroup_eve
>  	"pgmajfault",
>  };
>  
> +static const char * const mem_cgroup_lru_names[] = {
> +	"inactive_anon",
> +	"active_anon",
> +	"inactive_file",
> +	"active_file",
> +	"unevictable",
> +};
> +
>  /*
>   * Per memcg event counter is incremented at every pagein/pageout. With THP,
>   * it will be incremated by the number of pages. This counter is used for
> @@ -1372,6 +1380,59 @@ static void move_unlock_mem_cgroup(struc
>  	spin_unlock_irqrestore(&memcg->move_lock, *flags);
>  }
>  
> +#define K(x) ((x) << (PAGE_SHIFT-10))
> +static void mem_cgroup_print_oom_stat(struct mem_cgroup *memcg)
> +{
> +	int i;
> +	struct mem_cgroup *mi;

We tend to call this iter in the context you are using it. The
declaration can be moved to the appropriate for loop as well.

> +
> +	printk(KERN_INFO "Memory cgroup stat:\n");
> +	for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
> +		if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
> +			continue;
> +		printk(KERN_CONT "%s:%ldKB ", mem_cgroup_stat_names[i],
> +			   K(mem_cgroup_read_stat(memcg, i)));
> +	}
> +
> +	for (i = 0; i < MEM_CGROUP_EVENTS_NSTATS; i++)
> +		printk(KERN_CONT "%s:%lu ", mem_cgroup_events_names[i],
> +			   mem_cgroup_read_events(memcg, i));
> +
> +	for (i = 0; i < NR_LRU_LISTS; i++)
> +		printk(KERN_CONT "%s:%luKB ", mem_cgroup_lru_names[i],
> +			   K(mem_cgroup_nr_lru_pages(memcg, BIT(i))));

Does it really make sense to print this separately? Which additional
information does this give to you?
OOM acts on a hierarchy in general (with a single tree node if
use_hierarchy==0) so showing the root statistics doesn't sound like very
much useful to me.

> +
> +	/* Dump the total statistics if hierarchy is enabled. */
> +	for (i = 0; i < MEM_CGROUP_STAT_NSTATS; i++) {
> +		long long val = 0;
> +
> +		if (i == MEM_CGROUP_STAT_SWAP && !do_swap_account)
> +			continue;
> +		for_each_mem_cgroup_tree(mi, memcg)
> +			val += mem_cgroup_read_stat(mi, i);
> +		printk(KERN_CONT "total_%s:%lldKB ", mem_cgroup_stat_names[i], K(val));
> +	}
> +
> +	for (i = 0; i < MEM_CGROUP_EVENTS_NSTATS; i++) {
> +		unsigned long long val = 0;
> +
> +		for_each_mem_cgroup_tree(mi, memcg)
> +			val += mem_cgroup_read_events(mi, i);
> +		printk(KERN_CONT "total_%s:%llu ", mem_cgroup_events_names[i], val);
> +	}
> +
> +	for (i = 0; i < NR_LRU_LISTS; i++) {
> +		unsigned long long val = 0;
> +
> +		for_each_mem_cgroup_tree(mi, memcg)
> +			val += mem_cgroup_nr_lru_pages(mi, BIT(i));
> +		printk(KERN_CONT "total_%s:%lluKB ", mem_cgroup_lru_names[i], K(val));
> +	}
> +
> +	printk(KERN_CONT "\n");
> +
> +}
> +
>  /**
>   * mem_cgroup_print_oom_info: Called from OOM with tasklist_lock held in read mode.
>   * @memcg: The memory cgroup that went over limit
> @@ -1436,6 +1497,8 @@ done:
>  		res_counter_read_u64(&memcg->memsw, RES_USAGE) >> 10,
>  		res_counter_read_u64(&memcg->memsw, RES_LIMIT) >> 10,
>  		res_counter_read_u64(&memcg->memsw, RES_FAILCNT));
> +
> +	mem_cgroup_print_oom_stat(memcg);
>  }
>  
>  /*
> @@ -4129,14 +4192,6 @@ static int memcg_numa_stat_show(struct c
>  }
>  #endif /* CONFIG_NUMA */
>  
> -static const char * const mem_cgroup_lru_names[] = {
> -	"inactive_anon",
> -	"active_anon",
> -	"inactive_file",
> -	"active_file",
> -	"unevictable",
> -};
> -
>  static inline void mem_cgroup_lru_names_not_uptodate(void)
>  {
>  	BUILD_BUG_ON(ARRAY_SIZE(mem_cgroup_lru_names) != NR_LRU_LISTS);
> _
> Subject: Subject: memcg, oom: provide more info while memcg oom happening
> 
> Patches currently in -mm which might be from handai.szj@...bao.com are
> 
> mm-oom-introduce-helper-function-to-process-threads-during-scan.patch
> mm-memcg-introduce-own-oom-handler-to-iterate-only-over-its-own-threads.patch
> memcg-oom-provide-more-info-while-memcg-oom-happening.patch
> memcg-oom-clarify-some-oom-dump-messages.patch
> 

-- 
Michal Hocko
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ