lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20140204160509.GN6963@cmpxchg.org>
Date:	Tue, 4 Feb 2014 11:05:09 -0500
From:	Johannes Weiner <hannes@...xchg.org>
To:	Michal Hocko <mhocko@...e.cz>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	LKML <linux-kernel@...r.kernel.org>, linux-mm@...ck.org
Subject: Re: [PATCH -v2 2/6] memcg: cleanup charge routines

On Tue, Feb 04, 2014 at 02:28:56PM +0100, Michal Hocko wrote:
> The current core of memcg charging is wild to say the least.
> __mem_cgroup_try_charge which is in the center tries to be too clever
> and it handles two independent cases
> 	* when the memcg to be charged is known in advance
> 	* when the given mm_struct is charged
> The resulting callchains are quite complex:
> 
> memcg_charge_kmem(mm=NULL, memcg)  mem_cgroup_newpage_charge(mm)
>  |                                | _________________________________________ mem_cgroup_cache_charge(current->mm)
>  |                                |/                                            |
>  | ______________________________ mem_cgroup_charge_common(mm, memcg=NULL)      |
>  |/                                                                             /
>  |                                                                             /
>  | ____________________________ mem_cgroup_try_charge_swapin(mm, memcg=NULL)  /
>  |/                               | _________________________________________/
>  |                                |/
>  |                                |                         /* swap accounting */   /* no swap accounting */
>  | _____________________________  __mem_cgroup_try_charge_swapin(mm=NULL, memcg) || (mm, memcg=NULL)
>  |/
>  | ____________________________ mem_cgroup_do_precharge(mm=NULL, memcg)
>  |/
> __mem_cgroup_try_charge
>   mem_cgroup_do_charge
>     res_counter_charge
>     mem_cgroup_reclaim
>     mem_cgroup_wait_acct_move
>     mem_cgroup_oom
> 
> This patch splits __mem_cgroup_try_charge into two logical parts.
> mem_cgroup_try_charge_mm which is responsible for charges for the given
> mm_struct and it returns the charged memcg or NULL under OOM while
> mem_cgroup_try_charge_memcg charges a known memcg and returns an error
> code.
> 
> The only tricky part which remains is __mem_cgroup_try_charge_swapin
> because it can return 0 if PageCgroupUsed is already set and then we do
> not want to commit the charge. This is done with a magic combination of
> memcg = NULL and ret = 0. So the function preserves its memcgp parameter
> and sets the given memcg to NULL when it sees PageCgroupUsed
> (__mem_cgroup_commit_charge_swapin then ignores such a commit).
> 
> Not only the code is easier to follow the change reduces the code size
> too:
> size mm/built-in.o.before mm/built-in.o.after
>    text    data     bss     dec     hex filename
> 464718   83038   49904  597660   91e9c mm/built-in.o.before
> 463894   83038   49904  596836   91b64 mm/built-in.o.after
> 
> Signed-off-by: Michal Hocko <mhocko@...e.cz>
> ---
>  mm/memcontrol.c | 205 +++++++++++++++++++++++++++-----------------------------
>  1 file changed, 98 insertions(+), 107 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 042e4ff36c05..72fbe0fb3320 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2618,7 +2618,7 @@ static int memcg_cpu_hotplug_callback(struct notifier_block *nb,
>  }
>  
>  
> -/* See __mem_cgroup_try_charge() for details */
> +/* See mem_cgroup_do_charge() for details */
>  enum {
>  	CHARGE_OK,		/* success */
>  	CHARGE_RETRY,		/* need to retry but retry is not bad */
> @@ -2691,108 +2691,69 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	return CHARGE_NOMEM;
>  }
>  
> +static bool current_bypass_charge(void)
> +{
> +	/*
> +	 * Unlike gloval-vm's OOM-kill, we're not in memory shortage
> +	 * in system level. So, allow to go ahead dying process in addition to
> +	 * MEMDIE process.
> +	 */
> +	if (unlikely(test_thread_flag(TIF_MEMDIE)
> +		     || fatal_signal_pending(current)))
> +		return true;
> +
> +	return false;
> +}

I'd just leave it inline at this point, it lines up nicely with the
other pre-charge checks in try_charge, which is at this point short
enough to take this awkward 3-liner.

> +static int mem_cgroup_try_charge_memcg(gfp_t gfp_mask,
>  				   unsigned int nr_pages,
> -				   struct mem_cgroup **ptr,
> +				   struct mem_cgroup *memcg,
>  				   bool oom)
>  {
>  	unsigned int batch = max(CHARGE_BATCH, nr_pages);
>  	int nr_oom_retries = MEM_CGROUP_RECLAIM_RETRIES;
> -	struct mem_cgroup *memcg = NULL;
>  	int ret;
>  
> -	/*
> -	 * Unlike gloval-vm's OOM-kill, we're not in memory shortage
> -	 * in system level. So, allow to go ahead dying process in addition to
> -	 * MEMDIE process.
> -	 */
> -	if (unlikely(test_thread_flag(TIF_MEMDIE)
> -		     || fatal_signal_pending(current)))
> +	if (mem_cgroup_is_root(memcg) || current_bypass_charge())
>  		goto bypass;
>  
>  	if (unlikely(task_in_memcg_oom(current)))
>  		goto nomem;
>  
> +	if (consume_stock(memcg, nr_pages))
> +		return 0;
> +
>  	if (gfp_mask & __GFP_NOFAIL)
>  		oom = false;
>  
> -	/*
> -	 * We always charge the cgroup the mm_struct belongs to.
> -	 * The mm_struct's mem_cgroup changes on task migration if the
> -	 * thread group leader migrates. It's possible that mm is not
> -	 * set, if so charge the root memcg (happens for pagecache usage).
> -	 */
> -	if (!*ptr && !mm)
> -		*ptr = root_mem_cgroup;

[...]

>  /*
> + * Charges and returns memcg associated with the given mm (or root_mem_cgroup
> + * if mm is NULL). Returns NULL if memcg is under OOM.
> + */
> +static struct mem_cgroup *mem_cgroup_try_charge_mm(struct mm_struct *mm,
> +				   gfp_t gfp_mask,
> +				   unsigned int nr_pages,
> +				   bool oom)
> +{
> +	struct mem_cgroup *memcg;
> +	int ret;
> +
> +	/*
> +	 * We always charge the cgroup the mm_struct belongs to.
> +	 * The mm_struct's mem_cgroup changes on task migration if the
> +	 * thread group leader migrates. It's possible that mm is not
> +	 * set, if so charge the root memcg (happens for pagecache usage).
> +	 */
> +	if (!mm)
> +		goto bypass;

Why shuffle it around right before you remove it anyway?  Just start
the series off with the patches that delete stuff without having to
restructure anything, get those out of the way.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ