linux-kernel - Re: [PATCH 1/2] memcg: coalescing uncharge at unmap and truncation (fixed coimpile bug)

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite for Android: free password hash cracker in your pocket

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-Id: <20091008151710.1216a615.akpm@linux-foundation.org>
Date:	Thu, 8 Oct 2009 15:17:10 -0700
From:	Andrew Morton <akpm@...ux-foundation.org>
To:	KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
Cc:	Hiroshi Shimamoto <h-shimamoto@...jp.nec.com>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	"balbir@...ux.vnet.ibm.com" <balbir@...ux.vnet.ibm.com>,
	"nishimura@....nes.nec.co.jp" <nis@...205.gate.nec.co.jp>
Subject: Re: [PATCH 1/2] memcg: coalescing uncharge at unmap and truncation
 (fixed coimpile bug)

On Fri, 2 Oct 2009 16:02:13 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com> wrote:

>
> ...
>
> In massive parallel enviroment, res_counter can be a performance bottleneck.
> One strong techinque to reduce lock contention is reducing calls by
> coalescing some amount of calls into one.
> 
> Considering charge/uncharge chatacteristic,
> 	- charge is done one by one via demand-paging.
> 	- uncharge is done by
> 		- in chunk at munmap, truncate, exit, execve...
> 		- one by one via vmscan/paging.
> 
> It seems we have a chance in uncharge at unmap/truncation.
> 
> This patch is a for coalescing uncharge. For avoiding scattering memcg's
> structure to functions under /mm, this patch adds memcg batch uncharge
> information to the task. 
> 
> The degree of coalescing depends on callers
>   - at invalidate/trucate... pagevec size
>   - at unmap ....ZAP_BLOCK_SIZE
> (memory itself will be freed in this degree.)
> Then, we'll not coalescing too much.
> 
>
> ...
>
> +static void
> +__do_uncharge(struct mem_cgroup *mem, const enum charge_type ctype)
> +{
> +	struct memcg_batch_info *batch = NULL;
> +	bool uncharge_memsw = true;
> +	/* If swapout, usage of swap doesn't decrease */
> +	if (!do_swap_account || ctype == MEM_CGROUP_CHARGE_TYPE_SWAPOUT)
> +		uncharge_memsw = false;
> +	/*
> +	 * do_batch > 0 when unmapping pages or inode invalidate/truncate.
> +	 * In those cases, all pages freed continously can be expected to be in
> +	 * the same cgroup and we have chance to coalesce uncharges.
> +	 * And, we do uncharge one by one if this is killed by OOM.
> +	 */
> +	if (!current->memcg_batch.do_batch || test_thread_flag(TIF_MEMDIE))
> +		goto direct_uncharge;
> +
> +	batch = &current->memcg_batch;
> +	/*
> +	 * In usual, we do css_get() when we remember memcg pointer.
> +	 * But in this case, we keep res->usage until end of a series of
> +	 * uncharges. Then, it's ok to ignore memcg's refcnt.
> +	 */
> +	if (!batch->memcg)
> +		batch->memcg = mem;
> +	/*
> +	 * In typical case, batch->memcg == mem. This means we can
> +	 * merge a series of uncharges to an uncharge of res_counter.
> +	 * If not, we uncharge res_counter ony by one.
> +	 */
> +	if (batch->memcg != mem)
> +		goto direct_uncharge;
> +	/* remember freed charge and uncharge it later */
> +	batch->pages += PAGE_SIZE;

->pages is really confusingly named.  It doesn't count pages, it counts
bytes!

We could call it `bytes', but perhaps charge_bytes would be more
communicative?

> +/*
> + * batch_start/batch_end is called in unmap_page_range/invlidate/trucate.
> + * In that cases, pages are freed continuously and we can expect pages
> + * are in the same memcg. All these calls itself limits the number of
> + * pages freed at once, then uncharge_start/end() is called properly.
> + */
> +
> +void mem_cgroup_uncharge_start(void)
> +{
> +	if (!current->memcg_batch.do_batch) {
> +		current->memcg_batch.memcg = NULL;
> +		current->memcg_batch.pages = 0;
> +		current->memcg_batch.memsw = 0;

what's memsw?

> +	}
> +	current->memcg_batch.do_batch++;
> +}
> +
>
> ...
>
> +#ifdef CONFIG_CGROUP_MEM_RES_CTLR /* memcg uses this to do batch job */
> +	struct memcg_batch_info {
> +		int do_batch;
> +		struct mem_cgroup *memcg;
> +		long pages, memsw;
> +	} memcg_batch;
> +#endif

I find the most valuable documetnation is that which is devoted to the
data structures.  This one didn't get any :(

Negative values of `pages' and `memsw' are meaningless, so it would be
better to give them an unsigned type.  That matches the
res_counter_charge() expectations also.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/