Message-ID: <20151210142147.GP19496@dhcp22.suse.cz>
Date:	Thu, 10 Dec 2015 15:21:47 +0100
From:	Michal Hocko <mhocko@...nel.org>
To:	Johannes Weiner <hannes@...xchg.org>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	Vladimir Davydov <vdavydov@...tuozzo.com>, linux-mm@...ck.org,
	cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
	kernel-team@...com
Subject: Re: [PATCH 7/8] mm: memcontrol: account "kmem" consumers in cgroup2
 memory controller

On Tue 08-12-15 13:34:24, Johannes Weiner wrote:
> The original cgroup memory controller has an extension to account slab
> memory (and other "kernel memory" consumers) in a separate "kmem"
> counter, once the user sets an explicit limit on that "kmem" pool.
> 
> However, this includes various consumers whose sizes are directly
> linked to userspace activity. Accounting them as an optional "kmem"
> extension is problematic for several reasons:
> 
> 1. It leaves the main memory interface with incomplete semantics. A
>    user who puts their workload into a cgroup and configures a memory
>    limit does not expect us to leave holes in the containment as big
>    as the dentry and inode cache, or the kernel stack pages.
> 
> 2. If the limit set on this random historical subgroup of consumers is
>    reached, subsequent allocations will fail even when the main memory
>    pool available to the cgroup is not yet exhausted and/or has
>    reclaimable memory in it.
> 
> 3. Calling it 'kernel memory' is misleading. The dentry and inode
>    caches are no more 'kernel' (or no less 'user') memory than the
>    page cache itself. Treating these consumers as different classes is
>    a historical implementation detail that should not leak to users.
> 
> So, in addition to page cache, anonymous memory, and network socket
> memory, account the following memory consumers per default in the
> cgroup2 memory controller:
> 
>      - threadinfo
>      - task_struct
>      - task_delay_info
>      - pid
>      - cred
>      - mm_struct
>      - vm_area_struct and vm_region (nommu)
>      - anon_vma and anon_vma_chain
>      - signal_struct
>      - sighand_struct
>      - fs_struct
>      - files_struct
>      - fdtable and fdtable->full_fds_bits
>      - dentry and external_name
>      - inode for all filesystems.
> 
> This should give us reasonable memory isolation for most common
> workloads out of the box.
> 
> Signed-off-by: Johannes Weiner <hannes@...xchg.org>

Acked-by: Michal Hocko <mhocko@...e.com>

> ---
>  mm/memcontrol.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ab72c47..d048137 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2356,13 +2356,14 @@ int __memcg_kmem_charge_memcg(struct page *page, gfp_t gfp, int order,
>  	if (!memcg_kmem_online(memcg))
>  		return 0;
>  
> -	if (!page_counter_try_charge(&memcg->kmem, nr_pages, &counter))
> -		return -ENOMEM;
> -
>  	ret = try_charge(memcg, gfp, nr_pages);
> -	if (ret) {
> -		page_counter_uncharge(&memcg->kmem, nr_pages);
> +	if (ret)
>  		return ret;
> +
> +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys) &&
> +	    !page_counter_try_charge(&memcg->kmem, nr_pages, &counter)) {
> +		cancel_charge(memcg, nr_pages);
> +		return -ENOMEM;
>  	}
>  
>  	page->mem_cgroup = memcg;
> @@ -2391,7 +2392,9 @@ void __memcg_kmem_uncharge(struct page *page, int order)
>  
>  	VM_BUG_ON_PAGE(mem_cgroup_is_root(memcg), page);
>  
> -	page_counter_uncharge(&memcg->kmem, nr_pages);
> +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
> +		page_counter_uncharge(&memcg->kmem, nr_pages);
> +
>  	page_counter_uncharge(&memcg->memory, nr_pages);
>  	if (do_memsw_account())
>  		page_counter_uncharge(&memcg->memsw, nr_pages);
> @@ -2895,7 +2898,8 @@ static int memcg_propagate_kmem(struct mem_cgroup *memcg)
>  	 * onlined after this point, because it has at least one child
>  	 * already.
>  	 */
> -	if (memcg_kmem_online(parent))
> +	if (cgroup_subsys_on_dfl(memory_cgrp_subsys) ||
> +	    memcg_kmem_online(parent))
>  		ret = memcg_online_kmem(memcg);
>  	mutex_unlock(&memcg_limit_mutex);
>  	return ret;
> -- 
> 2.6.3
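
For readers following the reordering in __memcg_kmem_charge_memcg(), the
new charge path can be sketched in user space roughly as below. This is a
toy model, not kernel code: struct counter, struct memcg_model and the
model_* helpers are hypothetical stand-ins for page_counter, mem_cgroup,
try_charge() and cancel_charge(), and the real try_charge() also attempts
reclaim before failing, which is left out here. The sketch returns -1
where the kernel returns -ENOMEM.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page_counter: usage and limit, in pages. */
struct counter {
	unsigned long usage;
	unsigned long limit;
};

/* Hypothetical model of a memcg holding only the two counters discussed. */
struct memcg_model {
	struct counter memory;	/* main memory counter */
	struct counter kmem;	/* legacy "kmem" counter */
	bool on_dfl;		/* true: cgroup2 (default hierarchy) */
};

static bool counter_try_charge(struct counter *c, unsigned long nr)
{
	if (c->usage + nr > c->limit)
		return false;
	c->usage += nr;
	return true;
}

static void counter_uncharge(struct counter *c, unsigned long nr)
{
	assert(c->usage >= nr);	/* uncharging more than charged is a bug */
	c->usage -= nr;
}

/*
 * Charge order after the patch: the main counter first, then the
 * separate kmem counter only on the legacy hierarchy. If the kmem
 * charge fails, the main charge is rolled back.
 */
static int model_kmem_charge(struct memcg_model *m, unsigned long nr)
{
	if (!counter_try_charge(&m->memory, nr))
		return -1;

	if (!m->on_dfl && !counter_try_charge(&m->kmem, nr)) {
		counter_uncharge(&m->memory, nr);	/* models cancel_charge() */
		return -1;
	}
	return 0;
}

/* Uncharge mirrors the charge: kmem is only touched on the legacy hierarchy. */
static void model_kmem_uncharge(struct memcg_model *m, unsigned long nr)
{
	if (!m->on_dfl)
		counter_uncharge(&m->kmem, nr);
	counter_uncharge(&m->memory, nr);
}
```

In this model, a legacy group with a small kmem limit refuses a charge even
when the main counter has room (point 2 of the changelog), while the same
charge on the default hierarchy is bounded only by the main counter.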

-- 
Michal Hocko
SUSE Labs