linux-kernel - Re: [PATCH 13/14] mm: memcontrol: account socket memory in unified hierarchy memory controller

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20151120192506.GD5623@cmpxchg.org>
Date:	Fri, 20 Nov 2015 14:25:06 -0500
From:	Johannes Weiner <hannes@...xchg.org>
To:	Vladimir Davydov <vdavydov@...tuozzo.com>
Cc:	David Miller <davem@...emloft.net>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Tejun Heo <tj@...nel.org>, Michal Hocko <mhocko@...e.cz>,
	netdev@...r.kernel.org, linux-mm@...ck.org,
	cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
	kernel-team@...com
Subject: Re: [PATCH 13/14] mm: memcontrol: account socket memory in unified
 hierarchy memory controller

On Fri, Nov 20, 2015 at 04:10:33PM +0300, Vladimir Davydov wrote:
> On Thu, Nov 12, 2015 at 06:41:32PM -0500, Johannes Weiner wrote:
> ...
> > @@ -5514,16 +5550,43 @@ void sock_release_memcg(struct sock *sk)
> >   */
> >  bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
> >  {
> > +	unsigned int batch = max(CHARGE_BATCH, nr_pages);
> >  	struct page_counter *counter;
> > +	bool force = false;
> >  
> > -	if (page_counter_try_charge(&memcg->tcp_mem.memory_allocated,
> > -				    nr_pages, &counter)) {
> > -		memcg->tcp_mem.memory_pressure = 0;
> > +#ifdef CONFIG_MEMCG_KMEM
> > +	if (!cgroup_subsys_on_dfl(memory_cgrp_subsys)) {
> > +		if (page_counter_try_charge(&memcg->tcp_mem.memory_allocated,
> > +					    nr_pages, &counter)) {
> > +			memcg->tcp_mem.memory_pressure = 0;
> > +			return true;
> > +		}
> > +		page_counter_charge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > +		memcg->tcp_mem.memory_pressure = 1;
> > +		return false;
> > +	}
> > +#endif
> > +	if (consume_stock(memcg, nr_pages))
> >  		return true;
> > +retry:
> > +	if (page_counter_try_charge(&memcg->memory, batch, &counter))
> > +		goto done;
> > +
> > +	if (batch > nr_pages) {
> > +		batch = nr_pages;
> > +		goto retry;
> >  	}
> > -	page_counter_charge(&memcg->tcp_mem.memory_allocated, nr_pages);
> > -	memcg->tcp_mem.memory_pressure = 1;
> > -	return false;
> > +
> > +	page_counter_charge(&memcg->memory, batch);
> > +	force = true;
> > +done:
> 
> > +	css_get_many(&memcg->css, batch);
> 
> Is there any point to get css reference per each charged page? For kmem
> it is absolutely necessary, because dangling slabs must block
> destruction of memcg's kmem caches, which are destroyed on css_free. But
> for sockets there's no such problem: memcg will be destroyed only after
> all sockets are destroyed and therefore uncharged (since
> sock_update_memcg pins css).

I'm afraid we have to when we want to share 'stock' with cache and
anon pages, which hold individual references. drain_stock() always
assumes one reference per cached page.

> > +	if (batch > nr_pages)
> > +		refill_stock(memcg, batch - nr_pages);
> > +
> > +	schedule_work(&memcg->socket_work);
> 
> I think it's suboptimal to schedule the work even if we are below the
> high threshold.

Hm, it seemed unnecessary to duplicate the hierarchy check since this
is in the batch-exhausted slowpath anyway.

> BTW why do we need this work at all? Why is reclaim_high called from
> task_work not enough?

The problem lies in the memcg association: the random task that gets
interrupted by an arriving packet might not be in the same memcg as
the one owning receiving socket. And multiple interrupts could happen
while we're in the kernel already charging pages. We'd basically have
to maintain a list of memcgs that need to run reclaim_high associated
with current.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/