Message-ID: <alpine.LSU.2.00.1202211437140.2012@eggly.anvils>
Date: Tue, 21 Feb 2012 15:03:03 -0800 (PST)
From: Hugh Dickins <hughd@...gle.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>
cc: Andrew Morton <akpm@...ux-foundation.org>,
Konstantin Khlebnikov <khlebnikov@...nvz.org>,
Johannes Weiner <hannes@...xchg.org>,
Ying Han <yinghan@...gle.com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 6/10] mm/memcg: take care over pc->mem_cgroup
On Tue, 21 Feb 2012, KAMEZAWA Hiroyuki wrote:
> On Mon, 20 Feb 2012 15:34:28 -0800 (PST)
> Hugh Dickins <hughd@...gle.com> wrote:
> > 		return NULL;
> >
> > + lruvec = page_lock_lruvec(page);
> > lock_page_cgroup(pc);
> >
>
> Do we need to take lru_lock + irq-disable per page in this very, very hot path?
I'm sure we don't want to: I hope you were pleased to find it goes away
(from most cases) a couple of patches later.
I had lruvec lock nested inside page_cgroup lock in the rollup I sent in
December, whereas you went for page_cgroup lock nested inside lruvec lock
in your lrucare patch.
I couldn't find an imperative reason why they should be one way round or
the other, so I tried hard to stick with your ordering, and it did work
(in this 6/10). But then I couldn't work out how to get rid of the
overheads added in doing it this way round, so I swapped them back.
>
> Hmm.... How about adding an NR_ISOLATED counter to lruvec?
>
> Then we could delay freeing the lruvec until all counters go down to zero,
> as in:
>
> bool we_can_free_lruvec = true;
>
> lock_lruvec(lruvec);
> for_each_lru_lruvec(lru)
>         if (!list_empty(&lruvec->lru[lru]))
>                 we_can_free_lruvec = false;
> if (lruvec->nr_isolated)
>         we_can_free_lruvec = false;
> unlock_lruvec(lruvec);
>
> if (we_can_free_lruvec)
>         kfree(lruvec);
>
> If compaction or lumpy reclaim frees a page taken from the LRU, it knows
> what it is doing and can decrement lruvec->nr_isolated properly
> (the zone's NR_ISOLATED seems to be decremented at putback).
At the moment I'm thinking that what we end up with by 9/10 is
better than adding such a refcount. But I'm not entirely happy with
mem_cgroup_reset_uncharged_to_root (it adds a further page_cgroup
lookup just after I got rid of some others), and need yet to think
about the race which Konstantin posits, so all options remain open.
Hugh