linux-kernel - Re: [patch 0/9] mm: thrash detection-based file cache sizing v6


Message-ID: <20131126223046.GI22729@cmpxchg.org>
Date:	Tue, 26 Nov 2013 17:30:46 -0500
From:	Johannes Weiner <hannes@...xchg.org>
To:	Andrew Morton <akpm@...ux-foundation.org>
Cc:	Dave Chinner <david@...morbit.com>, Rik van Riel <riel@...hat.com>,
	Jan Kara <jack@...e.cz>, Vlastimil Babka <vbabka@...e.cz>,
	Peter Zijlstra <peterz@...radead.org>,
	Tejun Heo <tj@...nel.org>, Andi Kleen <andi@...stfloor.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Greg Thelen <gthelen@...gle.com>,
	Christoph Hellwig <hch@...radead.org>,
	Hugh Dickins <hughd@...gle.com>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Mel Gorman <mgorman@...e.de>,
	Minchan Kim <minchan.kim@...il.com>,
	Michel Lespinasse <walken@...gle.com>,
	Seth Jennings <sjenning@...ux.vnet.ibm.com>,
	Roman Gushchin <klamm@...dex-team.ru>,
	Ozgun Erdogan <ozgun@...usdata.com>,
	Metin Doslu <metin@...usdata.com>, linux-mm@...ck.org,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [patch 0/9] mm: thrash detection-based file cache sizing v6

On Mon, Nov 25, 2013 at 04:57:29PM -0800, Andrew Morton wrote:
> On Sun, 24 Nov 2013 18:38:19 -0500 Johannes Weiner <hannes@...xchg.org> wrote:
> 
> > This series solves the problem by maintaining a history of pages
> > evicted from the inactive list, enabling the VM to detect frequently
> > used pages regardless of inactive list size and facilitate working set
> > transitions.
> 
> It's a very readable patchset - thanks for taking the time to do that.

Thanks.

> > 31 files changed, 1253 insertions(+), 401 deletions(-)
> 
> It's also a *ton* of stuff.  More code complexity, larger kernel data
> structures.  All to address a quite narrow class of workloads on a
> relatively small window of machine sizes.  How on earth do we decide
> whether it's worth doing?

The fileserver-type workload is not that unusual and not really
restricted to certain machine sizes.

But more importantly, these are reasonable workloads for which our
cache management fails completely, and we have no alternative solution
to offer.  What do we tell the people running these loads?

> Also, what's the memcg angle?  This is presently a global thing - do
> you think we're likely to want to make it per-memcg in the future?

Yes, it seemed easier to get the global case working first, but the
whole thing is designed with memcg in mind.  We can encode the unique
cgroup ID in the shadow entries as well and make the inactive_age per
lruvec instead of per-zone.
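
A minimal sketch of what such an encoding could look like, assuming a
64-bit shadow entry and a 16-bit cgroup ID.  The field widths and all
helper names here are hypothetical, chosen for illustration only; they
are not taken from the patches:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: [ eviction counter (48 bits) | cgroup ID (16 bits) ] */
#define SHADOW_MEMCG_BITS 16u
#define SHADOW_MEMCG_MASK ((UINT64_C(1) << SHADOW_MEMCG_BITS) - 1)
#define SHADOW_EVICT_MASK (UINT64_MAX >> SHADOW_MEMCG_BITS)

/* Pack the lruvec's eviction counter and the owning cgroup's ID. */
static inline uint64_t shadow_pack(uint64_t eviction, uint16_t memcg_id)
{
	/* The shift drops the counter's top bits; only the low 48 survive. */
	return (eviction << SHADOW_MEMCG_BITS) | memcg_id;
}

static inline uint16_t shadow_memcg_id(uint64_t shadow)
{
	return (uint16_t)(shadow & SHADOW_MEMCG_MASK);
}

static inline uint64_t shadow_eviction(uint64_t shadow)
{
	return shadow >> SHADOW_MEMCG_BITS;
}

/*
 * Evictions that happened in the same lruvec since this page was
 * evicted.  Masking keeps the subtraction correct across a wrap of
 * the truncated counter.
 */
static inline uint64_t refault_distance(uint64_t inactive_age_now,
					uint64_t shadow)
{
	assert(shadow_eviction(shadow) <= SHADOW_EVICT_MASK);
	return (inactive_age_now - shadow_eviction(shadow)) & SHADOW_EVICT_MASK;
}
```

On refault, the cgroup ID would select which lruvec's inactive_age to
compare against, so the distance is measured against the right
counter.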

If space gets tight in the shadow entry (on 32-bit, for example), instead of
counting every single eviction, we can group evictions into
generations of bigger chunks - the more memory, the less accurate the
refault distance has to be anyway - and can then get away with fewer
bits for the eviction timestamp.
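
Roughly, the generation idea could be sketched like this.  The 12-bit
timestamp width and the function names are assumptions made for the
example, not something the series implements:

```c
#include <stdint.h>

/* Hypothetical: only 12 bits are left for the eviction timestamp. */
#define EVICTION_BITS 12u
#define EVICTION_MASK ((1u << EVICTION_BITS) - 1)

/*
 * Pick the smallest shift such that one full pass over memory,
 * counted in generations of (1 << order) evictions, still fits in
 * EVICTION_BITS.  More memory means a bigger shift, i.e. coarser
 * generations.
 */
static unsigned int bucket_order(uint64_t total_pages)
{
	unsigned int order = 0;

	while ((total_pages >> order) > EVICTION_MASK)
		order++;
	return order;
}

/* Store the generation number instead of the exact eviction count. */
static uint32_t pack_eviction(uint64_t counter, unsigned int order)
{
	return (uint32_t)((counter >> order) & EVICTION_MASK);
}

/*
 * Approximate refault distance in pages, accurate to within one
 * generation of (1 << order) evictions.
 */
static uint64_t bucketed_refault_distance(uint64_t counter_now,
					  uint32_t stored,
					  unsigned int order)
{
	uint32_t now = (uint32_t)((counter_now >> order) & EVICTION_MASK);

	return (uint64_t)((now - stored) & EVICTION_MASK) << order;
}
```

With 2^20 pages this picks generations of 512 evictions, so a true
distance of 50000 evictions comes back as 49664.  That error is small
next to the size of the lists it would be compared against.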