Message-ID: <20131016022606.GD4446@dastard>
Date:	Wed, 16 Oct 2013 13:26:06 +1100
From:	Dave Chinner <david@...morbit.com>
To:	Rik van Riel <riel@...hat.com>
Cc:	Johannes Weiner <hannes@...xchg.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Andi Kleen <andi@...stfloor.org>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Greg Thelen <gthelen@...gle.com>,
	Christoph Hellwig <hch@...radead.org>,
	Hugh Dickins <hughd@...gle.com>, Jan Kara <jack@...e.cz>,
	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Mel Gorman <mgorman@...e.de>,
	Minchan Kim <minchan.kim@...il.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Michel Lespinasse <walken@...gle.com>,
	Seth Jennings <sjenning@...ux.vnet.ibm.com>,
	Roman Gushchin <klamm@...dex-team.ru>,
	Ozgun Erdogan <ozgun@...usdata.com>,
	Metin Doslu <metin@...usdata.com>,
	Vlastimil Babka <vbabka@...e.cz>, Tejun Heo <tj@...nel.org>,
	linux-mm@...ck.org, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [patch 0/8] mm: thrash detection-based file cache sizing v5

On Tue, Oct 15, 2013 at 10:05:26PM -0400, Rik van Riel wrote:
> On 10/15/2013 07:41 PM, Dave Chinner wrote:
> > On Tue, Oct 15, 2013 at 01:41:28PM -0400, Johannes Weiner wrote:
> 
> >> I'm not forgetting about them, I just track them very coarsely by
> >> linking up address spaces and then lazily enforce their upper limit
> >> when memory is tight by using the shrinker callback.  The assumption
> >> was that actually scanning them is such a rare event that we trade the
> >> rare computational costs for smaller memory consumption most of the
> >> time.
> > 
> > Sure, I understand the tradeoff that you made. But there's nothing
> > worse than a system that slows down unpredictably because some
> > magic threshold in some subsystem has been crossed and
> > computationally expensive operations kick in.
> 
> The shadow shrinker should remove the radix nodes with
> the oldest shadow entries first, so true LRU should actually
> work for the radix tree nodes.
> 
> Actually, since we only care about the age of the youngest
> shadow entry in each radix tree node, FIFO will be the same
> as LRU for that list.
> 
> That means the shrinker can always just take the radix tree
> nodes off the end.

Right, but it can't necessarily free the node as it may still have
pointers to pages in it. In that case, it would have to simply
rotate the node to the end of the LRU again.

Unless, of course, we kept track of the number of exceptional
entries in a node and didn't add it to the reclaim list until there
were no non-exceptional entries in the node...
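
To make that concrete, here is a rough user-space sketch of the
bookkeeping, not kernel code. Every name in it (sketch_node,
sketch_list, the sketch_* helpers) is made up for illustration. It
assumes each node counts its page pointers and its shadow entries
separately, and that a node goes on the reclaim list only while it
holds shadow entries and no pages. Under that assumption the shrinker
never has to rotate a node:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a radix tree node; not the kernel struct. */
struct sketch_node {
	unsigned int nr_pages;    /* slots holding real page pointers */
	unsigned int nr_shadows;  /* slots holding exceptional (shadow) entries */
	bool on_reclaim_list;
	struct sketch_node *prev, *next;
};

/* FIFO reclaim list with a sentinel head: add at tail, reclaim at head. */
struct sketch_list {
	struct sketch_node head;
	size_t len;
};

static void sketch_list_init(struct sketch_list *l)
{
	l->head.prev = l->head.next = &l->head;
	l->len = 0;
}

static void sketch_node_init(struct sketch_node *n)
{
	n->nr_pages = n->nr_shadows = 0;
	n->on_reclaim_list = false;
	n->prev = n->next = NULL;
}

static void sketch_unlink(struct sketch_list *l, struct sketch_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = NULL;
	n->on_reclaim_list = false;
	l->len--;
}

/* Re-evaluate list membership after any change to the node's slots. */
static void sketch_node_update(struct sketch_list *l, struct sketch_node *n)
{
	bool want = n->nr_pages == 0 && n->nr_shadows > 0;

	if (want && !n->on_reclaim_list) {
		n->prev = l->head.prev;
		n->next = &l->head;
		l->head.prev->next = n;
		l->head.prev = n;
		n->on_reclaim_list = true;
		l->len++;
	} else if (!want && n->on_reclaim_list) {
		sketch_unlink(l, n);
	}
}

/* A new page is inserted into an empty slot. */
static void sketch_insert_page(struct sketch_list *l, struct sketch_node *n)
{
	n->nr_pages++;
	sketch_node_update(l, n);
}

/* A page is evicted: its slot now holds a shadow entry. */
static void sketch_evict_page(struct sketch_list *l, struct sketch_node *n)
{
	assert(n->nr_pages > 0);
	n->nr_pages--;
	n->nr_shadows++;
	sketch_node_update(l, n);
}

/* A page refaults: the shadow entry is replaced by a page pointer. */
static void sketch_refault_page(struct sketch_list *l, struct sketch_node *n)
{
	assert(n->nr_shadows > 0);
	n->nr_shadows--;
	n->nr_pages++;
	sketch_node_update(l, n);
}

/* Shrinker step: take the oldest node; it is freeable by construction. */
static struct sketch_node *sketch_shrink_one(struct sketch_list *l)
{
	struct sketch_node *n = l->head.next;

	if (n == &l->head)
		return NULL;
	sketch_unlink(l, n);
	n->nr_shadows = 0;	/* drop the shadow entries with the node */
	return n;
}
```

The cost is two counters per node and a membership check on every
slot update. Whether that is acceptable in the real radix tree code
is an open question.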

> >> But it
> >> looks like tracking radix tree nodes with a list and backpointers to
> >> the mapping object for the lock etc. will be a major pain in the ass.
> > 
> > Perhaps so - it may not work out when we get down to the fine
> > details...
> 
> I suspect that a combination of lifetime rules (inode cannot
> disappear until all the radix tree nodes) and using RCU free
> for the radix tree nodes, and the inodes might do the trick.
> 
> That would mean that, while holding the rcu read lock, the
> back pointer from a radix tree node to the inode will always
> point to valid memory.

Yes, that is what I was thinking...

> That allows the shrinker to lock the inode, and verify that
> the inode is still valid, before it attempts to rcu free the
> radix tree node with shadow entries.

Lock the mapping, not the inode. The radix tree is protected by the
mapping_lock, not an inode lock. i.e. I'd hope that this can all be
contained within the struct address_space and not require any
knowledge of inodes or inode lifecycles at all.
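
As a rough illustration of what "contained within the address_space"
could look like, here is a single-threaded user-space sketch. All the
names (demo_mapping, demo_node, demo_reclaim_node) are hypothetical,
and the lock is a plain flag standing in for the mapping's lock. The
node carries a back pointer to the mapping only, and the shrinker
revalidates the node under the mapping lock before detaching it. In
real code the unlocked read of the back pointer would need the RCU
lifetime rules discussed above. This sketch does not model them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct address_space. */
struct demo_mapping {
	bool locked;		/* stand-in for the mapping's tree lock */
	unsigned int nr_nodes;	/* nodes currently attached to this mapping */
};

/* Hypothetical stand-in for a radix tree node. */
struct demo_node {
	struct demo_mapping *mapping;	/* back pointer; NULL once detached */
	unsigned int nr_pages;		/* real page pointers still in the node */
};

static void demo_lock(struct demo_mapping *m)
{
	assert(!m->locked);
	m->locked = true;
}

static void demo_unlock(struct demo_mapping *m)
{
	assert(m->locked);
	m->locked = false;
}

/*
 * Try to reclaim one node. Returns true if the node was detached and
 * the caller may free it. No inode is touched anywhere.
 */
static bool demo_reclaim_node(struct demo_node *n)
{
	struct demo_mapping *m = n->mapping;
	bool detached = false;

	if (!m)
		return false;

	demo_lock(m);
	/* Revalidate under the lock: the node may have changed since it was picked. */
	if (n->mapping == m && n->nr_pages == 0) {
		n->mapping = NULL;
		m->nr_nodes--;
		detached = true;
	}
	demo_unlock(m);
	return detached;
}
```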

> It also means that locking only needs to be in the inode,
> and on the LRU list for shadow radix tree nodes.
> 
> Does that sound sane?
> 
> Am I overlooking something?

It's pretty much along the same lines as what I was thinking, but
let's see what Johannes thinks.

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com