Message-ID: <20131126230010.GJ22729@cmpxchg.org>
Date: Tue, 26 Nov 2013 18:00:10 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Dave Chinner <david@...morbit.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Rik van Riel <riel@...hat.com>, Jan Kara <jack@...e.cz>,
Vlastimil Babka <vbabka@...e.cz>,
Peter Zijlstra <peterz@...radead.org>,
Tejun Heo <tj@...nel.org>, Andi Kleen <andi@...stfloor.org>,
Andrea Arcangeli <aarcange@...hat.com>,
Greg Thelen <gthelen@...gle.com>,
Christoph Hellwig <hch@...radead.org>,
Hugh Dickins <hughd@...gle.com>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
Mel Gorman <mgorman@...e.de>,
Minchan Kim <minchan.kim@...il.com>,
Michel Lespinasse <walken@...gle.com>,
Seth Jennings <sjenning@...ux.vnet.ibm.com>,
Roman Gushchin <klamm@...dex-team.ru>,
Ozgun Erdogan <ozgun@...usdata.com>,
Metin Doslu <metin@...usdata.com>, linux-mm@...ck.org,
linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [patch 9/9] mm: keep page cache radix tree nodes in check

On Wed, Nov 27, 2013 at 09:29:37AM +1100, Dave Chinner wrote:
> On Tue, Nov 26, 2013 at 04:27:25PM -0500, Johannes Weiner wrote:
> > On Tue, Nov 26, 2013 at 10:49:21AM +1100, Dave Chinner wrote:
> > > On Sun, Nov 24, 2013 at 06:38:28PM -0500, Johannes Weiner wrote:
> > > > Previously, page cache radix tree nodes were freed after reclaim
> > > > emptied out their page pointers. But now reclaim stores shadow
> > > > entries in their place, which are only reclaimed when the inodes
> > > > themselves are reclaimed. This is problematic for bigger files that
> > > > are still in use after they have a significant amount of their cache
> > > > reclaimed, without any of those pages actually refaulting. The shadow
> > > > entries will just sit there and waste memory. In the worst case, the
> > > > shadow entries will accumulate until the machine runs out of memory.
> ....
> > > ....
> > > > > +	radix_tree_replace_slot(slot, page);
> > > > > +	if (node) {
> > > > > +		node->count++;
> > > > > +		/* Installed page, can't be shadow-only anymore */
> > > > > +		if (!list_empty(&node->lru))
> > > > > +			list_lru_del(&workingset_shadow_nodes, &node->lru);
> > > > > +	}
> > > > > +	return 0;
> > >
> > > Hmmmmm - what's the overhead of direct management of LRU removal
> > > here? Most list_lru code uses lazy removal (i.e. via the shrinker)
> > > to avoid having to touch the LRU when adding new references to an
> > > object.....
> >
> > It's measurable in microbenchmarks, but not when any real IO is
> > involved. The difference was in the noise even on SSD drives.
>
> Well, it's not an SSD or two I'm worried about - it's devices that
> can do millions of IOPS where this is likely to be noticeable...
>
> > The other list_lru users see items only once they become unused and
> > subsequent references are expected to be few and temporary, right?
>
> They go onto the list when the refcount falls to zero, but reuse can
> be frequent when being referenced repeatedly by a single user. That
> avoids every reuse from removing the object from the LRU then
> putting it back on the LRU for every reference cycle...

That's true, but it's less of a concern in the radix_tree_node case
because it takes a full inactive list cycle after a refault before the
node is put back on the LRU. The only other way back onto the LRU is
an unluckily placed partial truncation/invalidation of the node (full
truncation would just delete the whole node anyway).

> > We expect pages to refault in spades on certain loads, at which point
> > we may have thousands of those nodes on the list that are no longer
> > reclaimable (10k nodes for about 2.5G of cache).
>
> Sure, look at the way the inode and dentry caches work - entire
> caches of millions of inodes and dentries often sit on the LRUs. A
> quick look at my workstation's dentry cache shows:
>
> $ cat /proc/sys/fs/dentry-state
> 180108 170596 45 0 0 0
>
> 180k allocated dentries, 170k sitting on the LRU...

Hm, and could a significant number of those 170k rotate on the next
shrinker scan due to recent references, or do you generally have
smaller spikes?

But as per above, I think the case for lazily removing shadow nodes is
less convincing than for inodes and dentries.