linux-kernel - Re: [PATCH v6 0/7] fs/dcache: Track & limit # of negative dentries

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20180713231717.GX2234@dastard>
Date:   Sat, 14 Jul 2018 09:17:17 +1000
From:   Dave Chinner <david@...morbit.com>
To:     James Bottomley <James.Bottomley@...senPartnership.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Matthew Wilcox <willy@...radead.org>,
        Waiman Long <longman@...hat.com>,
        Michal Hocko <mhocko@...nel.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        Jonathan Corbet <corbet@....net>,
        "Luis R. Rodriguez" <mcgrof@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        Jan Kara <jack@...e.cz>,
        Paul McKenney <paulmck@...ux.vnet.ibm.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Ingo Molnar <mingo@...nel.org>,
        Miklos Szeredi <mszeredi@...hat.com>,
        Larry Woodman <lwoodman@...hat.com>,
        "Wangkai (Kevin,C)" <wangkai86@...wei.com>
Subject: Re: [PATCH v6 0/7] fs/dcache: Track & limit # of negative dentries

On Fri, Jul 13, 2018 at 08:46:52AM -0700, James Bottomley wrote:
> On Fri, 2018-07-13 at 10:36 +1000, Dave Chinner wrote:
> > On Thu, Jul 12, 2018 at 12:57:15PM -0700, James Bottomley wrote:
> > > What surprises me most about this behaviour is the steadiness of
> > > the page cache ... I would have thought we'd have shrunk it
> > > somewhat given the intense call on the dcache.
> > 
> > Oh, good, the page cache vs superblock shrinker balancing still
> > protects the working set of each cache the way it's supposed to
> > under heavy single cache pressure. :)
> 
> Well, yes, but my expectation is most of the page cache is clean, so
> easily reclaimable.  I suppose part of my surprise is that I expected
> us to reclaim the clean caches first before we started pushing out the
> dirty stuff and reclaiming it.  I'm not saying it's a bad thing, just
> saying I didn't expect us to make such good decisions under the
> parameters of this test.

The clean caches are still turned over by the workload, but it is
very slow and only enough to eject old objects that have fallen out
of the working set. We've got a lot better at keeping the working
set in memory in adverse conditions over the past few years...

> > Keep in mind that the amount of work slab cache shrinkers perform is
> > directly proportional to the amount of page cache reclaim that is
> > performed and the size of the slab cache being reclaimed.  IOWs,
> > under a "single cache pressure" workload we should be directing
> > reclaim work to the huge cache creating the pressure and do very
> > little reclaim from other caches....
> 
> That definitely seems to happen.  The thing I was most surprised about
> is the steady pushing of anonymous objects to swap.  I agree the dentry
> cache doesn't seem to be growing hugely after the initial jump, so it
> seems to be the largest source of reclaim.

Which means swap behaviour has changed since I last looked at
reclaim balance several years ago. These sorts of dentry/inode loads
never used to push the system to swap. Not saying it's a bad thing,
just that it is different. :)

> > [ What follows from here is conjecture, but is based on what I've
> > seen in the past 10+ years on systems with large numbers of negative
> > dentries and fragmented dentry/inode caches. ]
> 
> OK, so I fully agree with the concern about pathological object vs page
> freeing problems (I referred to it previously).  However, I did think
> the compaction work that's been ongoing in mm was supposed to help
> here?

Compaction doesn't touch slab caches. We can't move active dentries
and other slab objects around in memory because they have external
objects with active references that point directly to them. Getting
exclusive access to active objects and all the things that point to
them from reclaim so we can move them is an intractable problem - it
has sunk slab cache defragmentation every time it has been attempted
in the past 15 years....

Cheers,

Dave.
-- 
Dave Chinner
david@...morbit.com