lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 16 Jul 2018 18:30:19 -0700
From:   Matthew Wilcox <willy@...radead.org>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     Michal Hocko <mhocko@...nel.org>,
        Dave Chinner <david@...morbit.com>,
        James Bottomley <James.Bottomley@...senPartnership.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Waiman Long <longman@...hat.com>,
        Al Viro <viro@...iv.linux.org.uk>,
        Jonathan Corbet <corbet@....net>,
        "Luis R. Rodriguez" <mcgrof@...nel.org>,
        Kees Cook <keescook@...omium.org>,
        Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
        linux-fsdevel <linux-fsdevel@...r.kernel.org>,
        linux-mm <linux-mm@...ck.org>,
        "open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
        Jan Kara <jack@...e.cz>,
        Paul McKenney <paulmck@...ux.vnet.ibm.com>,
        Ingo Molnar <mingo@...nel.org>,
        Miklos Szeredi <mszeredi@...hat.com>,
        Larry Woodman <lwoodman@...hat.com>,
        "Wangkai (Kevin,C)" <wangkai86@...wei.com>
Subject: Re: [PATCH v6 0/7] fs/dcache: Track & limit # of negative dentries

On Mon, Jul 16, 2018 at 04:40:32PM -0700, Andrew Morton wrote:
> On Mon, 16 Jul 2018 05:41:15 -0700 Matthew Wilcox <willy@...radead.org> wrote:
> 
> > On Mon, Jul 16, 2018 at 11:09:01AM +0200, Michal Hocko wrote:
> > > On Fri 13-07-18 10:36:14, Dave Chinner wrote:
> > > [...]
> > > > By limiting the number of negative dentries in this case, internal
> > > > slab fragmentation is reduced such that reclaim cost never gets out
> > > > of control. While it appears to "fix" the symptoms, it doesn't
> > > > address the underlying problem. It is a partial solution at best but
> > > > at worst it's another opaque knob that nobody knows how or when to
> > > > tune.
> > > 
> > > Would it help to put all the negative dentries into its own slab cache?
> > 
> > Maybe the dcache should be more sensitive to its own needs.  In __d_alloc,
> > it could check whether there are a high proportion of negative dentries
> > and start recycling some existing negative dentries.
> 
> Well, yes.
> 
> The proposed patchset adds all this background reclaiming.  Problem is
> a) that background reclaiming sometimes can't keep up so a synchronous
> direct-reclaim was added on top and b) reclaiming dentries in the
> background will cause non-dentry-allocating tasks to suffer because of
> activity from the dentry-allocating tasks, which is inappropriate.

... and it's an awful lot of code (almost 600 lines!) to implement
something fairly conceptually simple.

> I expect a better design is something like
> 
> __d_alloc()
> {
> 	...
> 	while (too many dentries)
> 		call the dcache shrinker
> 	...
> }
> 
> and that's it.  This way we have a hard upper limit and only the tasks
> which are creating dentries suffer the cost.

I think the "too many total dentries" is probably handled just fine
by the core MM.  What the dentry cache needs to prevent is adding a
disproportionately large number of useless negative dentries.  

So I'd rather see:

	if (too_many_negative(nr_dentry, nr_dentry_neg))
		reclaim_negative_dentries(16);
	...

16 feels like a fairly natural batch size.  I don't know what
too_many_negative() looks like.  Maybe it's:

bool too_many_negative(unsigned int total, unsigned int neg)
{
	if (neg < 100)
		return false;
	if (neg * 5 < total * 2)
		return false;
	return true;
}

but it could be almost arbitrarily complex.  I do think it needs to
scale with the total number of dentries, not scale with memory size of
the machine or the number of CPUs or anything similar.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ