Message-Id: <20180716164032.94e13f765c5f33c6022eca38@linux-foundation.org>
Date: Mon, 16 Jul 2018 16:40:32 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Matthew Wilcox <willy@...radead.org>
Cc: Michal Hocko <mhocko@...nel.org>,
Dave Chinner <david@...morbit.com>,
James Bottomley <James.Bottomley@...senPartnership.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Waiman Long <longman@...hat.com>,
Al Viro <viro@...iv.linux.org.uk>,
Jonathan Corbet <corbet@....net>,
"Luis R. Rodriguez" <mcgrof@...nel.org>,
Kees Cook <keescook@...omium.org>,
Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
linux-mm <linux-mm@...ck.org>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
Jan Kara <jack@...e.cz>,
Paul McKenney <paulmck@...ux.vnet.ibm.com>,
Ingo Molnar <mingo@...nel.org>,
Miklos Szeredi <mszeredi@...hat.com>,
Larry Woodman <lwoodman@...hat.com>,
"Wangkai (Kevin,C)" <wangkai86@...wei.com>
Subject: Re: [PATCH v6 0/7] fs/dcache: Track & limit # of negative dentries
On Mon, 16 Jul 2018 05:41:15 -0700 Matthew Wilcox <willy@...radead.org> wrote:
> On Mon, Jul 16, 2018 at 11:09:01AM +0200, Michal Hocko wrote:
> > On Fri 13-07-18 10:36:14, Dave Chinner wrote:
> > [...]
> > > By limiting the number of negative dentries in this case, internal
> > > slab fragmentation is reduced such that reclaim cost never gets out
> > > of control. While it appears to "fix" the symptoms, it doesn't
> > > address the underlying problem. It is a partial solution at best but
> > > at worst it's another opaque knob that nobody knows how or when to
> > > tune.
> >
> > Would it help to put all the negative dentries into its own slab cache?
>
> Maybe the dcache should be more sensitive to its own needs. In __d_alloc,
> it could check whether there are a high proportion of negative dentries
> and start recycling some existing negative dentries.
Well, yes.

The proposed patchset adds all this background reclaiming. The problems
are a) background reclaim sometimes can't keep up, so a synchronous
direct reclaim was added on top, and b) reclaiming dentries in the
background makes non-dentry-allocating tasks suffer because of
activity from the dentry-allocating tasks, which is inappropriate.

I expect a better design is something like
	__d_alloc()
	{
		...
		while (too many dentries)
			call the dcache shrinker
		...
	}
and that's it. This way we have a hard upper limit and only the tasks
which are creating dentries suffer the cost.
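
The pseudocode above can be sketched as a small userspace model. This is only an illustration under stated assumptions: `toy_dcache`, `toy_shrink()` and `toy_d_alloc()` are hypothetical names, not kernel APIs, and a plain FIFO list stands in for the dentry LRU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct toy_dentry {
	struct toy_dentry *next;	/* LRU link, oldest first */
};

struct toy_dcache {
	struct toy_dentry *lru_head;	/* oldest entry */
	struct toy_dentry *lru_tail;	/* newest entry */
	size_t nr_dentries;
	size_t limit;			/* hard upper limit (assumed tunable) */
};

/* Stand-in for "call the dcache shrinker": free up to nr oldest entries. */
static size_t toy_shrink(struct toy_dcache *dc, size_t nr)
{
	size_t freed = 0;

	while (freed < nr && dc->lru_head) {
		struct toy_dentry *victim = dc->lru_head;

		dc->lru_head = victim->next;
		if (!dc->lru_head)
			dc->lru_tail = NULL;
		free(victim);
		dc->nr_dentries--;
		freed++;
	}
	return freed;
}

/* Model of __d_alloc(): the allocating task reclaims before it allocates. */
static struct toy_dentry *toy_d_alloc(struct toy_dcache *dc)
{
	struct toy_dentry *d;

	while (dc->nr_dentries >= dc->limit) {
		if (!toy_shrink(dc, 1))
			return NULL;	/* nothing reclaimable, e.g. limit == 0 */
	}

	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	if (dc->lru_tail)
		dc->lru_tail->next = d;
	else
		dc->lru_head = d;
	dc->lru_tail = d;
	dc->nr_dentries++;
	return d;
}
```

In this model the reclaim work happens only inside `toy_d_alloc()`, so the cost falls on the caller that is creating entries, and `nr_dentries` never exceeds `limit`.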
Regarding the slab page fragmentation issue: I'm wondering whether the
whole idea of balancing the slab scan rates against the page scan rates
is really working out. Maybe shrink_slab() should instead sit there
hammering the caches until they have freed up a particular number of
pages. That would be quite a big change, both conceptually and in
implementation.
Aside: about a billion years ago we were having issues with processes
getting stuck in direct reclaim because other processes were coming in
and stealing away the pages which the direct-reclaimer had just freed.
One possible solution to that was to make direct-reclaiming tasks
release the freed pages into a list on the task_struct. So those pages
were invisible to other allocating tasks and were available to the
direct-reclaimer when it returned from the reclaim effort. I forget
what happened to this.
It's quite a small code change and would provide a mechanism for
implementing the hammer-cache-until-you've-freed-enough design above.
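
A minimal userspace sketch of the private-list idea from the aside, assuming a simple singly linked page list. All names here (`toy_task`, `toy_direct_reclaim()`, `toy_alloc_page()`) are hypothetical and do not correspond to real kernel structures or functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model; not the kernel's struct page or task_struct. */
struct toy_page {
	struct toy_page *next;
};

struct toy_list {
	struct toy_page *head;
	size_t nr;
};

struct toy_task {
	struct toy_list reclaimed;	/* pages this task freed for itself */
};

static void toy_push(struct toy_list *l, struct toy_page *p)
{
	p->next = l->head;
	l->head = p;
	l->nr++;
}

static struct toy_page *toy_pop(struct toy_list *l)
{
	struct toy_page *p = l->head;

	if (!p)
		return NULL;
	l->head = p->next;
	p->next = NULL;
	l->nr--;
	return p;
}

/*
 * Direct reclaim: keep taking pages from the cache until 'want' pages
 * sit on the task's private list or the cache is empty. The pages never
 * touch the global free list, so other tasks cannot steal them.
 */
static size_t toy_direct_reclaim(struct toy_task *t, struct toy_list *cache,
				 size_t want)
{
	size_t freed = 0;

	while (freed < want) {
		struct toy_page *p = toy_pop(cache);

		if (!p)
			break;
		toy_push(&t->reclaimed, p);
		freed++;
	}
	return freed;
}

/* Allocation prefers the task's private list, then the global free list. */
static struct toy_page *toy_alloc_page(struct toy_task *t,
				       struct toy_list *global_free)
{
	struct toy_page *p = toy_pop(&t->reclaimed);

	return p ? p : toy_pop(global_free);
}
```

The return value of `toy_direct_reclaim()` is the count a "free until enough" loop would test against its target.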
Aside 2: if we *do* do something like the above __d_alloc() pseudocode
then perhaps it could be cast in terms of pages, not dentries. i.e.,
	__d_alloc()
	{
		...
		while (too many pages in dentry_cache)
			call the dcache shrinker
		...
	}
and, apart from the external name thing (grr), that should address
these fragmentation issues, no? I assume it's easy to ask slab how
many pages are presently in use for a particular cache.
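
The page-based variant can be modelled the same way. The sketch below is an assumption-laden toy, not slab code: `toy_slab_cache` tracks only a live-object count per page, `OBJS_PER_PAGE` and `MAX_PAGES` are arbitrary, and the stand-in shrinker simply frees from the emptiest page. It applies the limit check at the point where a new page would be needed, which is one possible reading of the pseudocode.

```c
#include <assert.h>
#include <stddef.h>

#define OBJS_PER_PAGE	4	/* arbitrary for the model */
#define MAX_PAGES	8	/* arbitrary for the model */

/* Hypothetical model: live objects per page; 0 means the page is not held. */
struct toy_slab_cache {
	unsigned char in_use[MAX_PAGES];
	size_t page_limit;
};

/* Stand-in for asking slab how many pages a cache currently holds. */
static size_t toy_pages_in_use(const struct toy_slab_cache *c)
{
	size_t n = 0;

	for (size_t i = 0; i < MAX_PAGES; i++)
		if (c->in_use[i])
			n++;
	return n;
}

/* Stand-in shrinker: free one object from the emptiest non-empty page. */
static int toy_shrink_one(struct toy_slab_cache *c)
{
	int best = -1;

	for (int i = 0; i < MAX_PAGES; i++)
		if (c->in_use[i] && (best < 0 || c->in_use[i] < c->in_use[best]))
			best = i;
	if (best < 0)
		return 0;
	c->in_use[best]--;
	return 1;
}

/* Allocate one object; returns the page index used, or -1 on failure. */
static int toy_alloc_obj(struct toy_slab_cache *c)
{
	/* Reuse a partially filled page when one exists. */
	for (int i = 0; i < MAX_PAGES; i++) {
		if (c->in_use[i] && c->in_use[i] < OBJS_PER_PAGE) {
			c->in_use[i]++;
			return i;
		}
	}

	/* A new page is needed: shrink until the page count allows it. */
	while (toy_pages_in_use(c) >= c->page_limit)
		if (!toy_shrink_one(c))
			return -1;

	for (int i = 0; i < MAX_PAGES; i++) {
		if (!c->in_use[i]) {
			c->in_use[i] = 1;
			return i;
		}
	}
	return -1;
}
```

Because the limit counts pages rather than objects, the loop keeps shrinking until a whole page is actually released, so freeing objects scattered across still-occupied pages does not satisfy it.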