Message-ID: <20200212164235.GB180867@cmpxchg.org>
Date: Wed, 12 Feb 2020 11:42:35 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Yafang Shao <laoar.shao@...il.com>
Cc: linux-fsdevel@...r.kernel.org, Linux MM <linux-mm@...ck.org>,
LKML <linux-kernel@...r.kernel.org>,
Dave Chinner <david@...morbit.com>,
Michal Hocko <mhocko@...e.com>, Roman Gushchin <guro@...com>,
Andrew Morton <akpm@...ux-foundation.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>,
Kernel Team <kernel-team@...com>
Subject: Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU

On Wed, Feb 12, 2020 at 08:25:45PM +0800, Yafang Shao wrote:
> On Wed, Feb 12, 2020 at 1:55 AM Johannes Weiner <hannes@...xchg.org> wrote:
> > Another variant of this problem was recently observed, where the
> > kernel violates cgroups' memory.low protection settings and reclaims
> > page cache way beyond the configured thresholds. It was followed by a
> > proposal of a modified form of the reverted commit above, that
> > implements memory.low-sensitive shrinker skipping over populated
> > inodes on the LRU [1]. However, this proposal continues to run the
> > risk of attracting disproportionate reclaim pressure to a pool of
> > still-used inodes,
>
> Hi Johannes,
>
> If you really think that is a risk, what about the additional patch
> below to address it?
>
> diff --git a/fs/inode.c b/fs/inode.c
> index 80dddbc..61862d9 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -760,7 +760,7 @@ static bool memcg_can_reclaim_inode(struct inode *inode,
>                 goto out;
>
>         cgroup_size = mem_cgroup_size(memcg);
> -       if (inode->i_data.nrpages + protection >= cgroup_size)
> +       if (inode->i_data.nrpages)
>                 reclaimable = false;
>
>  out:
>
> With this additional patch, we skip all inodes in this memcg until all
> its page cache pages are reclaimed.

Well, that's something we've tried and had to revert because it caused
issues in slab reclaim. See the History part of my changelog.

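To make the difference between the two checks in the quoted hunk concrete,
here is a minimal standalone sketch. It models only the final condition
shown in the diff. The struct and function names are hypothetical and are
not kernel code, and the earlier branches of memcg_can_reclaim_inode()
(the ones that "goto out") are not reproduced because they are not shown
in the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the inode state used by the check.
 * This is not the kernel's struct inode. */
struct inode_sketch {
    unsigned long nrpages;   /* page cache pages attached to the inode */
};

/* Models the removed line of the hunk: the inode is treated as not
 * reclaimable when its page cache plus the memcg protection covers the
 * cgroup's current size. */
static bool can_reclaim_original(const struct inode_sketch *inode,
                                 unsigned long protection,
                                 unsigned long cgroup_size)
{
    return !(inode->nrpages + protection >= cgroup_size);
}

/* Models the added line of the hunk: any inode that still has page
 * cache is treated as not reclaimable, regardless of protection. */
static bool can_reclaim_proposed(const struct inode_sketch *inode)
{
    return inode->nrpages == 0;
}
```

Under these assumptions, an inode with 10 cached pages in a cgroup of size
100 with protection 50 would be reclaimable under the first check but not
under the second, which skips every populated inode.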
> > while not addressing the more generic reclaim
> > inversion problem outside of a very specific cgroup application.
> >
>
> But I have a different understanding. This method works like a
> knob. If you really care about your workingset (data), you should
> turn it on (i.e. by using memcg protection to protect them), while
> if you don't care about your workingset (data) then you'd better
> turn it off. That would be more flexible. Regarding your case in the
> commit log, why not protect your Linux git tree with memcg
> protection?
I can't imagine a scenario where I *wouldn't* care about my
workingset, though. Why should it be opt-in, not the default?