Date:   Thu, 13 Feb 2020 09:47:29 +0800
From:   Yafang Shao <laoar.shao@...il.com>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     linux-fsdevel@...r.kernel.org, Linux MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Dave Chinner <david@...morbit.com>,
        Michal Hocko <mhocko@...e.com>, Roman Gushchin <guro@...com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Al Viro <viro@...iv.linux.org.uk>,
        Kernel Team <kernel-team@...com>
Subject: Re: [PATCH] vfs: keep inodes with page cache off the inode shrinker LRU

On Thu, Feb 13, 2020 at 12:42 AM Johannes Weiner <hannes@...xchg.org> wrote:
>
> On Wed, Feb 12, 2020 at 08:25:45PM +0800, Yafang Shao wrote:
> > On Wed, Feb 12, 2020 at 1:55 AM Johannes Weiner <hannes@...xchg.org> wrote:
> > > Another variant of this problem was recently observed, where the
> > > kernel violates cgroups' memory.low protection settings and reclaims
> > > page cache way beyond the configured thresholds. It was followed by a
> > > proposal of a modified form of the reverted commit above, that
> > > implements memory.low-sensitive shrinker skipping over populated
> > > inodes on the LRU [1]. However, this proposal continues to run the
> > > risk of attracting disproportionate reclaim pressure to a pool of
> > > still-used inodes,
> >
> > Hi Johannes,
> >
> > If you really think that is a risk, what about the additional patch
> > below to fix it?
> >
> > diff --git a/fs/inode.c b/fs/inode.c
> > index 80dddbc..61862d9 100644
> > --- a/fs/inode.c
> > +++ b/fs/inode.c
> > @@ -760,7 +760,7 @@ static bool memcg_can_reclaim_inode(struct inode *inode,
> >                 goto out;
> >
> >         cgroup_size = mem_cgroup_size(memcg);
> > -       if (inode->i_data.nrpages + protection >= cgroup_size)
> > +       if (inode->i_data.nrpages)
> >                 reclaimable = false;
> >
> >  out:
> >
> > With this additional patch, we skip all inodes in this memcg until all
> > its page cache pages are reclaimed.
>
> Well that's something we've tried and had to revert because it caused
> issues in slab reclaim. See the History part of my changelog.
>

You misunderstood it.
The reverted patch skips all inodes in the system, while this patch
only takes effect when you turn on memcg.{min, low} protection.
IOW, it is not the default behavior: it only works when you want it,
and it only affects your targeted memcg rather than the whole system.
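
To make the difference concrete, here is a minimal userspace sketch of
the two checks from the hunk quoted above. It is not kernel code: the
toy_* types and functions are hypothetical stand-ins, and the early
return for an unprotected memcg is an assumption about what happens
before the "goto out" that the hunk does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins, not the kernel's struct inode / struct mem_cgroup. */
struct toy_inode {
	unsigned long nrpages;		/* page cache pages attached to the inode */
};

struct toy_memcg {
	unsigned long size;		/* current cgroup usage, in pages */
	unsigned long protection;	/* memcg.{min, low} protection, in pages; 0 = off */
};

/* Check as it reads on the '-' line of the hunk. */
static bool toy_can_reclaim_before(const struct toy_inode *inode,
				   const struct toy_memcg *memcg)
{
	/* Assumption: without protection the check never skips an inode. */
	if (memcg->protection == 0)
		return true;
	return inode->nrpages + memcg->protection < memcg->size;
}

/* Check as it reads on the '+' line of the hunk. */
static bool toy_can_reclaim_after(const struct toy_inode *inode,
				  const struct toy_memcg *memcg)
{
	/* Same assumption as above: only protected memcgs are affected. */
	if (memcg->protection == 0)
		return true;
	return inode->nrpages == 0;
}

/* Small comparison of the two checks on made-up numbers. */
static void toy_compare(void)
{
	struct toy_memcg protected_cg = { .size = 1000, .protection = 400 };
	struct toy_memcg plain_cg = { .size = 1000, .protection = 0 };
	struct toy_inode cached = { .nrpages = 100 };
	struct toy_inode empty = { .nrpages = 0 };

	/* Before: a small populated inode in a large protected memcg is reclaimable. */
	assert(toy_can_reclaim_before(&cached, &protected_cg));
	/* After: any populated inode in a protected memcg is skipped. */
	assert(!toy_can_reclaim_after(&cached, &protected_cg));
	/* After: an inode with no page cache is still reclaimable. */
	assert(toy_can_reclaim_after(&empty, &protected_cg));
	/* After: a memcg without protection is unaffected (per the assumption). */
	assert(toy_can_reclaim_after(&cached, &plain_cg));
}
```

The numbers in toy_compare() are made up for illustration only; the
point is that, under the stated assumption, the stricter check changes
behavior only inside a memcg that has protection configured.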

> > > while not addressing the more generic reclaim
> > > inversion problem outside of a very specific cgroup application.
> > >
> >
> > But I have a different understanding.  This method works like a
> > knob. If you really care about your workingset (data), you should
> > turn it on (i.e. by using memcg protection to protect them), while
> > if you don't care about your workingset (data) then you'd better
> > turn it off. That would be more flexible.  Regarding your case in the
> > commit log, why not protect your linux git tree with memcg
> > protection?
>
> I can't imagine a scenario where I *wouldn't* care about my
> workingset, though. Why should it be opt-in, not the default?

Because the default behavior has caused the XFS performance hit.
(I haven't checked your patch carefully, so I don't know whether your
patch fixes it yet.)


Thanks

Yafang
