lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Fri, 23 Mar 2018 10:46:13 +1100
From:   Dave Chinner <david@...morbit.com>
To:     Kirill Tkhai <ktkhai@...tuozzo.com>
Cc:     Michal Hocko <mhocko@...nel.org>, darrick.wong@...cle.com,
        linux-xfs@...r.kernel.org, linux-kernel@...r.kernel.org,
        akpm@...ux-foundation.org
Subject: Re: [PATCH RFC] xfs, memcg: Call xfs_fs_nr_cached_objects() only in
 case of global reclaim

On Thu, Mar 22, 2018 at 07:52:37PM +0300, Kirill Tkhai wrote:
> Here is the problem I'm solving: https://lkml.org/lkml/2018/3/21/365.

Oh, finally you tell me what the problem is that you're trying to
solve. I *asked this several times* and got no response. Thank you
for wasting so much of my time.

> Current shrinker is not scalable. Then there are many memcg and mounts,
> the time of iteration shrink_slab() in case of global reclaim can
> take much time. There is times of shrink_slab() by the link. A node
> with 200 containers may waste 4 seconds on global reclaim just to
> iterate over all shrinkers for all cgroups, call shrinker::count_objects()
> and receive 0 zero objects.

So, your problem is the way the memcgs were tacked onto the side
of the list_lru infrastructure and are iterated, which has basically
nothing to do with the way the low level XFS inode shrinker behaves.

/me looks at the patches

/me shudders at the thought of external "cache has freeable items"
control for the shrinking of vfs caches.

Biggest problem I see with this is the scope for coherency bugs ini
the "memcg shrinker has freeable items" tracking. If that happens,
there's no way of getting that memcg to run it's shrinker ever
again. That seems very, very fragile and likely to be an endless
source of OOM bugs. The whole point of the shrinker polling
infrastructure is that it is not susceptible to this sort of bug.

Really, the problem is that there's no separate list of memcg aware
shrinkers, so every registered shrinker has to be iterated just to
find the one shrinker that is memcg aware.

So why not just do the simple thing which is create a separate
"memcg-aware" shrinker list (i.e. create shrinker_list_memcg
alongside shrinker_list) and only iterate the shrinker_list_memcg
when a memcg is passed to shrink_slab()?

That means we'll only run 2 shrinkers per memcg at most (sueprblock
and working set) per memcg reclaim call. That's a simple 10-20 line
change, not a whole mess of new code and issues...

> Can't we call shrink of shared objects only for top memcg? Something like
> this:

What's a "shared object", and how is it any different to a normal
slab cache object?

CHeers,

Dave.
-- 
Dave Chinner
david@...morbit.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ