Message-ID: <20190904231350.GA5246@tower.dhcp.thefacebook.com>
Date:   Wed, 4 Sep 2019 23:13:54 +0000
From:   Roman Gushchin <guro@...com>
To:     Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
CC:     "linux-mm@...ck.org" <linux-mm@...ck.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
        Michal Hocko <mhocko@...e.com>,
        Johannes Weiner <hannes@...xchg.org>
Subject: Re: [PATCH v1 0/7] mm/memcontrol: recharge mlocked pages

On Wed, Sep 04, 2019 at 04:53:08PM +0300, Konstantin Khlebnikov wrote:
> Currently mlock keeps pages in cgroups where they were accounted.
> This way one container could affect another if they share file cache.
> Typical case is writing (downloading) file in one container and then
> locking in another. After that first container cannot get rid of cache.

Yeah, it's a valid problem, and it's not only about mlocked pages:
the same is true for generic pagecache. The only difference is that
in theory memory pressure should fix everything. But in reality the
pagecache used by the second container can be very hot, so the first
one can't really get rid of it.
In other words, there is no way to pass a pagecache page between cgroups
without evicting it and re-reading it from storage, which is sub-optimal
in many cases.
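
To make the scenario concrete, here is a toy Python model of first-touch
charging. It is only a sketch: the Cgroup and Page classes, touch() and
reclaim() are hypothetical names invented for illustration, and the
"referenced" bit is a much simplified stand-in for real reclaim heuristics,
not kernel code.

```python
# Toy model only: hypothetical classes, not kernel structures.
class Cgroup:
    def __init__(self, name):
        self.name = name
        self.charged = 0  # pages currently charged to this cgroup


class Page:
    def __init__(self):
        self.owner = None        # cgroup the page is charged to
        self.referenced = False  # set when somebody recently used the page


def touch(page, cgroup):
    """Access a pagecache page; only the first toucher is charged."""
    if page.owner is None:
        page.owner = cgroup
        cgroup.charged += 1
    page.referenced = True


def reclaim(cgroup, pages):
    """One simplified reclaim pass: referenced pages get a second chance."""
    freed = 0
    for page in pages:
        if page.owner is not cgroup:
            continue
        if page.referenced:
            page.referenced = False  # recently used: keep it, clear the bit
            continue
        page.owner = None            # evict and uncharge
        cgroup.charged -= 1
        freed += 1
    return freed


writer = Cgroup("writer")
reader = Cgroup("reader")
cache = [Page() for _ in range(4)]

for page in cache:
    touch(page, writer)  # the writer downloads the file and is charged

hot_freed = 0
for _ in range(3):
    for page in cache:
        touch(page, reader)  # the reader keeps the pages hot, uncharged
    hot_freed += reclaim(writer, cache)

charged_while_hot = (writer.charged, reader.charged)

# Only once the reader goes idle can the writer's reclaim evict the pages;
# the reader then has to fault them back in and is charged itself.
idle_freed = reclaim(writer, cache)
for page in cache:
    touch(page, reader)
charged_after_reread = (writer.charged, reader.charged)
```

In this model the writer stays charged for all four pages for as long as
the reader keeps touching them, and the charge moves only through eviction
followed by a re-read.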

We thought about a new madvise(), which would uncharge the pagecache but
set a new page flag meaning something like "whoever first starts using
the page should be charged for it". But it never materialized in a patchset.

> Also removed cgroup stays pinned by these mlocked pages.

Tbh, I don't think it's a big issue here. It would only matter with a huge
number of 1-page sized mlock areas, and that seems unlikely.

> 
> This patchset implements recharging pages to cgroup of mlock user.
> 
> There are three cases:
> * recharging at first mlock
> * recharging at munlock to any remaining mlock
> * recharging at 'culling' in reclaimer to any existing mlock
> 
> To keep things simple recharging ignores memory limit. After that memory
> usage temporary could be higher than limit but cgroup will reclaim memory
> later or trigger oom, which is valid outcome when somebody mlock too much.

OOM is a concern here. If quitting an application causes an immediate OOM
in another cgroup, that's not so good. Ideally it should work like
memory.high, forcing all threads in the second cgroup into direct reclaim.
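
A toy Python sketch of the difference, under loud assumptions: the Cgroup
class, recharge_ignoring_limit() and settle() are hypothetical, the numbers
are made up, and "throttled" merely stands for "threads are pushed into
reclaim instead of being OOM-killed". It is not how the patchset or memcg
is implemented.

```python
# Toy model only: hypothetical names and numbers, not kernel code.
class Cgroup:
    def __init__(self, limit, usage, reclaimable):
        self.limit = limit              # limit, in pages
        self.usage = usage              # pages currently charged
        self.reclaimable = reclaimable  # pages reclaim could still free


def recharge_ignoring_limit(cg, pages):
    """Move mlocked pages into cg without checking its limit."""
    cg.usage += pages


def settle(cg, style):
    """Reclaim what is possible, then report the outcome for the style."""
    over = cg.usage - cg.limit
    freed = max(0, min(over, cg.reclaimable))
    cg.usage -= freed
    cg.reclaimable -= freed
    if cg.usage <= cg.limit:
        return "ok"
    # A hard limit that cannot be met ends in OOM; a memory.high style
    # boundary never invokes the OOM killer and throttles instead.
    return "oom" if style == "hard" else "throttled"


# An application elsewhere quits, and its 20 shared mlocked pages are
# recharged to a cgroup that is already close to its limit.
hard = Cgroup(limit=100, usage=90, reclaimable=5)
recharge_ignoring_limit(hard, 20)
hard_result = settle(hard, "hard")

high = Cgroup(limit=100, usage=90, reclaimable=5)
recharge_ignoring_limit(high, 20)
high_result = settle(high, "high")
```

With these made-up numbers both cgroups end up 5 pages over the limit after
reclaim; the hard-limit variant reports an OOM, the memory.high style one
only throttles.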

Thanks!
