lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 13 May 2020 08:35:45 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     Balbir Singh <bsingharora@...il.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        Alex Shi <alex.shi@...ux.alibaba.com>,
        Joonsoo Kim <js1304@...il.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Hugh Dickins <hughd@...gle.com>,
        Michal Hocko <mhocko@...e.com>,
        "Kirill A. Shutemov" <kirill@...temov.name>,
        Roman Gushchin <guro@...com>, linux-mm@...ck.org,
        cgroups@...r.kernel.org, linux-kernel@...r.kernel.org,
        kernel-team@...com
Subject: Re: [PATCH 00/19 V2] mm: memcontrol: charge swapin pages on
 instantiation

Hello Balbir!

On Wed, May 13, 2020 at 11:30:32AM +0000, Balbir Singh wrote:
> On Fri, May 08, 2020 at 02:30:47PM -0400, Johannes Weiner wrote:
> > To eliminate the page->mapping dependency, memcg needs to ditch its
> > private page type counters (MEMCG_CACHE, MEMCG_RSS, NR_SHMEM) in favor
> > of the generic vmstat counters and accounting sites, such as
> > NR_FILE_PAGES, NR_ANON_MAPPED etc.
> 
> Could you elaborate on what this means and the implications of this on
> user space programs?

This has no bearing on userspace. It's just simplifying how
memory.stat is implemented. The output is the same.

For the full story:

In the past, memcg has done its own accounting to produce a breakdown
of consumers in memory.stat. When a page was charged, we relied on
knowing whether it's a file, anon or shmem page, and had our own
MEMCG_RSS, MEMCG_CACHE, MEMCG_SHMEM counters.

As the general VM code already does this type of classification to
produce /proc/vmstat, this meant unnecessary duplication: more places
to bump counters, more places that have to make sure the page state is
stable in all the right ways, more dependencies on when it's safe to
call the charge and the uncharge callbacks.

A while ago we added per-cgroup arrays of the vmstat counters and a
cgroup-aware accounting callback (mod_lruvec_state) that can be a
drop-in replacement for the generic VM code (mod_node_state and
friends). We already had some counters converted over to that.

These patches just do more of that conversion from private memcg
accounting to having callbacks into generic VM accounting sites.

Instead of testing PageAnon() and accounting MEMCG_CACHE/MEMCG_RSS in
the charge code, we switch __add_to_page_cache_locked() and
page_add_new_anon_rmap() to the cgroup-aware mod_lruvec_page_state()
to bump our per-cgroup NR_FILE_PAGES and NR_ANON_MAPPED counters along
with the node and global counters.

As a result, the memcg gets a breakdown for memory.stat without having
to have private knowledge of what a page cache page is - how to test
it, when it's safe to test it, whether there can be huge pages in the
page cache, etc. pp. Memcg can focus on counting bytes, and the VM
code that is specialized in dealing with the page cache (or anon
pages, or shmem pages) can fill in those kinds of details for us.

Less dependencies, less duplication, simpler API rules.

The memory.stat output is the same, it's just much simpler code.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ