lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 16 Nov 2023 02:24:05 +0000
From:   Yosry Ahmed <>
To:     Andrew Morton <>
Cc:     Johannes Weiner <>,
        Michal Hocko <>,
        Roman Gushchin <>,
        Shakeel Butt <>,
        Muchun Song <>,
        Ivan Babrou <>, Tejun Heo <>,
        "Michal Koutný" <>,
        Waiman Long <>,,
        Wei Xu <>, Greg Thelen <>,
        Domenico Cerasuolo <>,,,, Yosry Ahmed <>
Subject: [PATCH v3 0/5] mm: memcg: subtree stats flushing and thresholds

This series attempts to address shortages in today's approach for memcg
stats flushing, namely occasionally stale or expensive stat reads. The
series does so by changing the threshold that we use to decide whether
to trigger a flush to be per memcg instead of global (patch 3), and then
changing flushing to be per memcg (i.e. subtree flushes) instead of
global (patch 5).

Patch 3 & 5 are the core of the series, and they include more details
and testing results. The rest are either cleanups or prep work.

This series replaces the "memcg: more sophisticated stats flushing"
series [1], which also replaces another series, in a long list of
attempts to improve memcg stats flushing. It is not a new version of
the same patchset as it is a completely different approach. This is
based on collected feedback from discussions on lkml in all previous
attempts. Hopefully, this is the final attempt.

There was a reported regression in v2 [2] for will-it-scale::fallocate
benchmark. I believe this regression should not affect production
workloads. This specific benchmark is allocating and freeing memory
(using fallocate/ftruncate) at a rate that is much faster to make actual
use of the memory. Testing this series on 100+ machines running
production workloads did not show any practical regressions in page
fault latency or allocation latency, but it showed great improvements in
stats read time. I do not have numbers about the exact improvements for
this series, but combined with another optimization for cgroup v1 [3] we
see 5-10x improvements. A significant chunk of that is coming from the
cgroup v1 optimization, but this series also made an improvement as
reported by Domenico [4].


v2 -> v3:
- Rebased on top of v6.7-rc1.
- Updated commit messages based on discussions in previous versions.
- Reset percpu stats_updates in mem_cgroup_css_rstat_flush().
- Added a mem_cgroup_disabled() check to mem_cgroup_flush_stats().


Yosry Ahmed (5):
  mm: memcg: change flush_next_time to flush_last_time
  mm: memcg: move vmstats structs definition above flushing code
  mm: memcg: make stats flushing threshold per-memcg
  mm: workingset: move the stats flush into workingset_test_recent()
  mm: memcg: restore subtree stats flushing

 include/linux/memcontrol.h |   8 +-
 mm/memcontrol.c            | 272 +++++++++++++++++++++----------------
 mm/vmscan.c                |   2 +-
 mm/workingset.c            |  42 ++++--
 4 files changed, 188 insertions(+), 136 deletions(-)


Powered by blists - more mailing lists