Message-ID: <20190412220357.GA18999@tower.DHCP.thefacebook.com>
Date: Fri, 12 Apr 2019 22:04:02 +0000
From: Roman Gushchin <guro@...com>
To: Johannes Weiner <hannes@...xchg.org>
CC: Andrew Morton <akpm@...ux-foundation.org>,
	"linux-mm@...ck.org" <linux-mm@...ck.org>,
	"cgroups@...r.kernel.org" <cgroups@...r.kernel.org>,
	"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
	Kernel Team <Kernel-team@...com>
Subject: Re: [PATCH 0/4] mm: memcontrol: memory.stat cost & correctness

On Fri, Apr 12, 2019 at 11:15:03AM -0400, Johannes Weiner wrote:
> The cgroup memory.stat file holds recursive statistics for the entire
> subtree. The current implementation does this tree walk on-demand
> whenever the file is read. This is giving us problems in production.
>
> 1. The cost of aggregating the statistics on-demand is high. A lot of
> system service cgroups are mostly idle and their stats don't change
> between reads, yet we always have to check them. There are also always
> some lazily-dying cgroups sitting around that are pinned by a handful
> of remaining page cache; the same applies to them.
>
> In an application that periodically monitors memory.stat in our fleet,
> we have seen the aggregation consume up to 5% CPU time.
>
> 2. When cgroups die and disappear from the cgroup tree, so do their
> accumulated vm events. The result is that the event counters at
> higher-level cgroups can go backwards and confuse some of our
> automation, let alone people looking at the graphs over time.
>
> To address both issues, this patch series changes the stat
> implementation to spill counts upwards when the counters change.
>
> The upward spilling is batched using the existing per-cpu cache. In a
> sparse file stress test with 5 level cgroup nesting, the additional
> cost of the flushing was negligible (a little under 1% of CPU at 100%
> CPU utilization, compared to the 5% of reading memory.stat during
> regular operation).
>
>  include/linux/memcontrol.h |  96 +++++++-------
>  mm/memcontrol.c            | 290 +++++++++++++++++++++++++++----------------
>  mm/vmscan.c                |   4 +-
>  mm/workingset.c            |   7 +-
>  4 files changed, 234 insertions(+), 163 deletions(-)

For the series:

Reviewed-by: Roman Gushchin <guro@...com>

Thanks!
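
For readers following along, below is a minimal user-space C sketch of
the batching idea the cover letter describes: writers accumulate deltas
in a per-CPU cache and, once a batch threshold is crossed, flush the
accumulated delta to every ancestor, so reading any level's counter
becomes O(1) instead of a subtree walk. All names here (struct group,
STAT_BATCH, mod_stat) are illustrative only and do not match the actual
mm/memcontrol.c code, which uses real per-cpu variables and atomics;
this models a single CPU with plain longs.

/* Sketch of batched upward spilling of hierarchical counters.
 * Hypothetical names; not the mm/memcontrol.c implementation. */
#include <stdio.h>
#include <stdlib.h>

#define STAT_BATCH 64	/* flush threshold, stands in for the per-cpu cache size */

struct group {
	struct group *parent;
	long stat;	/* recursive (subtree) counter, kept current on flush */
	long cached;	/* locally batched delta; one "CPU" assumed here */
};

/* Writer side: cheap local update, occasional upward flush. */
static void mod_stat(struct group *g, long delta)
{
	g->cached += delta;
	if (labs(g->cached) > STAT_BATCH) {
		/* Spill the batched delta to this group and all ancestors. */
		for (struct group *p = g; p; p = p->parent)
			p->stat += g->cached;
		g->cached = 0;
	}
}

/* Reader side: no tree walk, just return the pre-aggregated counter. */
static long read_stat(struct group *g)
{
	return g->stat;
}

int main(void)
{
	struct group root = { .parent = NULL };
	struct group child = { .parent = &root };

	for (int i = 0; i < 1000; i++)
		mod_stat(&child, 1);

	/* Up to STAT_BATCH recent events may still sit in the cache,
	 * which is the fuzziness traded for the cheap read path. */
	printf("root=%ld child=%ld cached=%ld\n",
	       read_stat(&root), read_stat(&child), child.cached);
	return 0;
}

The trade-off this models is the one the cover letter quantifies: the
writer pays a small, amortized cost per flush (under 1% of CPU in the
stress test) so that readers no longer pay the up-to-5% on-demand
aggregation cost, and because dead children have already spilled their
counts upward, ancestor counters no longer go backwards when cgroups
are deleted.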