Date: Fri, 25 Feb 2022 17:20:20 -0800
From: Andrew Morton <akpm@...ux-foundation.org>
To: Shakeel Butt <shakeelb@...gle.com>, Michal Koutný <mkoutny@...e.com>,
	Johannes Weiner <hannes@...xchg.org>, Michal Hocko <mhocko@...nel.org>,
	Roman Gushchin <roman.gushchin@...ux.dev>, Ivan Babrou <ivan@...udflare.com>,
	cgroups@...r.kernel.org, linux-mm@...ck.org, linux-kernel@...r.kernel.org,
	Daniel Dao <dqminh@...udflare.com>, stable@...r.kernel.org
Subject: Re: [PATCH] memcg: async flush memcg stats from perf sensitive codepaths

On Fri, 25 Feb 2022 16:58:42 -0800 Andrew Morton <akpm@...ux-foundation.org> wrote:

> On Fri, 25 Feb 2022 16:24:12 -0800 Shakeel Butt <shakeelb@...gle.com> wrote:
>
> > Daniel Dao has reported [1] a regression on workloads that may trigger
> > a lot of refaults (anon and file). The underlying issue is that flushing
> > rstat is expensive. Although rstat flushes are batched with (nr_cpus *
> > MEMCG_BATCH) stat updates, it seems there are workloads which
> > genuinely do stat updates larger than the batch value within a short
> > amount of time. Since the rstat flush can happen in performance-critical
> > codepaths like page faults, such workloads can suffer greatly.
> >
> > The easiest fix for now is for performance-critical codepaths to trigger
> > the rstat flush asynchronously. This patch converts the refault codepath
> > to use async rstat flush. In addition, this patch has preemptively
> > converted mem_cgroup_wb_stats and shrink_node to also use the async
> > rstat flush, as they may also see similar performance regressions.
>
> Gee, we do this trick a lot and gee, I don't like it :(
>
> a) if we're doing too much work then we're doing too much work.
>    Punting that work over to a different CPU or thread doesn't alter
>    that - it in fact adds more work.
>
> b) there's an assumption here that the flusher is able to keep up
>    with the producer. What happens if that isn't the case? Do we
>    simply wind up the deferred items until the system goes oom?
>
>    What happens if there's a producer running on each CPU? Can the
>    flushers keep up?
>
>    Pathologically, what happens if the producer is running
>    task_is_realtime() on a single-CPU system? Or if there's a
>    task_is_realtime() producer running on every CPU? The flusher never
>    gets to run and we're dead?

Not some theoretical thing, btw. See how __read_swap_cache_async() just
got its sins exposed by real-time tasks:

https://lkml.kernel.org/r/20220221111749.1928222-1-cgel.zte@gmail.com