Message-ID: <YNTE88wHs4Ac/DKp@cmpxchg.org>
Date:   Thu, 24 Jun 2021 13:46:27 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     Shakeel Butt <shakeelb@...gle.com>
Cc:     Tejun Heo <tj@...nel.org>, Muchun Song <songmuchun@...edance.com>,
        Michal Hocko <mhocko@...nel.org>, Roman Gushchin <guro@...com>,
        Michal Koutný <mkoutny@...e.com>,
        Huang Ying <ying.huang@...el.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Cgroups <cgroups@...r.kernel.org>, Linux MM <linux-mm@...ck.org>,
        LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v2 2/2] memcg: periodically flush the memcg stats

Hey Shakeel,

Sorry about the delay.

On Tue, Jun 15, 2021 at 02:52:37PM -0700, Shakeel Butt wrote:
> On Tue, Jun 15, 2021 at 12:29 PM Johannes Weiner <hannes@...xchg.org> wrote:
> > The way the global vmstat implementation manages error is doing both:
> > ratelimiting and timelimiting. It uses percpu batching to limit the
> > error when it gets busy, and periodic flushing to limit the length of
> > time consumers of those stats could be stuck trying to reach a state
> > that the batching would otherwise prevent from being reflected.
> >
> > Maybe we can use a combination of ratelimiting and timelimiting too?
> >
> > We shouldn't flush on every fault, but what about a percpu ratelimit
> > that would at least bound the error to NR_CPU instead of nr_cgroups?
> >
> 
> Couple questions here:
> 
> First, to convert the error bound to NR_CPU from nr_cgroups, I think
> we have to move from (delta > threshold) comparison to
> (num_update_events > threshold). Previously an increment event
> followed by decrement would keep the delta to 0 (or same) but after
> this change num_update_events would be 2. Is that ok?

Yeah, I think that's fine. Or at least I can't think of a real-world
application that would inc and dec the same counter over and over, and
so would do much better with delta-based spilling than with event-based
ratelimiting.

And the ratelimiting should already ensure by itself that the cost is
at least acceptable when continuously updating and reading counters.
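
To make the tradeoff concrete, here is a minimal userspace model of one CPU, not kernel code. It contrasts a per-cgroup delta threshold with a single per-CPU event counter shared by all cgroups. All names (cpu_state, update_delta_limited, update_event_limited, pending_error) and the NR_CGROUPS and THRESHOLD values are made up for illustration.

```c
#include <stdlib.h>

#define NR_CGROUPS 8   /* hypothetical number of cgroups */
#define THRESHOLD  32  /* hypothetical per-CPU batch size */

/* State of one CPU in this model. */
struct cpu_state {
	long delta[NR_CGROUPS];  /* unflushed change per cgroup */
	unsigned int events;     /* updates since the last flush */
	unsigned int flushes;    /* number of flushes triggered */
};

/* Old scheme: flush one cgroup once its own net delta exceeds THRESHOLD. */
static void update_delta_limited(struct cpu_state *c, int cg, long val)
{
	c->delta[cg] += val;
	if (labs(c->delta[cg]) > THRESHOLD) {
		c->delta[cg] = 0;
		c->flushes++;
	}
}

/* Proposed scheme: count update events on this CPU regardless of cgroup
 * or sign, and flush everything once the count exceeds THRESHOLD. */
static void update_event_limited(struct cpu_state *c, int cg, long val)
{
	c->delta[cg] += val;
	if (++c->events > THRESHOLD) {
		for (int i = 0; i < NR_CGROUPS; i++)
			c->delta[i] = 0;
		c->events = 0;
		c->flushes++;
	}
}

/* Total magnitude of changes not yet visible to readers on this CPU. */
static long pending_error(const struct cpu_state *c)
{
	long err = 0;

	for (int i = 0; i < NR_CGROUPS; i++)
		err += labs(c->delta[i]);
	return err;
}
```

In this model, with the delta threshold every cgroup can sit just at THRESHOLD without flushing, so the pending error on one CPU grows with NR_CGROUPS. With the event counter it stays at or below THRESHOLD per CPU, at the price that an inc followed by a dec of the same counter counts as two events even though the net delta is zero.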

> Second, do we want to synchronously flush the stats when we cross the
> threshold on update or asynchronously by queuing the flush with zero
> delay?

I think flushing by worker is better because we can see updates from
all sorts of contexts with all sorts of locks held. That could make
for some difficult dependencies and latency sources when serializing
those on cgroup_rstat_lock.
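
A rough single-threaded model of that split, again not kernel code: the update path only marks work as queued once it crosses the threshold, and a separate worker function is the only place that takes the lock and flushes. The struct, its fields, and both function names are hypothetical; work_queued stands in for a pending work item and lock_acquisitions for taking cgroup_rstat_lock. In the kernel the queuing step would presumably go through the workqueue API instead.

```c
#include <stdbool.h>

#define THRESHOLD 32  /* hypothetical event threshold */

struct stats {
	long unflushed;                  /* changes not yet folded into total */
	long total;                      /* value visible to readers */
	unsigned int events;             /* updates since the last flush */
	bool work_queued;                /* models an already pending work item */
	unsigned int lock_acquisitions;  /* models taking cgroup_rstat_lock */
};

/* Update path: may run in any context with any locks held, so it never
 * takes the flush lock itself. It only requests a flush. */
static void stat_update(struct stats *s, long val)
{
	s->unflushed += val;
	if (++s->events > THRESHOLD && !s->work_queued)
		s->work_queued = true;
}

/* Worker: runs later from a clean context and is the only place that
 * takes the lock. Returns true if it flushed anything. */
static bool flush_worker(struct stats *s)
{
	if (!s->work_queued)
		return false;
	s->work_queued = false;

	s->lock_acquisitions++;      /* lock */
	s->total += s->unflushed;
	s->unflushed = 0;
	s->events = 0;
	return true;                 /* unlock */
}
```

The cost of this arrangement in the model is that readers can see a stale total between the moment the threshold is crossed and the moment the worker runs.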
