linux-kernel - Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAJD7tkbrR=6SmVxo4pVKHVu4eGBYN+xXuu5+zFPh6LSqt8vGcw@mail.gmail.com>
Date:   Thu, 12 Oct 2023 16:28:49 -0700
From:   Yosry Ahmed <yosryahmed@...gle.com>
To:     Johannes Weiner <hannes@...xchg.org>
Cc:     Shakeel Butt <shakeelb@...gle.com>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...nel.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Muchun Song <muchun.song@...ux.dev>,
        Ivan Babrou <ivan@...udflare.com>, Tejun Heo <tj@...nel.org>,
        Michal Koutný <mkoutny@...e.com>,
        Waiman Long <longman@...hat.com>, kernel-team@...udflare.com,
        Wei Xu <weixugc@...gle.com>, Greg Thelen <gthelen@...gle.com>,
        linux-mm@...ck.org, cgroups@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg

[..]
> > >
> > > Using next-20231009 and a similar 44 core machine with hyperthreading
> > > disabled, I ran 22 instances of netperf in parallel and got the
> > > following numbers from averaging 20 runs:
> > >
> > > Base: 33076.5 mbps
> > > Patched: 31410.1 mbps
> > >
> > > That's about 5% diff. I guess the number of iterations helps reduce
> > > the noise? I am not sure.
> > >
> > > Please also keep in mind that in this case all netperf instances are
> > > in the same cgroup and at a 4-level depth. I imagine in a practical
> > > setup processes would be a little more spread out, which means less
> > > common ancestors, so less contended atomic operations.
> >
> >
> > (Resending the reply as I messed up the last one, was not in plain text)
> >
> > I was curious, so I ran the same testing in a cgroup 2 levels deep
> > (i.e /sys/fs/cgroup/a/b), which is a much more common setup in my
> > experience. Here are the numbers:
> >
> > Base: 40198.0 mbps
> > Patched: 38629.7 mbps
> >
> > The regression is reduced to ~3.9%.
> >
> > What's more interesting is that going from a level 2 cgroup to a level
> > 4 cgroup is already a big hit with or without this patch:
> >
> > Base: 40198.0 -> 33076.5 mbps (~17.7% regression)
> > Patched: 38629.7 -> 31410.1 (~18.7% regression)
> >
> > So going from level 2 to 4 is already a significant regression for
> > other reasons (e.g. hierarchical charging). This patch only makes it
> > marginally worse. This puts the numbers more into perspective imo than
> > comparing values at level 4. What do you think?
>
> I think it's reasonable.
>
> Especially comparing to how many cachelines we used to touch on the
> write side when all flushing happened there. This looks like a good
> trade-off to me.

Thanks.

Still wanting to figure out if this patch is what you suggested in our
previous discussion [1], to add a
Suggested-by if appropriate :)

[1]https://lore.kernel.org/lkml/20230913153758.GB45543@cmpxchg.org/