Message-ID: <CAJD7tkbSBtNJv__uZT+uh9ie=-WeqPe9oBinGOH2wuZzJMvCAw@mail.gmail.com>
Date: Thu, 12 Oct 2023 01:04:03 -0700
From: Yosry Ahmed <yosryahmed@...gle.com>
To: Shakeel Butt <shakeelb@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Roman Gushchin <roman.gushchin@...ux.dev>,
Muchun Song <muchun.song@...ux.dev>,
Ivan Babrou <ivan@...udflare.com>, Tejun Heo <tj@...nel.org>,
Michal Koutný <mkoutny@...e.com>,
Waiman Long <longman@...hat.com>, kernel-team@...udflare.com,
Wei Xu <weixugc@...gle.com>, Greg Thelen <gthelen@...gle.com>,
linux-mm@...ck.org, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 3/5] mm: memcg: make stats flushing threshold per-memcg

On Wed, Oct 11, 2023 at 8:13 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
>
> On Wed, Oct 11, 2023 at 5:46 AM Shakeel Butt <shakeelb@...gle.com> wrote:
> >
> > On Tue, Oct 10, 2023 at 6:48 PM Yosry Ahmed <yosryahmed@...gle.com> wrote:
> > >
> > > On Tue, Oct 10, 2023 at 5:36 PM Shakeel Butt <shakeelb@...gle.com> wrote:
> > > >
> > > > On Tue, Oct 10, 2023 at 03:21:47PM -0700, Yosry Ahmed wrote:
> > > > [...]
> > > > >
> > > > > I tried this on a machine with 72 cpus (also ixion), running both
> > > > > netserver and netperf in /sys/fs/cgroup/a/b/c/d as follows:
> > > > > # echo "+memory" > /sys/fs/cgroup/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a
> > > > > # echo "+memory" > /sys/fs/cgroup/a/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a/b
> > > > > # echo "+memory" > /sys/fs/cgroup/a/b/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a/b/c
> > > > > # echo "+memory" > /sys/fs/cgroup/a/b/c/cgroup.subtree_control
> > > > > # mkdir /sys/fs/cgroup/a/b/c/d
> > > > > # echo 0 > /sys/fs/cgroup/a/b/c/d/cgroup.procs
> > > > > # ./netserver -6
> > > > >
> > > > > # echo 0 > /sys/fs/cgroup/a/b/c/d/cgroup.procs
> > > > > # for i in $(seq 10); do ./netperf -6 -H ::1 -l 60 -t TCP_SENDFILE --
> > > > > -m 10K; done
> > > >
> > > > You are missing '&' at the end. Use something like below:
> > > >
> > > > #!/bin/bash
> > > > for i in {1..22}
> > > > do
> > > > /data/tmp/netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K &
> > > > done
> > > > wait
> > > >
> > >
> > > Oh sorry I missed the fact that you are running instances in parallel, my bad.
> > >
> > > So I ran 36 instances on a machine with 72 cpus. I did this 10 times
> > > and got an average from all instances for all runs to reduce noise:
> > >
> > > #!/bin/bash
> > >
> > > ITER=10
> > > NR_INSTANCES=36
> > >
> > > for i in $(seq $ITER); do
> > > echo "iteration $i"
> > > for j in $(seq $NR_INSTANCES); do
> > > echo "iteration $i" >> "out$j"
> > > ./netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K >> "out$j" &
> > > done
> > > wait
> > > done
> > >
> > > cat out* | grep 540000 | awk '{sum += $5} END {print sum/NR}'
> > >
> > > Base: 22169 mbps
> > > Patched: 21331.9 mbps
> > >
> > > The difference is ~3.7% in my runs. I am not sure what's different.
> > > Perhaps it's the number of runs?
> >
> > My base kernel is next-20231009 and I am running experiments with
> > hyperthreading disabled.
>
> Using next-20231009 and a similar 44 core machine with hyperthreading
> disabled, I ran 22 instances of netperf in parallel and got the
> following numbers from averaging 20 runs:
>
> Base: 33076.5 mbps
> Patched: 31410.1 mbps
>
> That's about 5% diff. I guess the number of iterations helps reduce
> the noise? I am not sure.
>
> Please also keep in mind that in this case all netperf instances are
> in the same cgroup and at a 4-level depth. I imagine in a practical
> setup processes would be a little more spread out, which means less
> common ancestors, so less contended atomic operations.

(Resending the reply as I messed up the last one, it was not in plain text.)

I was curious, so I ran the same testing in a cgroup 2 levels deep
(i.e. /sys/fs/cgroup/a/b), which is a much more common setup in my
experience. Here are the numbers:

Base: 40198.0 mbps
Patched: 38629.7 mbps

The regression is reduced to ~3.9%.
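
For reference, the 2-level setup mirrors the 4-level one quoted above,
roughly:

# echo "+memory" > /sys/fs/cgroup/cgroup.subtree_control
# mkdir /sys/fs/cgroup/a
# echo "+memory" > /sys/fs/cgroup/a/cgroup.subtree_control
# mkdir /sys/fs/cgroup/a/b
# echo 0 > /sys/fs/cgroup/a/b/cgroup.procs
# ./netserver -6

with the same parallel netperf invocations as before (22 instances of
./netperf -6 -H ::1 -l 60 -t TCP_SENDFILE -- -m 10K, averaged over 20
runs).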

What's more interesting is that going from a level 2 cgroup to a level
4 cgroup is already a big hit with or without this patch:

Base: 40198.0 -> 33076.5 mbps (~17.7% regression)
Patched: 38629.7 -> 31410.1 mbps (~18.7% regression)

So going from level 2 to 4 is already a significant regression for
other reasons (e.g. hierarchical charging). This patch only makes it
marginally worse. This puts the numbers more into perspective, imo,
than just comparing values at level 4. What do you think?
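
(For clarity, the percentages above are the relative deltas of the
throughput numbers, e.g.:

$ awk 'BEGIN {
      # patch regression = (base - patched) / base
      printf "patch regression, level 2: %.1f%%\n", (40198.0 - 38629.7) / 40198.0 * 100
      printf "patch regression, level 4: %.1f%%\n", (33076.5 - 31410.1) / 33076.5 * 100
      # depth cost = (level 2 - level 4) / level 2
      printf "level 2 -> 4, base:        %.1f%%\n", (40198.0 - 33076.5) / 40198.0 * 100
      printf "level 2 -> 4, patched:     %.1f%%\n", (38629.7 - 31410.1) / 38629.7 * 100
  }'
patch regression, level 2: 3.9%
patch regression, level 4: 5.0%
level 2 -> 4, base:        17.7%
level 2 -> 4, patched:     18.7%
)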