Message-ID: <CANn89i+6NPujMyiQxriZRt6vhv6hNrAntXxi1uOhJ0SSqnJ47w@mail.gmail.com>
Date:   Mon, 27 Jun 2022 10:46:21 +0200
From:   Eric Dumazet <edumazet@...gle.com>
To:     Feng Tang <feng.tang@...el.com>
Cc:     Shakeel Butt <shakeelb@...gle.com>, Linux MM <linux-mm@...ck.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Roman Gushchin <roman.gushchin@...ux.dev>,
        Michal Hocko <mhocko@...nel.org>,
        Johannes Weiner <hannes@...xchg.org>,
        Muchun Song <songmuchun@...edance.com>,
        Jakub Kicinski <kuba@...nel.org>,
        Xin Long <lucien.xin@...il.com>,
        Marcelo Ricardo Leitner <marcelo.leitner@...il.com>,
        kernel test robot <oliver.sang@...el.com>,
        Soheil Hassas Yeganeh <soheil@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        network dev <netdev@...r.kernel.org>,
        linux-s390@...r.kernel.org, MPTCP Upstream <mptcp@...ts.linux.dev>,
        "linux-sctp @ vger . kernel . org" <linux-sctp@...r.kernel.org>,
        lkp@...ts.01.org, kbuild test robot <lkp@...el.com>,
        Huang Ying <ying.huang@...el.com>,
        Xing Zhengjun <zhengjun.xing@...ux.intel.com>,
        Yin Fengwei <fengwei.yin@...el.com>, Ying Xu <yinxu@...hat.com>
Subject: Re: [net] 4890b686f4: netperf.Throughput_Mbps -69.4% regression

On Mon, Jun 27, 2022 at 4:38 AM Feng Tang <feng.tang@...el.com> wrote:
>
> On Sat, Jun 25, 2022 at 10:36:42AM +0800, Feng Tang wrote:
> > On Fri, Jun 24, 2022 at 02:43:58PM +0000, Shakeel Butt wrote:
> > > On Fri, Jun 24, 2022 at 03:06:56PM +0800, Feng Tang wrote:
> > > > On Thu, Jun 23, 2022 at 11:34:15PM -0700, Shakeel Butt wrote:
> > > [...]
> > > > >
> > > > > Feng, can you please explain the memcg setup on these test machines
> > > > > and if the tests are run in root or non-root memcg?
> > > >
> > > > I don't know the exact setup, Philip/Oliver from 0Day can correct me.
> > > >
> > > > > I logged into a test box that runs the netperf test, and it seems to be
> > > > > cgroup v1 and non-root memcg. The netperf tasks all sit in dir:
> > > > '/sys/fs/cgroup/memory/system.slice/lkp-bootstrap.service'
> > > >
> > >
> > > Thanks Feng. Can you check the value of memory.kmem.tcp.max_usage_in_bytes
> > > in /sys/fs/cgroup/memory/system.slice/lkp-bootstrap.service after making
> > > sure that the netperf test has already run?
> >
> > memory.kmem.tcp.max_usage_in_bytes:0
>
> Sorry, I made a mistake: in the original report from Oliver, it
> was 'cgroup v2' with a 'debian-11.1' rootfs.
>
> When you asked about cgroup info, I tried the job on another tbox, and
> the original 'job.yaml' didn't work, so I kept the 'netperf' test
> parameters and started a new job, which somehow ran with a 'debian-10.4'
> rootfs and actually ran with cgroup v1.
>
> And as you mentioned, the cgroup version does make a big difference:
> with v1, the regression is reduced to 1% ~ 5% on different generations
> of test platforms. Eric mentioned they also got a regression report,
> but a much smaller one; maybe it's due to the cgroup version?

This was using the current net-next tree.
The recipe used was something like:

Make sure cgroup2 is mounted, or mount it with 'mount -t cgroup2 none $MOUNT_POINT'.
Enable the memory controller with 'echo +memory > $MOUNT_POINT/cgroup.subtree_control'.
Create a cgroup with 'mkdir $MOUNT_POINT/job'.
Jump into that cgroup with 'echo $$ > $MOUNT_POINT/job/cgroup.procs'.

<Launch tests>
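
For reference, the steps above can be sketched as a small script. This is
only an illustration: CGROUP_NAME, DRY_RUN and the helper functions are
made-up names, not part of the original recipe, and the default mount point
is an assumption. By default it only prints the commands; running it for
real needs root.

```shell
#!/bin/sh
# Sketch of the recipe above. CGROUP_NAME, DRY_RUN and the helper functions
# are illustrative names, not part of the original message.
MOUNT_POINT="${MOUNT_POINT:-/sys/fs/cgroup}"   # assumed default location
CGROUP_NAME="${CGROUP_NAME:-job}"
DRY_RUN="${DRY_RUN:-1}"   # 1: only print the commands; 0: execute them

# Run a command, or just print it in dry-run mode.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Write VALUE ($1) into FILE ($2), or print the equivalent echo in dry-run mode.
write_to() {
    if [ "$DRY_RUN" = "1" ]; then
        printf 'echo %s > %s\n' "$1" "$2"
    else
        printf '%s\n' "$1" > "$2"
    fi
}

setup_cgroup() {
    # Mount cgroup2 only if it is not already mounted at MOUNT_POINT.
    if ! grep -qsF " $MOUNT_POINT cgroup2 " /proc/mounts; then
        run mount -t cgroup2 none "$MOUNT_POINT"
    fi
    # Enable the memory controller for child cgroups.
    write_to +memory "$MOUNT_POINT/cgroup.subtree_control"
    # Create the cgroup (-p so a rerun does not fail) and move this shell into it.
    run mkdir -p "$MOUNT_POINT/$CGROUP_NAME"
    write_to "$$" "$MOUNT_POINT/$CGROUP_NAME/cgroup.procs"
}

setup_cgroup
# <launch tests from this shell so that they inherit the cgroup>
```

With DRY_RUN=0 the script's own shell is moved into the new cgroup, so the
tests have to be started from within the same script (or the functions
sourced into the interactive shell) to be accounted there.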

The regression was smaller than 1%, so it was considered noise compared to
the benefits of the bug fix.

>
> Thanks,
> Feng
>
> > And here is more memcg stats (let me know if you want to check more)
> >
> > > If this is non-zero then network memory accounting is enabled and the
> > > slowdown is expected.
> >
> > From the perf-profile data in the original report, both
> > __sk_mem_raise_allocated() and __sk_mem_reduce_allocated() are called
> > much more often, which call memcg charge/uncharge functions.
> >
> > IIUC, the call chain is:
> >
> > __sk_mem_raise_allocated
> >     sk_memory_allocated_add
> >     mem_cgroup_charge_skmem
> >         charge memcg->tcpmem (for cgroup v1)
> >         try_charge memcg (for v2)
> >
> > Also, from one of Eric's earlier commit logs:
> >
> > "
> > net: implement per-cpu reserves for memory_allocated
> > ...
> > This means we are going to call sk_memory_allocated_add()
> > and sk_memory_allocated_sub() more often.
> > ...
> > "
> >
> > So is this slowdown related to the more frequent charge/uncharge calls?
> >
> > Thanks,
> > Feng
> >
> > > > And the rootfs is a Debian-based rootfs
> > > >
> > > > Thanks,
> > > > Feng
> > > >
> > > >
> > > > > thanks,
> > > > > Shakeel