Date:   Sun, 23 Feb 2020 22:11:47 +0800
From:   Feng Tang <feng.tang@...el.com>
To:     Jiri Olsa <jolsa@...hat.com>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        kernel test robot <rong.a.chen@...el.com>,
        Ingo Molnar <mingo@...nel.org>,
        Vince Weaver <vincent.weaver@...ne.edu>,
        Jiri Olsa <jolsa@...nel.org>,
        Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
        Arnaldo Carvalho de Melo <acme@...nel.org>,
        Arnaldo Carvalho de Melo <acme@...hat.com>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        "Naveen N. Rao" <naveen.n.rao@...ux.vnet.ibm.com>,
        Ravi Bangoria <ravi.bangoria@...ux.ibm.com>,
        Stephane Eranian <eranian@...gle.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        andi.kleen@...el.com, ying.huang@...el.com
Subject: Re: [LKP] Re: [perf/x86] 81ec3f3c4c: will-it-scale.per_process_ops
 -5.5% regression

Hi Jiri,

On Fri, Feb 21, 2020 at 02:20:48PM +0100, Jiri Olsa wrote:

> > We are also curious that the commit seems to be completely not
> > relative to this scalability test of signal, which starts a task
> > for each online CPU, and keeps calling raise(), and calculating
> > the run numbers.
> > 
> > One experiment we did is checking which part of the commit
> > really affects the test, and it turned out to be the change of
> > "struct pmu". Effectively, applying this patch upon 5.0-rc6 
> > which triggers the same regression.
> > So likely, this commit changes the layout of the kernel text
> > and data, which may trigger some cacheline level change. From
> > the system map of the 2 kernels, a big trunk of symbol's address
> > changes which follow the global "pmu",
> 
> nice, I wonder we could see that in perf c2c output ;-)
> I'll try to run and check

Thanks for the "perf c2c" suggestion. 

I tried perf-c2c on one platform (not the one that shows the
5.5% regression), and found that the main "hitm" points to the
"root_user" global data: there is a task on each CPU doing the
signal stress test, and __sigqueue_alloc() and __sigqueue_free()
call get_uid() and free_uid() respectively to inc/dec this
root_user's refcount.
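
To make the access pattern concrete, here is a minimal user-space
sketch of it. It is not the kernel code: the struct and the *_sketch
names are hypothetical stand-ins, and it only assumes that every
alloc/free pair takes and drops one reference on a single shared
object, as described above.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the shared "root_user" object. */
struct user_sketch {
    atomic_int refcount;    /* written by every task on every iteration */
};

/* Starts with one reference, held by the "system". */
static struct user_sketch root_user_sketch = { 1 };

/* Analogue of taking a reference in the alloc path. */
static struct user_sketch *get_uid_sketch(struct user_sketch *u)
{
    atomic_fetch_add(&u->refcount, 1);
    return u;
}

/* Analogue of dropping the reference in the free path. */
static void free_uid_sketch(struct user_sketch *u)
{
    atomic_fetch_sub(&u->refcount, 1);
}

/* One raise()-like iteration: queue entry allocated, then freed. */
static void signal_iteration_sketch(void)
{
    struct user_sketch *u = get_uid_sketch(&root_user_sketch);
    free_uid_sketch(u);
}
```

With one task per CPU looping over signal_iteration_sketch(), every
CPU keeps writing the same cacheline, which is the kind of traffic
that c2c reports as "hitm".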

Then I added some alignment inside struct "user_struct" (for
"root_user"), and the -5.5% is gone, replaced by a +2.6% gain.
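
As an illustration of the kind of change meant by "alignment", here
is a user-space sketch. The field names, the struct layouts and the
64-byte line size are all assumptions made for the example, not the
actual patch: it only shows how forcing a hot counter onto its own
cacheline separates it from its neighbours.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define CACHELINE_SKETCH 64     /* assumed cacheline size in bytes */

/* Hypothetical layout: hot counter shares a line with other fields. */
struct user_unaligned_sketch {
    atomic_int refcount;
    atomic_int processes;
    unsigned long locked_shm;
};

/* Same fields, but each hot counter starts its own cacheline. */
struct user_aligned_sketch {
    alignas(CACHELINE_SKETCH) atomic_int refcount;
    alignas(CACHELINE_SKETCH) atomic_int processes;
    unsigned long locked_shm;
};

/* Index of the cacheline a given offset falls into, counted from
 * the start of a cacheline-aligned object. */
static size_t cacheline_of(size_t offset)
{
    return offset / CACHELINE_SKETCH;
}
```

Comparing cacheline_of(offsetof(...)) for the two counters shows them
on the same line in the first struct and on different lines in the
second, at the cost of a larger struct.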

One c2c report log is attached.

One thing I don't understand is why this -5.5% only happens on
one 2-socket, 96C/192T Cascade Lake platform, although we've run
the same test on several different platforms. In theory,
shouldn't the false sharing take effect on those as well?

Thanks,
Feng

View attachment "c2c_wis_sig_32T.log" of type "text/plain" (173969 bytes)
