Date:   Thu, 2 Sep 2021 12:53:06 +0200
From:   Michal Koutný <mkoutny@...e.com>
To:     Feng Tang <feng.tang@...el.com>
Cc:     Andi Kleen <ak@...ux.intel.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        andi.kleen@...el.com, kernel test robot <oliver.sang@...el.com>,
        Roman Gushchin <guro@...com>, Michal Hocko <mhocko@...e.com>,
        Shakeel Butt <shakeelb@...gle.com>,
        Balbir Singh <bsingharora@...il.com>,
        Tejun Heo <tj@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        kernel test robot <lkp@...el.com>,
        "Huang, Ying" <ying.huang@...el.com>,
        Zhengjun Xing <zhengjun.xing@...ux.intel.com>
Subject: Re: [mm] 2d146aa3aa: vm-scalability.throughput -36.4% regression

Hi.

On Thu, Sep 02, 2021 at 11:46:28AM +0800, Feng Tang <feng.tang@...el.com> wrote:
> > Narrowing it down to a single prefetcher seems good enough to me. The
> > behavior of the prefetchers is fairly complicated and hard to predict, so I
> > doubt you'll ever get a 100% step by step explanation.
 
My layman's explanation, based on the available information, is that
the prefetcher somehow behaves as if it marked the offending cacheline
as modified (even though it only reads it), thereby slowing down the
remote reader.


On Thu, Sep 02, 2021 at 09:35:58AM +0800, Feng Tang <feng.tang@...el.com> wrote:
> @@ -139,10 +139,21 @@ struct cgroup_subsys_state {
>       /* PI: the cgroup that this css is attached to */
>       struct cgroup *cgroup;
>
> +     struct cgroup_subsys_state *parent;
> +
>       /* PI: the cgroup subsystem that this css is attached to */
>       struct cgroup_subsys *ss;

Hm, an interesting move; be mindful of commit b8b1a2e5eca6 ("cgroup:
move cgroup_subsys_state parent field for cache locality"). It might be
a regression for systems with the cpuacct root css present. (That is
likely a large share of systems nowadays, which may be the reason why
you don't see a full recovery? In the future, we could at least guard
cpuacct_charge() with the cgroup_subsys_enabled() static branch, as in
the sketch below.)
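
For illustration, a minimal sketch of what such a guard could look
like (the placement of the early return and the comment are my
assumptions, not an actual patch; cgroup_subsys_enabled() and
cpuacct_charge() are the existing kernel symbols):

	/* kernel/sched/cpuacct.c (sketch only) */
	void cpuacct_charge(struct task_struct *tsk, u64 cputime)
	{
		/*
		 * Bail out before touching any css cachelines when the
		 * cpuacct controller is compiled in but not in use; the
		 * static branch makes the disabled case nearly free.
		 */
		if (!cgroup_subsys_enabled(cpuacct_cgrp_subsys))
			return;

		/* ... existing accounting walk via task_ca(tsk) ... */
	}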

> [snip]
> Yes, I'm afraid so, given that the policy/algorithm used by the
> prefetcher keeps changing from generation to generation.

Exactly. I'm afraid of re-laying out the structure with each new
generation. Would a robust solution be to put each frequently accessed
member into its own cache line, separated by one more cache line of
padding (rough sketch below)? :-/
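
As a rough sketch of that layout (member names taken from the hunk
above; which members count as "hot" and the exact padding are
illustrative assumptions, not measurements):

	/* include/linux/cgroup-defs.h (sketch only) */
	struct cgroup_subsys_state {
		/* each frequently read member on its own cache line */
		struct cgroup *cgroup ____cacheline_aligned_in_smp;
		struct cgroup_subsys_state *parent ____cacheline_aligned_in_smp;

		/*
		 * Spacer so an adjacent-cacheline prefetcher does not
		 * drag the following (written) members into the remote
		 * reader's cache.
		 */
		char __pad[SMP_CACHE_BYTES];

		/* ... colder members follow ... */
	};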


Michal
