Message-ID: <a31c7c6c-ea82-b690-3504-2133178efdaa@redhat.com>
Date:   Tue, 3 Nov 2020 21:46:26 -0500
From:   Waiman Long <longman@...hat.com>
To:     Xing Zhengjun <zhengjun.xing@...ux.intel.com>,
        Michal Hocko <mhocko@...e.com>,
        Rong Chen <rong.a.chen@...el.com>
Cc:     Linus Torvalds <torvalds@...ux-foundation.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Shakeel Butt <shakeelb@...gle.com>,
        Chris Down <chris@...isdown.name>,
        Johannes Weiner <hannes@...xchg.org>,
        Roman Gushchin <guro@...com>, Tejun Heo <tj@...nel.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Yafang Shao <laoar.shao@...il.com>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        lkp@...el.com, zhengjun.xing@...el.com
Subject: Re: [LKP] Re: [mm/memcg] bd0b230fe1: will-it-scale.per_process_ops
 -22.7% regression

On 11/3/20 8:20 PM, Xing Zhengjun wrote:
>
>
> On 11/2/2020 6:02 PM, Michal Hocko wrote:
>> On Mon 02-11-20 17:53:14, Rong Chen wrote:
>>>
>>>
>>> On 11/2/20 5:27 PM, Michal Hocko wrote:
>>>> On Mon 02-11-20 17:15:43, kernel test robot wrote:
>>>>> Greeting,
>>>>>
>>>>> FYI, we noticed a -22.7% regression of will-it-scale.per_process_ops
>>>>> due to commit:
>>>>>
>>>>>
>>>>> commit: bd0b230fe14554bfffbae54e19038716f96f5a41
>>>>> ("mm/memcg: unify swap and memsw page counters")
>>>>> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>>>> I really fail to see how this can be anything else than a data structure
>>>> layout change. There is one counter less.
>>>>
>>>> btw. are cgroups configured at all? What would be the configuration?
>>>
>>> Hi Michal,
>>>
>>> We used the default cgroup configuration; I am not sure which
>>> configuration you want. Could you give me more details? Here is the
>>> cgroup info of the will-it-scale process:
>>>
>>> $ cat /proc/3042/cgroup
>>> 12:hugetlb:/
>>> 11:memory:/system.slice/lkp-bootstrap.service
>>
>> OK, this means that the memory controller is enabled and in use. Btw. do
>> you get the original performance if you add one phony page_counter after
>> the union?
>>
> I added one phony page_counter after the union and re-tested; the
> regression was reduced to -1.2%. It looks like the regression is caused
> by the data structure layout change.

So it looks like the regression is caused by false sharing of a cacheline
between two or more hot items in mem_cgroup. As the size of a page_counter
is 112 bytes (112 = 128 - 16), eliminating one counter moves every field
after it 112 bytes earlier, so the cacheline boundaries shift by 16 bytes
relative to those fields. We probably need to use perf to find out what
those hot items are for this particular benchmark.

Cheers,
Longman
