Date:   Tue, 29 Dec 2020 09:13:27 -0800
From:   Roman Gushchin <guro@...com>
To:     Feng Tang <feng.tang@...el.com>
CC:     Andrew Morton <akpm@...ux-foundation.org>,
        Michal Hocko <mhocko@...e.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        <linux-mm@...ck.org>, <linux-kernel@...r.kernel.org>,
        <andi.kleen@...el.com>, <tim.c.chen@...el.com>,
        <dave.hansen@...el.com>, <ying.huang@...el.com>,
        Shakeel Butt <shakeelb@...gle.com>
Subject: Re: [PATCH 2/2] mm: memcg: add a new MEMCG_UPDATE_BATCH

On Tue, Dec 29, 2020 at 10:35:14PM +0800, Feng Tang wrote:
> When profiling benchmarks that involve memory cgroups, stats updates
> sometimes take quite a few CPU cycles. Currently MEMCG_CHARGE_BATCH
> is used for both charging and statistics/events updating, and is
> set to 32, which may be good for the accuracy of memcg charging, but
> is too small for stats updates, as it causes concurrent access to
> global stats data instead of per-cpu ones.
> 
> So handle them differently, by adding a new, bigger batch number
> for stats updating, while keeping the value for charging (though
> comments in memcontrol.h suggest considering a bigger value too)
> 
> The new batch is set to 512, which considers 2MB huge pages (512
> pages), as the check logic mostly is:
> 
>     if (x > BATCH), then skip updating global data
> 
> so it will save 50% global data updating for 2MB pages
> 
> Following are some performance data with the patch, against
> v5.11-rc1, on several generations of Xeon platforms. Each category
> below has several subcases run on different platform, and only the
> worst and best scores are listed:
> 
> fio:				 +2.0% ~  +6.8%
> will-it-scale/malloc:		 -0.9% ~  +6.2%
> will-it-scale/page_fault1:	 no change
> will-it-scale/page_fault2:	+13.7% ~ +26.2%

I wonder whether there are any noticeable wins in real-world workloads?
Lowering the accuracy of the statistics makes them harder to interpret,
so it should be very well justified.

512 * nr_cpus is a large number.

> 
> One thought is it could be dynamically calculated according to
> memcg limit and number of CPUs, and another is to add a periodic
> syncing of the data for accuracy reason similar to vmstat, as
> suggested by Ying.

It sounds good to me, but it's quite tricky to implement properly,
given that the number of cgroups can be really big. It makes
traversing the whole cgroup tree and syncing stats quite expensive,
so it will not be easy to find a good balance.

Thanks!
