Message-ID: <20211126031748.GA11450@xsang-OptiPlex-9020>
Date: Fri, 26 Nov 2021 11:17:48 +0800
From: Oliver Sang <oliver.sang@...el.com>
To: Michal Hocko <mhocko@...e.com>
Cc: Shakeel Butt <shakeelb@...gle.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Arnd Bergmann <arnd@...db.de>, Roman Gushchin <guro@...com>,
Muchun Song <songmuchun@...edance.com>,
Vasily Averin <vvs@...tuozzo.com>,
Johannes Weiner <hannes@...xchg.org>,
Andrew Morton <akpm@...ux-foundation.org>,
LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
lkp@...el.com, ying.huang@...el.com, feng.tang@...el.com,
zhengjun.xing@...ux.intel.com, fengwei.yin@...el.com
Subject: Re: [memcg, kmem] 58056f7750: hackbench.throughput 10.3%
improvement
Hi Michal Hocko,
On Wed, Nov 24, 2021 at 06:01:12PM +0100, Michal Hocko wrote:
> On Wed 24-11-21 16:34:35, kernel test robot wrote:
> >
> >
> > Greeting,
> >
> > FYI, we noticed a 10.3% improvement of hackbench.throughput due to commit:
> >
> >
> > commit: 58056f77502f3567b760c9a8fc8d2e9081515b2d ("memcg, kmem: further deprecate kmem.limit_in_bytes")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> I am really surprised to see an improvement from this patch. I do not
> expect your benchmarking would be using kmem limit. The above patch
> hasn't really removed the page counter out of the picture so there
> shouldn't be any real reason for performance improvement. I strongly
> suspect this is just some benchmark artifact or unreliable evaluation.
Fengwei Yin helped analyze this improvement further.
The patch changed the behavior of the function obj_cgroup_charge_pages(). This
shows up in the perf call stack in the following line:
5.63 ± 11% -5.6 0.00 perf-profile.calltrace.cycles-pp.page_counter_try_charge.obj_cgroup_charge_pages.obj_cgroup_charge.kmem_cache_alloc_node.__alloc_skb
So Fengwei prepared a patch that reverts the changes to
obj_cgroup_charge_pages() made in 58056f7750 (attached as mod.patch).
With this patch applied, the performance is similar to 16f6bf266c and the
improvement disappears.
=========================================================================================
compiler/cpufreq_governor/ipc/iterations/kconfig/mode/nr_threads/rootfs/tbox_group/testcase/ucode:
gcc-9/performance/socket/4/x86_64-rhel-8.3/process/100%/debian-10.4-x86_64-20200603.cgz/lkp-cpl-4sp1/hackbench/0x700001e
commit:
16f6bf266c ("mm/list_lru.c: prefer struct_size over open coded arithmetic")
58056f7750 ("memcg, kmem: further deprecate kmem.limit_in_bytes")
ae12af515d ('58056f7750' minus 'changes in obj_cgroup_charge_pages', attached mod.patch)
16f6bf266c94017c 58056f77502f3567b760c9a8fc8 ae12af515da0d557c25f86e89b0
---------------- --------------------------- ---------------------------
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \
    124966            +8.8%     136017 ±  2%      -0.1%     124791 ±  2%  hackbench.throughput
...
      5.41 ± 12%      -5.4        0.00            +0.3        5.73 ± 13%  perf-profile.calltrace.cycles-pp.page_counter_try_charge.obj_cgroup_charge_pages.obj_cgroup_charge.kmem_cache_alloc_node.__alloc_skb
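As a quick sanity check of the %change columns, the relative throughput change against the 16f6bf266c baseline can be recomputed from the three hackbench.throughput values in the table. This is only an illustrative sketch: the helper name pct_change is made up here and is not part of the lkp tooling.

```python
# Recompute the %change columns from the hackbench.throughput values
# reported in the table above. pct_change is a hypothetical helper,
# not part of the lkp tooling.

def pct_change(base, new):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0

baseline = 124966      # 16f6bf266c
with_commit = 136017   # 58056f7750
with_revert = 124791   # ae12af515d (58056f7750 plus mod.patch)

print(f"58056f7750 vs 16f6bf266c: {pct_change(baseline, with_commit):+.1f}%")
print(f"ae12af515d vs 16f6bf266c: {pct_change(baseline, with_revert):+.1f}%")
```

The two printed values match the +8.8% and -0.1% shown in the table.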
Detailed comparison data is attached as 16f6b-58056-ae12a.
In brief, the results confirm what we suspected. The original patch removed the code
- !page_counter_try_charge(&memcg->kmem, nr_pages, &counter)) {
which improved the hackbench throughput. Thanks.
> --
> Michal Hocko
> SUSE Labs
View attachment "mod.patch" of type "text/x-diff" (4959 bytes)
View attachment "16f6b-58056-ae12a" of type "text/plain" (715164 bytes)