Message-ID: <12cba05a-e268-3a5d-69d7-feb00e36ef40@redhat.com>
Date: Thu, 15 Apr 2021 09:17:37 -0400
From: Waiman Long <llong@...hat.com>
To: Masayoshi Mizuma <msys.mizuma@...il.com>
Cc: Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Tejun Heo <tj@...nel.org>, Christoph Lameter <cl@...ux.com>,
Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Vlastimil Babka <vbabka@...e.cz>, Roman Gushchin <guro@...com>,
linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
linux-mm@...ck.org, Shakeel Butt <shakeelb@...gle.com>,
Muchun Song <songmuchun@...edance.com>,
Alex Shi <alex.shi@...ux.alibaba.com>,
Chris Down <chris@...isdown.name>,
Yafang Shao <laoar.shao@...il.com>,
Wei Yang <richard.weiyang@...il.com>,
Xing Zhengjun <zhengjun.xing@...ux.intel.com>
Subject: Re: [PATCH v3 0/5] mm/memcg: Reduce kmemcache memory accounting overhead
On 4/14/21 11:26 PM, Masayoshi Mizuma wrote:
>
> Hi Longman,
>
> Thank you for your patches.
> I reran the benchmark with your patches, and the reduction seems
> small. The total durations of the sendto() and recvfrom() system calls
> during the benchmark are as follows.
>
> - sendto
>   - v5.8 vanilla: 2576.056 msec (100%)
>   - v5.12-rc7 vanilla: 2988.911 msec (116%)
>   - v5.12-rc7 with your patches (1-5): 2984.307 msec (115%)
>
> - recvfrom
>   - v5.8 vanilla: 2113.156 msec (100%)
>   - v5.12-rc7 vanilla: 2305.810 msec (109%)
>   - v5.12-rc7 with your patches (1-5): 2287.351 msec (108%)
>
> kmem_cache_alloc()/kmem_cache_free() are called around 1,400,000 times during
> the benchmark. I also ran the following loop in a kernel module, and there
> your patches do reduce the duration.
>
> ---
> dummy_cache = KMEM_CACHE(dummy, SLAB_ACCOUNT);
> for (i = 0; i < 1400000; i++) {
>         p = kmem_cache_alloc(dummy_cache, GFP_KERNEL);
>         kmem_cache_free(dummy_cache, p);
> }
> ---
>
> - v5.12-rc7 vanilla: 110 msec (100%)
> - v5.12-rc7 with your patches (1-5): 85 msec (77%)
>
> It seems that the reduction is small for the benchmark though...
> Anyway, I can see your patches reduce the overhead.
> Please feel free to add:
>
> Tested-by: Masayoshi Mizuma <m.mizuma@...fujitsu.com>
>
> Thanks!
> Masa
>
Thanks for the testing.
I was focusing on your kernel module benchmark when testing my patches. I
will try out your pgbench benchmark to see whether any further tuning
can be done.
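To put the quoted numbers side by side, here is a minimal userspace sketch (not part of the patch set; the helper names are made up for illustration) that turns a before/after duration into an average saving per call and a relative percentage. It only restates arithmetic on the figures reported above.

```c
#include <assert.h>

/*
 * Hypothetical helper: average saving per call in nanoseconds, given
 * total durations in milliseconds and the number of calls.
 */
static double saving_ns_per_call(double before_msec, double after_msec,
                                 long calls)
{
        assert(calls > 0);
        /* 1 msec = 1e6 nsec */
        return (before_msec - after_msec) * 1e6 / (double)calls;
}

/* Hypothetical helper: "after" expressed as a percentage of "before". */
static double relative_percent(double before_msec, double after_msec)
{
        assert(before_msec > 0.0);
        return after_msec / before_msec * 100.0;
}
```

With the module-loop figures, saving_ns_per_call(110.0, 85.0, 1400000) comes to roughly 18 nsec per alloc/free pair, and relative_percent(110.0, 85.0) to about 77%, matching the percentage quoted above.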
BTW, how many NUMA nodes does your test machine have? I did my testing
with a 2-socket system. The vmstat caching part may be less effective on
systems with more NUMA nodes. I will try to find a larger 4-socket
system for testing.
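One way to check the node count from userspace is to read the online-node list from sysfs. The following is a minimal sketch, assuming sysfs is mounted at /sys and that /sys/devices/system/node/online holds a range list such as "0-1"; both function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Count the IDs in a sysfs-style range list such as "0-3" or "0,2-3".
 * Parsing stops at end of string or at a newline.
 * Returns -1 if the list is malformed.
 */
static int count_ids_in_list(const char *list)
{
        const char *p = list;
        int count = 0;

        assert(list != NULL);
        while (*p != '\0' && *p != '\n') {
                char *end;
                long first = strtol(p, &end, 10);
                long last = first;

                if (end == p || first < 0)
                        return -1;      /* no valid number here */
                p = end;
                if (*p == '-') {        /* a range "first-last" */
                        p++;
                        last = strtol(p, &end, 10);
                        if (end == p || last < first)
                                return -1;
                        p = end;
                }
                count += (int)(last - first + 1);
                if (*p == ',')
                        p++;
                else if (*p != '\0' && *p != '\n')
                        return -1;      /* unexpected separator */
        }
        return count;
}

/* Returns the number of online NUMA nodes, or -1 if sysfs is unavailable. */
static int count_online_numa_nodes(void)
{
        char buf[256];
        FILE *f = fopen("/sys/devices/system/node/online", "r");
        int n = -1;

        if (f == NULL)
                return -1;
        if (fgets(buf, sizeof(buf), f) != NULL)
                n = count_ids_in_list(buf);
        fclose(f);
        return n;
}
```

Calling count_online_numa_nodes() from a small main() and printing the result is enough for a quick check. On a kernel built without NUMA support the sysfs file may be absent, in which case the function returns -1.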
Cheers,
Longman