Message-ID: <YHSXvQVvzHu26u7H@carbon.dhcp.thefacebook.com>
Date: Mon, 12 Apr 2021 11:55:57 -0700
From: Roman Gushchin <guro@...com>
To: Waiman Long <longman@...hat.com>
CC: Johannes Weiner <hannes@...xchg.org>,
Michal Hocko <mhocko@...nel.org>,
Vladimir Davydov <vdavydov.dev@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Tejun Heo <tj@...nel.org>, Christoph Lameter <cl@...ux.com>,
Pekka Enberg <penberg@...nel.org>,
David Rientjes <rientjes@...gle.com>,
Joonsoo Kim <iamjoonsoo.kim@....com>,
Vlastimil Babka <vbabka@...e.cz>,
<linux-kernel@...r.kernel.org>, <cgroups@...r.kernel.org>,
<linux-mm@...ck.org>, Shakeel Butt <shakeelb@...gle.com>,
Muchun Song <songmuchun@...edance.com>,
Alex Shi <alex.shi@...ux.alibaba.com>,
Chris Down <chris@...isdown.name>,
Yafang Shao <laoar.shao@...il.com>,
Alexander Duyck <alexander.h.duyck@...ux.intel.com>,
Wei Yang <richard.weiyang@...il.com>,
Masayoshi Mizuma <msys.mizuma@...il.com>
Subject: Re: [PATCH 5/5] mm/memcg: Optimize user context object stock access
On Fri, Apr 09, 2021 at 07:18:42PM -0400, Waiman Long wrote:
> Most kmem_cache_alloc() calls are from user context. With instrumentation
> enabled, the measured amount of kmem_cache_alloc() calls from non-task
> context was about 0.01% of the total.
>
> The irq disable/enable sequence used in this case to access content
> from object stock is slow. To optimize for user context access, there
> are now two object stocks for task context and interrupt context access
> respectively.
>
> The task context object stock can be accessed after disabling preemption,
> which is cheap in a non-preempt kernel. The interrupt context object stock
> can only be accessed after disabling interrupts. User context code can
> access the interrupt object stock, but not vice versa.
>
> The mod_objcg_state() function is also modified to make sure that memcg
> and lruvec stat updates are done with interrupts disabled.
>
> The downside of this change is that there is more data stored in local
> object stocks and not reflected in the charge counter and the vmstat
> arrays. However, this is a small price to pay for better performance.
I agree, the extra memory space is not a significant concern.
I'd be more worried about the code complexity, but the result looks
nice to me!
Acked-by: Roman Gushchin <guro@...com>
Btw, it seems that the mm tree has diverged a bit, so I had to apply this
series on top of Linus's tree to review it. Please rebase.
Thanks!