Message-ID: <b21bc991-b375-82d8-46f3-a5a9779b79c9@linux.alibaba.com>
Date: Tue, 25 Jun 2019 15:33:40 -0700
From: Yang Shi <yang.shi@...ux.alibaba.com>
To: Andrew Morton <akpm@...ux-foundation.org>
Cc: ktkhai@...tuozzo.com, kirill.shutemov@...ux.intel.com,
hannes@...xchg.org, mhocko@...e.com, hughd@...gle.com,
shakeelb@...gle.com, rientjes@...gle.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [v3 PATCH 4/4] mm: thp: make deferred split shrinker memcg aware
On 6/25/19 3:00 PM, Andrew Morton wrote:
> On Thu, 13 Jun 2019 05:56:49 +0800 Yang Shi <yang.shi@...ux.alibaba.com> wrote:
>
>> Currently the THP deferred split shrinker is not memcg aware, which may
>> cause premature OOM with some configurations. For example, the test below
>> runs into premature OOM easily:
>>
>> $ cgcreate -g memory:thp
>> $ echo 4G > /sys/fs/cgroup/memory/thp/memory.limit_in_bytes
>> $ cgexec -g memory:thp transhuge-stress 4000
>>
>> transhuge-stress comes from the kernel selftests.
>>
>> It is easy to hit OOM while there are still a lot of THPs on the deferred
>> split queue. Memcg direct reclaim can't touch them since the deferred
>> split shrinker is not memcg aware.
>>
>> Make the deferred split shrinker memcg aware by introducing a per-memcg
>> deferred split queue. A THP is on the per-memcg deferred split queue if
>> it belongs to a memcg, and on the per-node queue otherwise. When the
>> page is migrated to another memcg, it is moved to the target memcg's
>> deferred split queue too.
>>
>> Reuse the second tail page's deferred_list for the per-memcg list, since
>> the same THP can't be on multiple deferred split queues.
>>
>> ...
>>
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -4579,6 +4579,11 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
>>  #ifdef CONFIG_CGROUP_WRITEBACK
>>  	INIT_LIST_HEAD(&memcg->cgwb_list);
>>  #endif
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +	spin_lock_init(&memcg->deferred_split_queue.split_queue_lock);
>> +	INIT_LIST_HEAD(&memcg->deferred_split_queue.split_queue);
>> +	memcg->deferred_split_queue.split_queue_len = 0;
>> +#endif
>>  	idr_replace(&mem_cgroup_idr, memcg, memcg->id.id);
>>  	return memcg;
>>  fail:
>> @@ -4949,6 +4954,14 @@ static int mem_cgroup_move_account(struct page *page,
>>  		__mod_memcg_state(to, NR_WRITEBACK, nr_pages);
>>  	}
>>
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +	if (compound && !list_empty(page_deferred_list(page))) {
>> +		spin_lock(&from->deferred_split_queue.split_queue_lock);
>> +		list_del(page_deferred_list(page));
> It's worrisome that this page still appears to be on the deferred_list
> and that the above if() would still succeed. Should this be
> list_del_init()?
list_del_init() sounds safer, although I'm not sure this case can
actually happen. I will update this together with the build fix.
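
For illustration only, here is a minimal userspace sketch of the difference
being discussed. It is not kernel code: the list helpers are simplified
re-implementations, the poison values are placeholders, and
still_looks_queued() is a hypothetical helper. It shows that after a plain
list_del() a !list_empty() check on the removed entry still succeeds, while
after list_del_init() it does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Placeholder poison values (assumption); never dereferenced here. */
#define POISON1 ((struct list_head *)(uintptr_t)0x100)
#define POISON2 ((struct list_head *)(uintptr_t)0x122)

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* A node counts as "empty" only when it points back at itself. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add(struct list_head *entry, struct list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

static void unlink_entry(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Unlink and poison: the entry no longer points at itself. */
static void list_del(struct list_head *entry)
{
	unlink_entry(entry);
	entry->next = POISON1;
	entry->prev = POISON2;
}

/* Unlink and reinitialize: the entry points at itself again. */
static void list_del_init(struct list_head *entry)
{
	unlink_entry(entry);
	init_list_head(entry);
}

/*
 * Hypothetical helper: queue one entry, remove it, and report whether a
 * !list_empty(entry) check would still treat it as queued.
 */
bool still_looks_queued(bool use_del_init)
{
	struct list_head queue, entry;

	init_list_head(&queue);
	init_list_head(&entry);
	list_add(&entry, &queue);
	assert(!list_empty(&queue));

	if (use_del_init)
		list_del_init(&entry);
	else
		list_del(&entry);

	assert(list_empty(&queue));	/* the queue itself is empty either way */
	return !list_empty(&entry);
}
```

In this sketch still_looks_queued(false) returns true and
still_looks_queued(true) returns false, which is why list_del_init() is the
safer choice when the entry is later tested with list_empty().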
>
>> +		from->deferred_split_queue.split_queue_len--;
>> +		spin_unlock(&from->deferred_split_queue.split_queue_lock);
>> +	}
>> +#endif