Message-Id: <20190625150040.feb6ea9d11fff73a57320a3c@linux-foundation.org>
Date: Tue, 25 Jun 2019 15:00:40 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Yang Shi <yang.shi@...ux.alibaba.com>
Cc: ktkhai@...tuozzo.com, kirill.shutemov@...ux.intel.com,
hannes@...xchg.org, mhocko@...e.com, hughd@...gle.com,
shakeelb@...gle.com, rientjes@...gle.com, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [v3 PATCH 4/4] mm: thp: make deferred split shrinker memcg aware
On Thu, 13 Jun 2019 05:56:49 +0800 Yang Shi <yang.shi@...ux.alibaba.com> wrote:
> Currently the THP deferred split shrinker is not memcg aware, which may
> cause premature OOM with some configurations. For example, the test below
> runs into premature OOM easily:
>
> $ cgcreate -g memory:thp
> $ echo 4G > /sys/fs/cgroup/memory/thp/memory.limit_in_bytes
> $ cgexec -g memory:thp transhuge-stress 4000
>
> transhuge-stress comes from the kernel selftests.
>
> It is easy to hit OOM while there are still a lot of THPs on the deferred
> split queue; memcg direct reclaim can't touch them since the deferred
> split shrinker is not memcg aware.
>
> Make the deferred split shrinker memcg aware by introducing a per-memcg
> deferred split queue. A THP should be on the per-memcg deferred split
> queue if it belongs to a memcg, and on the per-node queue otherwise. When
> the page is migrated to another memcg, it is moved to the target memcg's
> deferred split queue too.
>
> Reuse the second tail page's deferred_list for the per-memcg list since
> the same THP can't be on multiple deferred split queues.
>
> ...
>
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -4579,6 +4579,11 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
> #ifdef CONFIG_CGROUP_WRITEBACK
> INIT_LIST_HEAD(&memcg->cgwb_list);
> #endif
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + spin_lock_init(&memcg->deferred_split_queue.split_queue_lock);
> + INIT_LIST_HEAD(&memcg->deferred_split_queue.split_queue);
> + memcg->deferred_split_queue.split_queue_len = 0;
> +#endif
> idr_replace(&mem_cgroup_idr, memcg, memcg->id.id);
> return memcg;
> fail:
> @@ -4949,6 +4954,14 @@ static int mem_cgroup_move_account(struct page *page,
> __mod_memcg_state(to, NR_WRITEBACK, nr_pages);
> }
>
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> + if (compound && !list_empty(page_deferred_list(page))) {
> + spin_lock(&from->deferred_split_queue.split_queue_lock);
> + list_del(page_deferred_list(page));

It's worrisome that after this list_del() the page still appears to be on
the deferred_list, so the above if() would still succeed. Should this be
list_del_init()?

> + from->deferred_split_queue.split_queue_len--;
> + spin_unlock(&from->deferred_split_queue.split_queue_lock);
> + }
> +#endif
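
For reference, the list_del() versus list_del_init() concern above can be
illustrated with a minimal userspace sketch. This is not the kernel code: the
demo_* helpers are hypothetical, simplified imitations of the helpers in
include/linux/list.h, and the poison values are placeholders chosen only for
illustration. It assumes the check in question is list_empty() on the entry
itself, as in the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace imitation of the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Placeholder poison pointers (illustrative values, never dereferenced). */
#define DEMO_POISON1 ((struct list_head *)0x100)
#define DEMO_POISON2 ((struct list_head *)0x200)

static void demo_init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* An entry counts as "empty" (not queued) only if it points at itself. */
static bool demo_list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void demo_list_add_tail(struct list_head *entry, struct list_head *head)
{
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

static void demo_unlink(struct list_head *entry)
{
	entry->next->prev = entry->prev;
	entry->prev->next = entry->next;
}

/* Like list_del(): unlink, then poison the entry's own pointers. */
static void demo_list_del(struct list_head *entry)
{
	demo_unlink(entry);
	entry->next = DEMO_POISON1;
	entry->prev = DEMO_POISON2;
}

/* Like list_del_init(): unlink, then make the entry point at itself. */
static void demo_list_del_init(struct list_head *entry)
{
	demo_unlink(entry);
	demo_init_list_head(entry);
}

/*
 * Queue one entry, remove it with the chosen helper, and report what
 * demo_list_empty() says about the removed entry afterwards.
 */
static bool entry_empty_after_removal(bool use_del_init)
{
	struct list_head queue, entry;

	demo_init_list_head(&queue);
	demo_init_list_head(&entry);
	demo_list_add_tail(&entry, &queue);

	if (use_del_init)
		demo_list_del_init(&entry);
	else
		demo_list_del(&entry);

	assert(demo_list_empty(&queue));	/* the queue itself is empty either way */
	return demo_list_empty(&entry);
}
```

In this sketch entry_empty_after_removal(false) returns false: the removed
entry no longer points at itself, so a later !list_empty() test on it would
still succeed. entry_empty_after_removal(true) returns true, which is the
behaviour the list_del_init() suggestion is after.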