lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <b21bc991-b375-82d8-46f3-a5a9779b79c9@linux.alibaba.com>
Date:   Tue, 25 Jun 2019 15:33:40 -0700
From:   Yang Shi <yang.shi@...ux.alibaba.com>
To:     Andrew Morton <akpm@...ux-foundation.org>
Cc:     ktkhai@...tuozzo.com, kirill.shutemov@...ux.intel.com,
        hannes@...xchg.org, mhocko@...e.com, hughd@...gle.com,
        shakeelb@...gle.com, rientjes@...gle.com, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [v3 PATCH 4/4] mm: thp: make deferred split shrinker memcg aware



On 6/25/19 3:00 PM, Andrew Morton wrote:
> On Thu, 13 Jun 2019 05:56:49 +0800 Yang Shi <yang.shi@...ux.alibaba.com> wrote:
>
>> Currently THP deferred split shrinker is not memcg aware, this may cause
>> premature OOM with some configuration. For example the below test would
>> run into premature OOM easily:
>>
>> $ cgcreate -g memory:thp
>> $ echo 4G > /sys/fs/cgroup/memory/thp/memory/limit_in_bytes
>> $ cgexec -g memory:thp transhuge-stress 4000
>>
>> transhuge-stress comes from kernel selftest.
>>
>> It is easy to hit OOM, but there are still a lot THP on the deferred
>> split queue, memcg direct reclaim can't touch them since the deferred
>> split shrinker is not memcg aware.
>>
>> Convert deferred split shrinker memcg aware by introducing per memcg
>> deferred split queue.  The THP should be on either per node or per memcg
>> deferred split queue if it belongs to a memcg.  When the page is
>> immigrated to the other memcg, it will be immigrated to the target
>> memcg's deferred split queue too.
>>
>> Reuse the second tail page's deferred_list for per memcg list since the
>> same THP can't be on multiple deferred split queues.
>>
>> ...
>>
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -4579,6 +4579,11 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
>>   #ifdef CONFIG_CGROUP_WRITEBACK
>>   	INIT_LIST_HEAD(&memcg->cgwb_list);
>>   #endif
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +	spin_lock_init(&memcg->deferred_split_queue.split_queue_lock);
>> +	INIT_LIST_HEAD(&memcg->deferred_split_queue.split_queue);
>> +	memcg->deferred_split_queue.split_queue_len = 0;
>> +#endif
>>   	idr_replace(&mem_cgroup_idr, memcg, memcg->id.id);
>>   	return memcg;
>>   fail:
>> @@ -4949,6 +4954,14 @@ static int mem_cgroup_move_account(struct page *page,
>>   		__mod_memcg_state(to, NR_WRITEBACK, nr_pages);
>>   	}
>>   
>> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>> +	if (compound && !list_empty(page_deferred_list(page))) {
>> +		spin_lock(&from->deferred_split_queue.split_queue_lock);
>> +		list_del(page_deferred_list(page));
> It's worrisome that this page still appears to be on the deferred_list
> and that the above if() would still succeed.  Should this be
> list_del_init()?

list_del_init() sounds safe although I'm not quite sure this is 
possible. Will update this with fixing build issue together.

>
>> +		from->deferred_split_queue.split_queue_len--;
>> +		spin_unlock(&from->deferred_split_queue.split_queue_lock);
>> +	}
>> +#endif

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ