Message-ID: <d2883450-1278-877e-e273-bda5a5728465@loongson.cn>
Date: Fri, 11 Mar 2022 17:51:54 +0800
From: maobibo <maobibo@...ngson.cn>
To: David Hildenbrand <david@...hat.com>,
Andrew Morton <akpm@...ux-foundation.org>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/khugepaged: sched to numa node when collapse huge page
On 03/11/2022 05:20 PM, David Hildenbrand wrote:
> On 11.03.22 10:01, Bibo Mao wrote:
>> collapse huge page is slow, specially when khugepaged daemon runs
>> on different numa node with that of huge page. It suffers from
>> huge page copying across nodes, also cache is not used for target
>> node. With this patch, khugepaged daemon switches to the same numa
>> node with huge page. It saves copying time and makes use of local
>> cache better.
>
> Hi,
>
> just the usual question, do you have any performance numbers to back
> your claims (e.g., "is slow, specially when") and proof that this patch
> does the trick?
With SPECint 2006 on a LoongArch 3C5000L 32-core NUMA system, the patch
improves performance by about 6%. The base page size is 16K and the PMD huge
page size is 32M, and memory performance differs noticeably across NUMA nodes.
However, I have not tested it on an x86 box.
>
>
>>
>> Signed-off-by: Bibo Mao <maobibo@...ngson.cn>
>> ---
>> mm/khugepaged.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 131492fd1148..460c285dc974 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -116,6 +116,7 @@ struct khugepaged_scan {
>> struct list_head mm_head;
>> struct mm_slot *mm_slot;
>> unsigned long address;
>> + int node;
>> };
>>
>> static struct khugepaged_scan khugepaged_scan = {
>> @@ -1066,6 +1067,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>> struct vm_area_struct *vma;
>> struct mmu_notifier_range range;
>> gfp_t gfp;
>> + const struct cpumask *cpumask;
>
> We tend to stick to reverse Christmas tree format as well as possible.
>
>>
>> VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>>
>> @@ -1079,6 +1081,13 @@ static void collapse_huge_page(struct mm_struct *mm,
>> * that. We will recheck the vma after taking it again in write mode.
>> */
>> mmap_read_unlock(mm);
>> +
>> + /* sched to specified node before huage page memory copy */
>
> s/huage/huge/
>
>> + cpumask = cpumask_of_node(node);
>> + if ((khugepaged_scan.node != node) && !cpumask_empty(cpumask)) {
>> + set_cpus_allowed_ptr(current, cpumask);
>> + khugepaged_scan.node = node;
>> + }
>> new_page = khugepaged_alloc_page(hpage, gfp, node);
>> if (!new_page) {
>> result = SCAN_ALLOC_HUGE_PAGE_FAIL;
>> @@ -2380,6 +2389,7 @@ int start_stop_khugepaged(void)
>> kthread_stop(khugepaged_thread);
>> khugepaged_thread = NULL;
>> }
>> + khugepaged_scan.node = NUMA_NO_NODE;
>> set_recommended_min_free_kbytes();
>> fail:
>> mutex_unlock(&khugepaged_mutex);
>
>