Message-ID: <CAHbLzkoiP+uRYGDm+FC_zg-LkPbTMFQ-wSzGMh0RPr-XP4_ciw@mail.gmail.com>
Date:   Fri, 11 Mar 2022 11:08:32 -0800
From:   Yang Shi <shy828301@...il.com>
To:     Bibo Mao <maobibo@...ngson.cn>
Cc:     Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm/khugepaged: sched to numa node when collapse huge page

On Fri, Mar 11, 2022 at 1:01 AM Bibo Mao <maobibo@...ngson.cn> wrote:
>
> Collapsing a huge page is slow, especially when the khugepaged daemon
> runs on a different NUMA node from the huge page. It suffers from
> copying the huge page across nodes, and the target node's cache is
> not used. With this patch, the khugepaged daemon switches to the same
> NUMA node as the huge page. This saves copying time and makes better
> use of the local cache.
>
> Signed-off-by: Bibo Mao <maobibo@...ngson.cn>
> ---
>  mm/khugepaged.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 131492fd1148..460c285dc974 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -116,6 +116,7 @@ struct khugepaged_scan {
>         struct list_head mm_head;
>         struct mm_slot *mm_slot;
>         unsigned long address;
> +       int node;
>  };
>
>  static struct khugepaged_scan khugepaged_scan = {
> @@ -1066,6 +1067,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>         struct vm_area_struct *vma;
>         struct mmu_notifier_range range;
>         gfp_t gfp;
> +       const struct cpumask *cpumask;
>
>         VM_BUG_ON(address & ~HPAGE_PMD_MASK);
>
> @@ -1079,6 +1081,13 @@ static void collapse_huge_page(struct mm_struct *mm,
>          * that. We will recheck the vma after taking it again in write mode.
>          */
>         mmap_read_unlock(mm);
> +
> +       /* sched to specified node before huge page memory copy */
> +       cpumask = cpumask_of_node(node);
> +       if ((khugepaged_scan.node != node) && !cpumask_empty(cpumask)) {
> +               set_cpus_allowed_ptr(current, cpumask);
> +               khugepaged_scan.node = node;

What if khugepaged gets scheduled onto another node after this, while
khugepaged_scan.node still equals node? That seems possible to me,
IIUC.
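
To make the concern concrete, here is a toy userspace model, not kernel
code. Every name in it (toy_thread, pin_if_cache_differs, external_move,
collapse_runs_remote) is hypothetical, and it assumes that something
outside collapse_huge_page() can move the thread, for instance if its
affinity were changed by something else. It only shows that a cached
node value can go stale:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NO_NODE (-1)

/* Hypothetical stand-in for the khugepaged thread. */
struct toy_thread {
	int running_node;	/* node the thread really runs on */
	int cached_node;	/* plays the role of khugepaged_scan.node */
};

/*
 * Mirrors the check in the patch: re-pin only when the cached node
 * differs from the target. Returns true if the thread was re-pinned.
 */
static bool pin_if_cache_differs(struct toy_thread *t, int node)
{
	assert(node >= 0);
	if (t->cached_node != node) {
		t->running_node = node;
		t->cached_node = node;
		return true;
	}
	return false;
}

/* Assumption: some external actor moves the thread to another node. */
static void external_move(struct toy_thread *t, int node)
{
	t->running_node = node;
}

/* Returns true when a collapse targeting 'node' would run remotely. */
static bool collapse_runs_remote(struct toy_thread *t, int node)
{
	pin_if_cache_differs(t, node);
	return t->running_node != node;
}
```

In this model the first collapse for node 1 pins the thread and runs
locally. If the thread is then moved back to node 0 externally, the
next collapse for node 1 skips the re-pin because the cached value
still matches, and the copy runs remotely.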

TBH I'm not sure whether migrating khugepaged is really worth it for
everyone. In the worst case the base pages have no clear locality, for
example they may be spread across all nodes, so you always get
cross-node memory copies. And khugepaged may get slower if the CPU is
contended.
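
A rough way to see the worst case, again as a toy userspace sketch
rather than kernel code: assume we have a per-node count of the base
pages under the PMD, and that the target node is simply the node with
the most pages. The helper names and the lowest-id tie-break are my
own simplifications for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pick the node holding the most base pages. Ties go to the lowest
 * node id here, which is a simplification for this sketch.
 */
static int toy_target_node(const int load[], size_t nr_nodes)
{
	int best = 0;

	assert(nr_nodes > 0);
	for (size_t n = 1; n < nr_nodes; n++) {
		if (load[n] > load[best])
			best = (int)n;
	}
	return best;
}

/* Count base pages that must be copied from a node other than the target. */
static int toy_remote_copies(const int load[], size_t nr_nodes)
{
	int total = 0;

	for (size_t n = 0; n < nr_nodes; n++)
		total += load[n];
	return total - load[toy_target_node(load, nr_nodes)];
}
```

With 512 base pages spread evenly over four nodes (128 each), 384 of
the 512 copies are remote no matter which node khugepaged runs on.
With all 512 pages on one node, none are.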

In addition, I see that MIPS has its own copy_user_highpage(). Is that
a contributing factor too?

> +       }
>         new_page = khugepaged_alloc_page(hpage, gfp, node);
>         if (!new_page) {
>                 result = SCAN_ALLOC_HUGE_PAGE_FAIL;
> @@ -2380,6 +2389,7 @@ int start_stop_khugepaged(void)
>                 kthread_stop(khugepaged_thread);
>                 khugepaged_thread = NULL;
>         }
> +       khugepaged_scan.node = NUMA_NO_NODE;
>         set_recommended_min_free_kbytes();
>  fail:
>         mutex_unlock(&khugepaged_mutex);
> --
> 2.31.1
>
>
