Message-Id: <20240813140007.2459882ce674b45ecf1403f7@linux-foundation.org>
Date: Tue, 13 Aug 2024 14:00:07 -0700
From: Andrew Morton <akpm@...ux-foundation.org>
To: Yu Zhao <yuzhao@...gle.com>
Cc: Muchun Song <muchun.song@...ux.dev>, linux-mm@...ck.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH mm-unstable v2] mm/hugetlb_vmemmap: batch HVO work when
 demoting

On Mon, 12 Aug 2024 16:48:23 -0600 Yu Zhao <yuzhao@...gle.com> wrote:

> Batch the HVO work, including de-HVO of the source and HVO of the
> destination hugeTLB folios, to speed up demotion.
> 
> After commit bd225530a4c7 ("mm/hugetlb_vmemmap: fix race with
> speculative PFN walkers"), each request of HVO or de-HVO, batched or
> not, invokes synchronize_rcu() once. For example, when not batched,
> demoting one 1GB hugeTLB folio to 512 2MB hugeTLB folios invokes
> synchronize_rcu() 513 times (1 de-HVO plus 512 HVO requests), whereas
> when batched, only twice (1 de-HVO plus 1 HVO request). And the
> performance difference between the two cases is significant, e.g.,
>   echo 2048kB >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote_size
>   time echo 100 >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote
> 
> Before this patch:
>   real     8m58.158s
>   user     0m0.009s
>   sys      0m5.900s
> 
> After this patch:
>   real     0m0.900s
>   user     0m0.000s
>   sys      0m0.851s

That's a large change.  I assume the now-fixed regression was of
similar magnitude?

> Note that this patch changes the behavior of the `demote` interface
> when de-HVO fails. Before, the interface aborts immediately upon
> failure; now, it tries to finish the entire batch, meaning it can make
> extra progress if the rest of the batch contains folios that do not
> need de-HVO.
> 
> Fixes: bd225530a4c7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")

Do we think we should add this to 6.10.x?  I do.
