Message-ID: <b550e766-b0fd-2c99-c82f-80e770e8a496@oracle.com>
Date:   Thu, 28 Jan 2021 13:53:15 -0800
From:   Mike Kravetz <mike.kravetz@...cle.com>
To:     Joao Martins <joao.m.martins@...cle.com>, linux-mm@...ck.org
Cc:     linux-kernel@...r.kernel.org,
        Andrew Morton <akpm@...ux-foundation.org>
Subject: Re: [PATCH v2 2/2] mm/hugetlb: refactor subpage recording

On 1/28/21 10:26 AM, Joao Martins wrote:
> For a given hugepage backing a VA, there's a rather inefficient
> loop which is solely responsible for storing subpages in the GUP
> @pages/@...s array. For each subpage we check whether it's within
> range or size of @pages and keep incrementing @pfn_offset and a couple
> of other variables per subpage iteration.
> 
> Simplify this logic and minimize the cost of each iteration to just
> store the output page/vma. Instead of incrementing number of @refs
> iteratively, we do it through pre-calculation of @refs and only
> with a tight loop for storing pinned subpages/vmas.
> 
> Additionally, retain existing behaviour with using mem_map_offset()
> when recording the subpages for configurations that don't have a
> contiguous mem_map.
> 
> Pinning performance consequently improves, bringing us close to
> {pin,get}_user_pages_fast:
> 
>   - 16G with 1G huge page size
>   gup_test -f /mnt/huge/file -m 16384 -r 30 -L -S -n 512 -w
> 
> PIN_LONGTERM_BENCHMARK: ~12.8k us -> ~5.8k us
> PIN_FAST_BENCHMARK: ~3.7k us
> 
> Signed-off-by: Joao Martins <joao.m.martins@...cle.com>
> ---
>  mm/hugetlb.c | 49 ++++++++++++++++++++++++++++---------------------
>  1 file changed, 28 insertions(+), 21 deletions(-)

Thanks for updating this.

Reviewed-by: Mike Kravetz <mike.kravetz@...cle.com>

I think there still is an open general question about whether we can always
assume page structs are contiguous for really big pages.  That is outside
the scope of this patch.  Adding the mem_map_offset() keeps this consistent
with other hugetlbfs specific code.
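
For readers following along, the idea in the quoted description (compute
@refs once, then run a tight loop that only stores subpages) can be
sketched in user space roughly as below. This is an illustrative model,
not the patch itself: struct page_model, calc_refs() and
record_subpages() are hypothetical names, and the plain pointer
arithmetic assumes a contiguous array of page structs, which is exactly
the case mem_map_offset() exists to avoid assuming in the kernel.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct page_model {
	unsigned long flags;
};

/*
 * Pre-calculate how many subpages one pass may record. The count is
 * bounded by the subpages left in this huge page, the pages left in
 * the requested VA range, and the room left in the output array.
 */
static unsigned long calc_refs(unsigned long pfn_offset,
			       unsigned long pages_per_huge_page,
			       unsigned long va_pages_left,
			       unsigned long remainder)
{
	unsigned long refs = pages_per_huge_page - pfn_offset;

	if (refs > va_pages_left)
		refs = va_pages_left;
	if (refs > remainder)
		refs = remainder;
	return refs;
}

/*
 * Tight loop: each iteration only stores one output pointer.
 * Assumption: page structs are contiguous, so "first + i" is valid.
 */
static void record_subpages(struct page_model *first, unsigned long refs,
			    struct page_model **pages)
{
	unsigned long i;

	for (i = 0; i < refs; i++)
		pages[i] = first + i;
}
```

The bounds checks move out of the loop into calc_refs(), so the
per-subpage cost is a single store.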

-- 
Mike Kravetz
