Message-ID: <67922c5b-0a7b-4eab-9fee-455acf555ebf@redhat.com>
Date: Wed, 14 Apr 2021 17:24:42 +0200
From: David Hildenbrand <david@...hat.com>
To: "Matthew Wilcox (Oracle)" <willy@...radead.org>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, Tejun Heo <tj@...nel.org>,
FUJITA Tomonori <fujita.tomonori@....ntt.co.jp>,
Douglas Gilbert <dougg@...que.net>,
Chris Wilson <chris@...is-wilson.co.uk>
Cc: Christoph Hellwig <hch@....de>
Subject: Re: [PATCH] mm: Optimise nth_page for contiguous memmap
On 13.04.21 21:46, Matthew Wilcox (Oracle) wrote:
> If the memmap is virtually contiguous (either because we're using
> a virtually mapped memmap or because we don't support a discontig
> memmap at all), then we can implement nth_page() by simple addition.
> Contrary to popular belief, the compiler is not able to optimise this
> itself for a vmemmap configuration. This reduces one example user (sg.c)
> by four instructions:
>
> struct page *page = nth_page(rsv_schp->pages[k], offset >> PAGE_SHIFT);
>
> before:
> 49 8b 45 70 mov 0x70(%r13),%rax
> 48 63 c9 movslq %ecx,%rcx
> 48 c1 eb 0c shr $0xc,%rbx
> 48 8b 04 c8 mov (%rax,%rcx,8),%rax
> 48 2b 05 00 00 00 00 sub 0x0(%rip),%rax
> R_X86_64_PC32 vmemmap_base-0x4
> 48 c1 f8 06 sar $0x6,%rax
> 48 01 d8 add %rbx,%rax
> 48 c1 e0 06 shl $0x6,%rax
> 48 03 05 00 00 00 00 add 0x0(%rip),%rax
> R_X86_64_PC32 vmemmap_base-0x4
>
> after:
> 49 8b 45 70 mov 0x70(%r13),%rax
> 48 63 c9 movslq %ecx,%rcx
> 48 c1 eb 0c shr $0xc,%rbx
> 48 c1 e3 06 shl $0x6,%rbx
> 48 03 1c c8 add (%rax,%rcx,8),%rbx
>
> Signed-off-by: Matthew Wilcox (Oracle) <willy@...radead.org>
> Reviewed-by: Christoph Hellwig <hch@....de>
> ---
> include/linux/mm.h | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 25b9041f9925..2327f99b121f 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -234,7 +234,11 @@ int overcommit_policy_handler(struct ctl_table *, int, void *, size_t *,
> int __add_to_page_cache_locked(struct page *page, struct address_space *mapping,
> pgoff_t index, gfp_t gfp, void **shadowp);
>
> +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
> +#else
> +#define nth_page(page,n) ((page) + (n))
> +#endif
For sparsemem we could still optimize within a single memory section,
along the lines of the (untested) sketch below. But I'm not sure it's
worth the trouble.
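
Something like this, reusing the existing PAGES_PER_SECTION constant.
Note it evaluates n more than once and still pays for a page_to_pfn(),
so it may well end up slower than the unconditional round-trip:

/*
 * Untested sketch: within one sparsemem section the memmap is
 * contiguous, so plain addition is safe as long as we do not
 * cross a section boundary; only then fall back to pfn_to_page().
 */
#define nth_page(page, n) ({						\
	struct page *__p = (page);					\
	unsigned long __pfn = page_to_pfn(__p);				\
	(__pfn / PAGES_PER_SECTION ==					\
	 (__pfn + (n)) / PAGES_PER_SECTION) ?				\
		__p + (n) : pfn_to_page(__pfn + (n));			\
})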
Reviewed-by: David Hildenbrand <david@...hat.com>
--
Thanks,
David / dhildenb