Message-ID: <ZGM0gegQkvrQtq49@kernel.org>
Date:   Tue, 16 May 2023 10:45:05 +0300
From:   Mike Rapoport <rppt@...nel.org>
To:     Yuanchu Xie <yuanchu@...gle.com>
Cc:     Andrew Morton <akpm@...ux-foundation.org>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Yang Shi <shy828301@...il.com>,
        Zach O'Keefe <zokeefe@...gle.com>,
        Peter Xu <peterx@...hat.com>,
        "Kirill A . Shutemov" <kirill.shutemov@...ux.intel.com>,
        Matthew Wilcox <willy@...radead.org>,
        Pasha Tatashin <pasha.tatashin@...een.com>,
        linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
        linux-mm@...ck.org
Subject: Re: [PATCH] mm: pagemap: restrict pagewalk to the requested range

On Tue, May 16, 2023 at 01:26:08AM +0800, Yuanchu Xie wrote:
> The pagewalk in pagemap_read reads one PTE past the end of the requested
> range, and stops when the buffer runs out of space. While it produces
> the right result, the extra read is unnecessary and less performant.
> 
> I timed the following command before and after this patch:
> 	dd count=100000 if=/proc/self/pagemap of=/dev/null
> The results are consistently within 0.001s across 5 runs.
> 
> Before:
> 100000+0 records in
> 100000+0 records out
> 51200000 bytes (51 MB) copied, 0.0763159 s, 671 MB/s
> 
> real    0m0.078s
> user    0m0.012s
> sys     0m0.065s
> 
> After:
> 100000+0 records in
> 100000+0 records out
> 51200000 bytes (51 MB) copied, 0.0487928 s, 1.0 GB/s
> 
> real    0m0.050s
> user    0m0.011s
> sys     0m0.039s
> 
> Signed-off-by: Yuanchu Xie <yuanchu@...gle.com>
Acked-by: Mike Rapoport (IBM) <rppt@...nel.org>
> ---
>  fs/proc/task_mmu.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 420510f6a545..6259dd432eeb 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1689,23 +1689,23 @@ static ssize_t pagemap_read(struct file *file, char __user *buf,
>  	/* watch out for wraparound */
>  	start_vaddr = end_vaddr;
>  	if (svpfn <= (ULONG_MAX >> PAGE_SHIFT)) {
> +		unsigned long end;
> +
>  		ret = mmap_read_lock_killable(mm);
>  		if (ret)
>  			goto out_free;
>  		start_vaddr = untagged_addr_remote(mm, svpfn << PAGE_SHIFT);
>  		mmap_read_unlock(mm);
> +
> +		end = start_vaddr + ((count / PM_ENTRY_BYTES) << PAGE_SHIFT);
> +		if (end >= start_vaddr && end < mm->task_size)
> +			end_vaddr = end;
>  	}
>  
>  	/* Ensure the address is inside the task */
>  	if (start_vaddr > mm->task_size)
>  		start_vaddr = end_vaddr;
>  
> -	/*
> -	 * The odds are that this will stop walking way
> -	 * before end_vaddr, because the length of the
> -	 * user buffer is tracked in "pm", and the walk
> -	 * will stop when we hit the end of the buffer.
> -	 */
>  	ret = 0;
>  	while (count && (start_vaddr < end_vaddr)) {
>  		int len;
> -- 
> 2.40.1.606.ga4b1b128d6-goog
> 
> 
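For anyone following along, the new bound in the hunk above can be
sketched in user space roughly as below. This is only an illustration
of the arithmetic, not the kernel code: sketch_walk_end() and the
SKETCH_* constants are hypothetical names, and the values assume 4 KiB
pages and 8-byte pagemap entries.

```c
#include <limits.h>

/* Assumed values for illustration: 4 KiB pages, 64-bit pagemap entries. */
#define SKETCH_PAGE_SHIFT     12
#define SKETCH_PM_ENTRY_BYTES 8UL

/*
 * Hypothetical helper mirroring the check added by the patch.
 * 'count' is the size of the user buffer in bytes and 'end_vaddr' is
 * the fallback upper bound that was computed earlier.
 */
static unsigned long sketch_walk_end(unsigned long start_vaddr,
				     unsigned long count,
				     unsigned long task_size,
				     unsigned long end_vaddr)
{
	/* One pagemap entry describes one page of virtual address space. */
	unsigned long end = start_vaddr +
		((count / SKETCH_PM_ENTRY_BYTES) << SKETCH_PAGE_SHIFT);

	/*
	 * Use the tighter bound only if the addition did not wrap and the
	 * result is still below task_size; otherwise keep the fallback.
	 */
	if (end >= start_vaddr && end < task_size)
		return end;
	return end_vaddr;
}
```

With these assumed constants, a 16-byte buffer starting at 0x1000 gives
an end of 0x3000, so the walk covers exactly the two pages the caller
asked for. A start address near ULONG_MAX wraps and falls back to
end_vaddr.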
-- 
Sincerely yours,
Mike.