linux-kernel - Re: [PATCH v2 7/9] mm: vmalloc: Support multiple nodes in vread

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <ZP6QVTQmDGx7tx1a@MiWiFi-R3L-srv>
Date:   Mon, 11 Sep 2023 11:58:13 +0800
From:   Baoquan He <bhe@...hat.com>
To:     "Uladzislau Rezki (Sony)" <urezki@...il.com>
Cc:     linux-mm@...ck.org, Andrew Morton <akpm@...ux-foundation.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Lorenzo Stoakes <lstoakes@...il.com>,
        Christoph Hellwig <hch@...radead.org>,
        Matthew Wilcox <willy@...radead.org>,
        "Liam R . Howlett" <Liam.Howlett@...cle.com>,
        Dave Chinner <david@...morbit.com>,
        "Paul E . McKenney" <paulmck@...nel.org>,
        Joel Fernandes <joel@...lfernandes.org>,
        Oleksiy Avramchenko <oleksiy.avramchenko@...y.com>
Subject: Re: [PATCH v2 7/9] mm: vmalloc: Support multiple nodes in vread_iter

On 08/29/23 at 10:11am, Uladzislau Rezki (Sony) wrote:
> Extend the vread_iter() to be able to perform a sequential
> reading of VAs which are spread among multiple nodes. So a
> data read over the /dev/kmem correctly reflects a vmalloc
> memory layout.
> 
> Signed-off-by: Uladzislau Rezki (Sony) <urezki@...il.com>
> ---
>  mm/vmalloc.c | 67 +++++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 53 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 4fd4915c532d..968144c16237 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
......  
> @@ -4057,19 +4093,15 @@ long vread_iter(struct iov_iter *iter, const char *addr, size_t count)
>  
>  	remains = count;
>  
> -	/* Hooked to node_0 so far. */
> -	vn = addr_to_node(0);
> -	spin_lock(&vn->busy.lock);

This could change the vread behaviour a little bit. Before, once we take
vmap_area_lock, the vread will read out the content of snapshot at the
moment. Now, reading out in one node's tree won't disrupt other nodes'
tree accessing. Not sure if this matters when people need access
/proc/kcore, e.g dynamic debugging.

And, the reading will be a little slower because each va finding need
iterate all vmap_nodes[].

Otherwise, the patch itself looks good to me.

Reviewed-by: Baoquan He <bhe@...hat.com>

> -
> -	va = find_vmap_area_exceed_addr((unsigned long)addr, &vn->busy.root);
> -	if (!va)
> +	vn = find_vmap_area_exceed_addr_lock((unsigned long) addr, &va);
> +	if (!vn)
>  		goto finished_zero;
>  
>  	/* no intersects with alive vmap_area */
>  	if ((unsigned long)addr + remains <= va->va_start)
>  		goto finished_zero;
>  
> -	list_for_each_entry_from(va, &vn->busy.head, list) {
> +	do {
>  		size_t copied;
>  
>  		if (remains == 0)
> @@ -4084,10 +4116,10 @@ long vread_iter(struct iov_iter *iter, const char *addr, size_t count)
>  		WARN_ON(flags == VMAP_BLOCK);
>  
>  		if (!vm && !flags)
> -			continue;
> +			goto next_va;
>  
>  		if (vm && (vm->flags & VM_UNINITIALIZED))
> -			continue;
> +			goto next_va;
>  
>  		/* Pair with smp_wmb() in clear_vm_uninitialized_flag() */
>  		smp_rmb();
> @@ -4096,7 +4128,7 @@ long vread_iter(struct iov_iter *iter, const char *addr, size_t count)
>  		size = vm ? get_vm_area_size(vm) : va_size(va);
>  
>  		if (addr >= vaddr + size)
> -			continue;
> +			goto next_va;
>  
>  		if (addr < vaddr) {
>  			size_t to_zero = min_t(size_t, vaddr - addr, remains);
> @@ -4125,15 +4157,22 @@ long vread_iter(struct iov_iter *iter, const char *addr, size_t count)
>  
>  		if (copied != n)
>  			goto finished;
> -	}
> +
> +	next_va:
> +		next = va->va_end;
> +		spin_unlock(&vn->busy.lock);
> +	} while ((vn = find_vmap_area_exceed_addr_lock(next, &va)));
>  
>  finished_zero:
> -	spin_unlock(&vn->busy.lock);
> +	if (vn)
> +		spin_unlock(&vn->busy.lock);
> +
>  	/* zero-fill memory holes */
>  	return count - remains + zero_iter(iter, remains);
>  finished:
>  	/* Nothing remains, or We couldn't copy/zero everything. */
> -	spin_unlock(&vn->busy.lock);
> +	if (vn)
> +		spin_unlock(&vn->busy.lock);
>  
>  	return count - remains;
>  }
> -- 
> 2.30.2
>