[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <32E1700B9017364D9B60AED9960492BC211AEF75@FMSMSX107.amr.corp.intel.com>
Date: Wed, 2 Oct 2013 14:54:59 +0000
From: "Marciniszyn, Mike" <mike.marciniszyn@...el.com>
To: Jan Kara <jack@...e.cz>, LKML <linux-kernel@...r.kernel.org>
CC: "linux-mm@...ck.org" <linux-mm@...ck.org>,
infinipath <infinipath@...el.com>,
Roland Dreier <roland@...nel.org>,
"linux-rdma@...r.kernel.org" <linux-rdma@...r.kernel.org>
Subject: RE: [PATCH 23/26] ib: Convert qib_get_user_pages() to
get_user_pages_unlocked()
Thanks!!
I would like to test these two patches and also do the stable work for the deadlock as well. Do you have these patches in a repo somewhere to save me a bit of work?
We had been working on an internal version of the deadlock portion of this patch that uses get_user_pages_fast() vs. the new get_user_pages_unlocked().
The risk of GUP fast is the loss of the "force" arg, which I don't see as significant given our use case.
Some nits on the subject and commit message:
1. use IB/qib, IB/ipath vs. ib
2. use the correct ipath vs. qib in the commit message text
Mike
> -----Original Message-----
> From: Jan Kara [mailto:jack@...e.cz]
> Sent: Wednesday, October 02, 2013 10:28 AM
> To: LKML
> Cc: linux-mm@...ck.org; Jan Kara; infinipath; Roland Dreier; linux-
> rdma@...r.kernel.org
> Subject: [PATCH 23/26] ib: Convert qib_get_user_pages() to
> get_user_pages_unlocked()
>
> Convert qib_get_user_pages() to use get_user_pages_unlocked(). This
> shortens the section where we hold mmap_sem for writing and also removes
> the knowledge about get_user_pages() locking from ipath driver. We also fix
> a bug in testing pinned number of pages when changing the code.
>
> CC: Mike Marciniszyn <infinipath@...el.com>
> CC: Roland Dreier <roland@...nel.org>
> CC: linux-rdma@...r.kernel.org
> Signed-off-by: Jan Kara <jack@...e.cz>
> ---
> drivers/infiniband/hw/qib/qib_user_pages.c | 62 +++++++++++++----------------
> -
> 1 file changed, 26 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/infiniband/hw/qib/qib_user_pages.c
> b/drivers/infiniband/hw/qib/qib_user_pages.c
> index 2bc1d2b96298..57ce83c2d1d9 100644
> --- a/drivers/infiniband/hw/qib/qib_user_pages.c
> +++ b/drivers/infiniband/hw/qib/qib_user_pages.c
> @@ -48,39 +48,55 @@ static void __qib_release_user_pages(struct page
> **p, size_t num_pages,
> }
> }
>
> -/*
> - * Call with current->mm->mmap_sem held.
> +/**
> + * qib_get_user_pages - lock user pages into memory
> + * @start_page: the start page
> + * @num_pages: the number of pages
> + * @p: the output page structures
> + *
> + * This function takes a given start page (page aligned user virtual
> + * address) and pins it and the following specified number of pages.
> +For
> + * now, num_pages is always 1, but that will probably change at some
> +point
> + * (because caller is doing expected sends on a single virtually
> +contiguous
> + * buffer, so we can do all pages at once).
> */
> -static int __qib_get_user_pages(unsigned long start_page, size_t
> num_pages,
> - struct page **p, struct vm_area_struct
> **vma)
> +int qib_get_user_pages(unsigned long start_page, size_t num_pages,
> + struct page **p)
> {
> unsigned long lock_limit;
> size_t got;
> int ret;
> + struct mm_struct *mm = current->mm;
>
> + down_write(&mm->mmap_sem);
> lock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
>
> - if (num_pages > lock_limit && !capable(CAP_IPC_LOCK)) {
> + if (mm->pinned_vm + num_pages > lock_limit &&
> !capable(CAP_IPC_LOCK)) {
> + up_write(&mm->mmap_sem);
> ret = -ENOMEM;
> goto bail;
> }
> + mm->pinned_vm += num_pages;
> + up_write(&mm->mmap_sem);
>
> for (got = 0; got < num_pages; got += ret) {
> - ret = get_user_pages(current, current->mm,
> - start_page + got * PAGE_SIZE,
> - num_pages - got, 1, 1,
> - p + got, vma);
> + ret = get_user_pages_unlocked(current, mm,
> + start_page + got * PAGE_SIZE,
> + num_pages - got, 1, 1,
> + p + got);
> if (ret < 0)
> goto bail_release;
> }
>
> - current->mm->pinned_vm += num_pages;
>
> ret = 0;
> goto bail;
>
> bail_release:
> __qib_release_user_pages(p, got, 0);
> + down_write(&mm->mmap_sem);
> + mm->pinned_vm -= num_pages;
> + up_write(&mm->mmap_sem);
> bail:
> return ret;
> }
> @@ -117,32 +133,6 @@ dma_addr_t qib_map_page(struct pci_dev *hwdev,
> struct page *page,
> return phys;
> }
>
> -/**
> - * qib_get_user_pages - lock user pages into memory
> - * @start_page: the start page
> - * @num_pages: the number of pages
> - * @p: the output page structures
> - *
> - * This function takes a given start page (page aligned user virtual
> - * address) and pins it and the following specified number of pages. For
> - * now, num_pages is always 1, but that will probably change at some point
> - * (because caller is doing expected sends on a single virtually contiguous
> - * buffer, so we can do all pages at once).
> - */
> -int qib_get_user_pages(unsigned long start_page, size_t num_pages,
> - struct page **p)
> -{
> - int ret;
> -
> -	down_write(&current->mm->mmap_sem);
> -
> - ret = __qib_get_user_pages(start_page, num_pages, p, NULL);
> -
> -	up_write(&current->mm->mmap_sem);
> -
> - return ret;
> -}
> -
> void qib_release_user_pages(struct page **p, size_t num_pages) {
> if (current->mm) /* during close after signal, mm can be NULL */
> --
> 1.8.1.4
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists