Message-ID: <514AFBD4.2050201@linux.vnet.ibm.com>
Date: Thu, 21 Mar 2013 08:23:48 -0400
From: "Michael R. Hines" <mrhines@...ux.vnet.ibm.com>
To: "Michael S. Tsirkin" <mst@...hat.com>
CC: Roland Dreier <roland@...nel.org>,
Sean Hefty <sean.hefty@...el.com>,
Hal Rosenstock <hal.rosenstock@...il.com>,
Yishai Hadas <yishaih@...lanox.com>,
Christoph Lameter <cl@...ux.com>, linux-rdma@...r.kernel.org,
linux-kernel@...r.kernel.org, qemu-devel@...gnu.org
Subject: Re: [PATCH] rdma: don't make pages writeable if not requiested
Yes, I'd be happy to try the patch.
I have meetings all day, but will dive in soon.
On 03/21/2013 02:18 AM, Michael S. Tsirkin wrote:
> core/umem.c seems to get the arguments to get_user_pages
> in the reverse order: it sets writeable flag and
> breaks COW for MAP_SHARED if and only if hardware needs to
> write the page.
>
> This breaks memory overcommit for users such as KVM:
> each time we try to register a page to send it to remote, this
> breaks COW. It seems that for applications that only have
> REMOTE_READ permission, there is no reason to break COW at all.
>
> If the page that is COW has lots of copies, this makes the user process
> quickly exceed the cgroups memory limit. This makes RDMA mostly useless
> for virtualization, thus the stable tag.
>
> Reported-by: "Michael R. Hines" <mrhines@...ux.vnet.ibm.com>
> Cc: stable@...r.kernel.org
> Signed-off-by: Michael S. Tsirkin <mst@...hat.com>
> ---
>
> Note: compile-tested only, I don't have RDMA hardware at the moment.
> Michael, could you please try this patch (also fixing your
> userspace code not to request write access) and report?
>
> Note2: grep for get_user_pages in infiniband drivers turns up
> lots of users who set write to 1 unconditionally.
> These might be bugs too, should be checked.
>
> drivers/infiniband/core/umem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
> index a841123..5929598 100644
> --- a/drivers/infiniband/core/umem.c
> +++ b/drivers/infiniband/core/umem.c
> @@ -152,7 +152,7 @@ struct ib_umem *ib_umem_get(struct ib_ucontext *context, unsigned long addr,
> ret = get_user_pages(current, current->mm, cur_base,
> min_t(unsigned long, npages,
> PAGE_SIZE / sizeof (struct page *)),
> - 1, !umem->writable, page_list, vma_list);
> + !umem->writable, 1, page_list, vma_list);
>
> if (ret < 0)
> goto out;
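
For anyone following the argument order in the hunk above, here is a minimal userspace sketch (not kernel code) that tabulates which values end up in the write and force positions of get_user_pages() before and after the patch. The struct and function names (gup_flags, flags_before, flags_after, show) are made up for illustration; only the flag expressions are taken from the diff, and the sketch does not model what get_user_pages() does with them.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for the two integer flags in the quoted call. */
struct gup_flags {
    int write;  /* fifth argument in the quoted call */
    int force;  /* sixth argument in the quoted call */
};

/* Values passed by the "-" line of the diff: 1, !umem->writable */
struct gup_flags flags_before(bool umem_writable)
{
    struct gup_flags f = { .write = 1, .force = !umem_writable };
    return f;
}

/* Values passed by the "+" line of the diff: !umem->writable, 1 */
struct gup_flags flags_after(bool umem_writable)
{
    struct gup_flags f = { .write = !umem_writable, .force = 1 };
    return f;
}

/* Print both variants side by side for one umem->writable value. */
void show(bool umem_writable)
{
    struct gup_flags b = flags_before(umem_writable);
    struct gup_flags a = flags_after(umem_writable);

    printf("umem->writable=%d  before: write=%d force=%d  after: write=%d force=%d\n",
           (int)umem_writable, b.write, b.force, a.write, a.force);
}
```

Calling show(true) and show(false) prints one row per case, which makes it easy to compare the two argument orders against what the registration is supposed to request.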