Message-Id: <1374772906-21511-3-git-send-email-rcj@linux.vnet.ibm.com>
Date:	Thu, 25 Jul 2013 12:21:46 -0500
From:	Robert Jennings <rcj@...ux.vnet.ibm.com>
To:	linux-kernel@...r.kernel.org
Cc:	linux-fsdevel@...r.kernel.org, linux-mm@...ck.org,
	Alexander Viro <viro@...iv.linux.org.uk>,
	Rik van Riel <riel@...hat.com>,
	Andrea Arcangeli <aarcange@...hat.com>,
	Dave Hansen <dave@...1.net>,
	Robert Jennings <rcj@...ux.vnet.ibm.com>,
	Matt Helsley <matt.helsley@...il.com>,
	Anthony Liguori <aliguori@...ibm.com>,
	Michael Roth <mdroth@...ux.vnet.ibm.com>,
	Lei Li <lilei@...ux.vnet.ibm.com>,
	Leonardo Garcia <lagarcia@...ux.vnet.ibm.com>
Subject: [RFC PATCH 2/2] Add limited zero copy to vmsplice

From: Matt Helsley <matthltc@...ibm.com>

It is sometimes useful to move anonymous pages over a pipe rather than
save/swap them. Check the SPLICE_F_GIFT and SPLICE_F_MOVE flags to see
if userspace would like to move such pages. This differs from plain
SPLICE_F_GIFT in that the user memory that was written to the pipe will
no longer have the same contents afterwards -- touching it again
effectively faults in new, empty anonymous pages.
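
For illustration only (not part of the patch), a minimal userspace sketch
of the calling pattern described above. The helper names map_anon_page()
and gift_page_over_pipe() are made up for this example. It assumes Linux
with glibc's vmsplice() wrapper. On a kernel without this series the same
calls still succeed, but the data is simply copied.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: map one page-aligned anonymous region. */
static unsigned char *map_anon_page(size_t len)
{
	void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	return p == MAP_FAILED ? NULL : p;
}

/*
 * Hypothetical helper: pass one whole page through a pipe using
 * vmsplice() on both ends.  Returns 0 if the reader's buffer ends up
 * holding the writer's pattern, -1 on any failure.
 */
int gift_page_over_pipe(void)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	unsigned char *src = map_anon_page(len);
	unsigned char *dst = map_anon_page(len);
	int fds[2] = { -1, -1 };
	struct iovec iov;
	int ret = -1;
	size_t i;

	if (!src || !dst || pipe(fds) < 0)
		goto out;

	memset(src, 0xA5, len);

	/* Write side: gift the page and ask for it to be moved. */
	iov.iov_base = src;
	iov.iov_len = len;
	if (vmsplice(fds[1], &iov, 1, SPLICE_F_GIFT | SPLICE_F_MOVE) !=
	    (ssize_t)len)
		goto out;

	/* From here on the contents of src must not be relied upon. */

	/* Read side: request a move; the kernel may still copy. */
	iov.iov_base = dst;
	iov.iov_len = len;
	if (vmsplice(fds[0], &iov, 1, SPLICE_F_MOVE) != (ssize_t)len)
		goto out;

	/* Compare against the known pattern, never against src. */
	for (i = 0; i < len; i++)
		if (dst[i] != 0xA5)
			goto out;
	ret = 0;
out:
	if (fds[0] >= 0)
		close(fds[0]);
	if (fds[1] >= 0)
		close(fds[1]);
	if (src)
		munmap(src, len);
	if (dst)
		munmap(dst, len);
	return ret;
}
```

The buffers come from mmap() so that both address and length are page
aligned, which SPLICE_F_GIFT requires. The check compares dst against the
known pattern rather than against src, because a gifted page may read
back as zeroes once it has been moved.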

On the read side the page written to the pipe will be copied unless
SPLICE_F_MOVE is used, in which case the kernel tries to map the page
directly into the reader's address space. If the move is not possible,
copying will be performed and the page will be reclaimed. Note that so
long as there is a mapping to the page, copies will be done instead
because rmap will have upped the map count for each anonymous mapping;
this can happen due to fork(), for example. This is necessary because
moving the page will usually change the anonymous page's nonlinear index
and that can only be done if it's unmapped.
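
For illustration only, the conditions under which the diff below even
attempts a move can be modelled in plain userspace C. struct buf_model
and may_attempt_move() are hypothetical names invented for this sketch;
the fields merely mirror the kernel checks named in the comments and are
not kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state pipe_to_user() inspects. */
struct buf_model {
	size_t offset;		/* buf->offset */
	size_t len;		/* buf->len */
	bool gifted;		/* buf->flags & PIPE_BUF_FLAG_GIFT */
	bool reader_move;	/* sd->flags & SPLICE_F_MOVE */
	bool anon;		/* PageAnon(page) */
	bool compound_or_huge;	/* PageCompound/PageHuge/PageTransHuge */
	bool mapped;		/* page_mapped(page) */
};

/*
 * Returns true if the move path would be tried for this buffer.
 * A true result is not a promise: later steps such as taking the
 * page lock can still fail and fall back to copying.
 */
bool may_attempt_move(const struct buf_model *b, size_t page_size)
{
	/* Only a whole page starting at offset 0 qualifies. */
	if (b->offset != 0 || b->len != page_size)
		return false;
	/* Both the writer's gift and the reader's move are needed. */
	if (!b->gifted || !b->reader_move)
		return false;
	/* Only plain, small anonymous pages are considered. */
	if (!b->anon || b->compound_or_huge)
		return false;
	/* Any remaining mapping (e.g. after fork()) forces a copy. */
	if (b->mapped)
		return false;
	return true;
}
```

A page that is still mapped somewhere, for instance in a child created by
fork(), fails the last check and is copied like any other pipe buffer.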

Signed-off-by: Matt Helsley <matthltc@...ibm.com>
Signed-off-by: Matt Helsley <matt.helsley@...il.com>
Signed-off-by: Robert Jennings <rcj@...ux.vnet.ibm.com>
---
 fs/splice.c | 63 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 63 insertions(+)

diff --git a/fs/splice.c b/fs/splice.c
index 6aa964f..0a715c3 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -32,6 +32,10 @@
 #include <linux/gfp.h>
 #include <linux/socket.h>
 #include <linux/compat.h>
+#include <linux/page-flags.h>
+#include <linux/hugetlb.h>
+#include <linux/ksm.h>
+#include <linux/swapops.h>
 #include "internal.h"
 
 /*
@@ -1536,6 +1540,65 @@ static int pipe_to_user(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
 	char *src;
 	int ret;
 
+	if (!buf->offset && (buf->len == PAGE_SIZE) &&
+	    (buf->flags & PIPE_BUF_FLAG_GIFT) && (sd->flags & SPLICE_F_MOVE)) {
+		struct page *page = buf->page;
+		struct mm_struct *mm;
+		struct vm_area_struct *vma;
+		spinlock_t *ptl;
+		pte_t *ptep, pte;
+		unsigned long useraddr;
+
+		if (!PageAnon(page))
+			goto copy;
+		if (PageCompound(page))
+			goto copy;
+		if (PageHuge(page) || PageTransHuge(page))
+			goto copy;
+		if (page_mapped(page))
+			goto copy;
+		useraddr = (unsigned long)sd->u.userptr;
+		mm = current->mm;
+
+		ret = -EAGAIN;
+		down_read(&mm->mmap_sem);
+		vma = find_vma_intersection(mm, useraddr, useraddr + PAGE_SIZE);
+		if (IS_ERR_OR_NULL(vma))
+			goto up_copy;
+		if (!vma->anon_vma) {
+			ret = anon_vma_prepare(vma);
+			if (ret)
+				goto up_copy;
+		}
+		zap_page_range(vma, useraddr, PAGE_SIZE, NULL);
+		ret = lock_page_killable(page);
+		if (ret)
+			goto up_copy;
+		ptep = get_locked_pte(mm, useraddr, &ptl);
+		if (!ptep)
+			goto unlock_up_copy;
+		pte = *ptep;
+		if (pte_present(pte))
+			goto unlock_up_copy;
+		get_page(page);
+		page_add_anon_rmap(page, vma, useraddr);
+		pte = mk_pte(page, vma->vm_page_prot);
+		set_pte_at(mm, useraddr, ptep, pte);
+		update_mmu_cache(vma, useraddr, ptep);
+		pte_unmap_unlock(ptep, ptl);
+		ret = 0;
+unlock_up_copy:
+		unlock_page(page);
+up_copy:
+		up_read(&mm->mmap_sem);
+		if (!ret) {
+			ret = sd->len;
+			goto out;
+		}
+		/* else ret < 0 and we should fall back to copying */
+		VM_BUG_ON(ret > 0);
+	}
+copy:
 	/*
 	 * See if we can use the atomic maps, by prefaulting in the
 	 * pages and doing an atomic copy
-- 
1.8.1.2
