Date:	Thu, 30 Apr 2009 14:03:47 +0200
From:	Max Kellermann <mk@...all.com>
To:	linux-kernel@...r.kernel.org
Cc:	jens.axboe@...cle.com, w@....eu
Subject: [splice PATCH 2/3] tee: don't return 0 when another task drains/fills
	a pipe

Quoting the tee() manual page:

 "A return value of 0 means that there was no data to transfer, and it
 would not make sense to block, because there are no writers connected
 to the write end of the pipe"

There is, however, a race condition in the tee() implementation that
violates this guarantee:

- do_tee() ensures that ipipe is readable and opipe is writable by
  calling link_ipipe_prep() and link_opipe_prep()
- these two functions each unlock their pipe again after waiting
- during this unlocked phase, there is a short window where other
  tasks may drain the input pipe or fill the output pipe
- do_tee() now calls link_pipe(), which re-locks both pipes
- link_pipe() finds nothing to read from ipipe or no room left in
  opipe ("i >= ipipe->nrbufs || opipe->nrbufs >= PIPE_BUFFERS") and
  breaks out of the loop
- link_pipe() returns 0

Although writers may still be connected to the input pipe, tee() now
returns 0, and the caller wrongly assumes it has reached the end of
the stream.
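
To illustrate (a hypothetical demo, not part of this patch, modelled
on the example in the tee(2) manual page): a conforming caller tees
stdin's pipe to stdout's pipe, discards the input, and stops as soon
as tee() returns 0.  Run as "producer | ./tee-demo | consumer" (both
standard streams must be pipes), it can exit in the middle of the
stream when it loses the race described above:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int null_fd = open("/dev/null", O_WRONLY);

	if (null_fd < 0) {
		perror("open /dev/null");
		return 1;
	}

	for (;;) {
		/* duplicate up to 64 KiB without consuming it */
		ssize_t n = tee(STDIN_FILENO, STDOUT_FILENO, 65536, 0);

		if (n < 0) {
			perror("tee");
			return 1;
		}
		if (n == 0)
			break;	/* man page: no data and no writers */

		/* drain the duplicated data from the input pipe */
		while (n > 0) {
			ssize_t k = splice(STDIN_FILENO, NULL, null_fd,
					   NULL, n, SPLICE_F_MOVE);

			if (k <= 0) {
				perror("splice");
				return 1;
			}
			n -= k;
		}
	}

	return 0;
}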

This patch moves the link_[io]pipe_prep() calls into link_pipe() and
wraps them in a loop: after both pipes have been locked, the pipe
state is re-checked, and if another task has invalidated it in the
meantime, both locks are dropped and the wait is repeated.

Signed-off-by: Max Kellermann <mk@...all.com>
---

 fs/splice.c |   47 ++++++++++++++++++++++++++++++++++-------------
 1 files changed, 34 insertions(+), 13 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index f07e304..96135eb 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -1594,15 +1594,41 @@ static long link_pipe(struct pipe_inode_info *ipipe,
 		      size_t len, unsigned int flags)
 {
 	struct pipe_buffer *ibuf, *obuf;
-	long ret = 0;
+	long ret;
 	int i = 0, nbuf;
 
-	/*
-	 * Potential ABBA deadlock, work around it by ordering lock
-	 * grabbing by pipe info address. Otherwise two different processes
-	 * could deadlock (one doing tee from A -> B, the other from B -> A).
-	 */
-	pipe_double_lock(ipipe, opipe);
+	while (1) {
+		/* wait for ipipe to become ready to read */
+		ret = link_ipipe_prep(ipipe, flags);
+		if (ret)
+			return ret;
+
+		/* wait for opipe to become ready to write */
+		ret = link_opipe_prep(opipe, flags);
+		if (ret)
+			return ret;
+
+		/*
+		 * Potential ABBA deadlock, work around it by ordering
+		 * lock grabbing by inode address. Otherwise two
+		 * different processes could deadlock (one doing tee
+		 * from A -> B, the other from B -> A).
+		 */
+		pipe_double_lock(ipipe, opipe);
+
+		/* see if the tee() is still possible */
+		if ((ipipe->nrbufs > 0 || ipipe->writers == 0) &&
+		    opipe->nrbufs < PIPE_BUFFERS)
+			/* yes, it is - keep the locks and end this
+			   loop */
+			break;
+
+		/* no - someone has drained ipipe or has filled opipe
+		   between link_[io]pipe_prep()'s unlock and our lock.
+		   Drop both locks and wait again. */
+		pipe_unlock(ipipe);
+		pipe_unlock(opipe);
+	}
 
 	do {
 		if (!opipe->readers) {
@@ -1691,12 +1717,7 @@ static long do_tee(struct file *in, struct file *out, size_t len,
 		 * Keep going, unless we encounter an error. The ipipe/opipe
 		 * ordering doesn't really matter.
 		 */
-		ret = link_ipipe_prep(ipipe, flags);
-		if (!ret) {
-			ret = link_opipe_prep(opipe, flags);
-			if (!ret)
-				ret = link_pipe(ipipe, opipe, len, flags);
-		}
+		ret = link_pipe(ipipe, opipe, len, flags);
 	}
 
 	return ret;
