Message-Id: <20180727053432.16679-2-dave@stgolabs.net>
Date:   Thu, 26 Jul 2018 22:34:31 -0700
From:   Davidlohr Bueso <dave@...olabs.net>
To:     akpm@...ux-foundation.org
Cc:     jbaron@...mai.com, viro@...iv.linux.org.uk, peterz@...radead.org,
        linux-kernel@...r.kernel.org, dave@...olabs.net,
        Davidlohr Bueso <dbueso@...e.de>
Subject: [PATCH 1/2] fs/epoll: loosen irq safety in ep_poll()

Similar to other calls, ep_poll() is not called with interrupts
disabled, and we can therefore avoid the irq save/restore dance
and just disable local irqs. In fact, the function should never
be called in irq context at all, considering that the only path is

epoll_wait(2) -> do_epoll_wait() -> ep_poll().
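
For readers who want to exercise this path from userspace, here is a
minimal sketch, assuming a Linux system. It is not the microbenchmark
used for the numbers in this message; the helper name
pipe_epoll_ready_count() is hypothetical and only illustrates a
pipe-based epoll_wait(2) call made from ordinary process context.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the number of ready events reported by
 * epoll_wait() on the read end of a fresh pipe, or -1 on error. When
 * write_first is nonzero, one byte is written before waiting.
 */
static int pipe_epoll_ready_count(int write_first, int timeout_ms)
{
	int fds[2], epfd, n = -1;
	struct epoll_event ev = { .events = EPOLLIN }, out;

	if (pipe(fds) < 0)
		return -1;

	epfd = epoll_create1(0);
	if (epfd < 0)
		goto out_pipe;

	/* Watch the read end of the pipe for readability. */
	ev.data.fd = fds[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[0], &ev) < 0)
		goto out_epoll;

	if (write_first && write(fds[1], "x", 1) != 1)
		goto out_epoll;

	/* This enters the kernel through the epoll_wait(2) syscall. */
	n = epoll_wait(epfd, &out, 1, timeout_ms);

out_epoll:
	close(epfd);
out_pipe:
	close(fds[0]);
	close(fds[1]);
	return n;
}
```

Calling it with a timeout of 0 takes the non-blocking branch of
ep_poll(); a positive timeout on an empty pipe would sleep until the
timeout expires.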

When running a common pipe-based epoll_wait(2) microbenchmark on
a 2-socket, 40-core (ht) IvyBridge, the following performance
improvements are seen:

    # threads       vanilla         dirty
            1       1805587       2106412
            2       1854064       2090762
            4       1805484       2017436
            8       1751222       1974475
           16       1725299       1962104
           32       1378463       1571233
           64        787368        900784

That is a fairly consistent improvement of roughly 12% to 17%.
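
The relative gain per row can be checked with a few lines of C. This
is only a sketch over the figures quoted in the table; the struct and
function names are made up for illustration, and higher numbers are
assumed to be better.

```c
#include <stddef.h>

/* One row of the table: thread count, vanilla result, patched result. */
struct sample {
	int threads;
	long vanilla;
	long dirty;
};

static const struct sample samples[] = {
	{  1, 1805587, 2106412 },
	{  2, 1854064, 2090762 },
	{  4, 1805484, 2017436 },
	{  8, 1751222, 1974475 },
	{ 16, 1725299, 1962104 },
	{ 32, 1378463, 1571233 },
	{ 64,  787368,  900784 },
};

#define NR_SAMPLES (sizeof(samples) / sizeof(samples[0]))

/* Relative improvement of the patched kernel over vanilla, in percent. */
static double gain_percent(const struct sample *s)
{
	return 100.0 * (double)(s->dirty - s->vanilla) / (double)s->vanilla;
}

/* Smallest gain across all rows. */
static double min_gain_percent(void)
{
	double min = gain_percent(&samples[0]);

	for (size_t i = 1; i < NR_SAMPLES; i++)
		if (gain_percent(&samples[i]) < min)
			min = gain_percent(&samples[i]);
	return min;
}

/* Largest gain across all rows. */
static double max_gain_percent(void)
{
	double max = gain_percent(&samples[0]);

	for (size_t i = 1; i < NR_SAMPLES; i++)
		if (gain_percent(&samples[i]) > max)
			max = gain_percent(&samples[i]);
	return max;
}
```

For these rows the gain ranges from about 11.7% (4 threads) to about
16.7% (1 thread).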

Also add a lockdep check such that we detect any mischief
before deadlocking.
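
To illustrate why the check matters, here is a small userspace model,
not kernel code. A plain bool stands in for the local irq state, and
the model_*() helpers are hypothetical stand-ins that mimic only the
irq side of the spinlock variants touched by this patch; no actual
locking is modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the CPU's local irq-enabled state. */
static bool irqs_enabled = true;

/* Models spin_lock_irqsave(): remember the old state, then disable. */
static unsigned long model_lock_irqsave(void)
{
	unsigned long flags = irqs_enabled;

	irqs_enabled = false;
	return flags;
}

/* Models spin_unlock_irqrestore(): put back whatever was saved. */
static void model_unlock_irqrestore(unsigned long flags)
{
	irqs_enabled = flags != 0;
}

/* Models spin_lock_irq(): disable unconditionally, save nothing. */
static void model_lock_irq(void)
{
	irqs_enabled = false;
}

/* Models spin_unlock_irq(): enable unconditionally. */
static void model_unlock_irq(void)
{
	irqs_enabled = true;
}

/* Irq state left behind by a save/restore critical section. */
static bool state_after_irqsave_section(bool initially_enabled)
{
	unsigned long flags;

	irqs_enabled = initially_enabled;
	flags = model_lock_irqsave();
	assert(!irqs_enabled);	/* irqs are off inside the section */
	model_unlock_irqrestore(flags);
	return irqs_enabled;
}

/* Irq state left behind by a plain disable/enable critical section. */
static bool state_after_irq_section(bool initially_enabled)
{
	irqs_enabled = initially_enabled;
	model_lock_irq();
	assert(!irqs_enabled);	/* irqs are off inside the section */
	model_unlock_irq();
	return irqs_enabled;
}
```

When the caller enters with irqs enabled, both variants leave the state
unchanged, so the cheaper one is sufficient. When the caller enters
with irqs disabled, only the save/restore variant preserves that; the
plain variant re-enables irqs behind the caller's back, which is the
kind of misuse lockdep_assert_irqs_enabled() is meant to flag early.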

Signed-off-by: Davidlohr Bueso <dbueso@...e.de>
---
 fs/eventpoll.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index b5e43e11f1e3..88473e6271ef 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1746,11 +1746,12 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 		   int maxevents, long timeout)
 {
 	int res = 0, eavail, timed_out = 0;
-	unsigned long flags;
 	u64 slack = 0;
 	wait_queue_entry_t wait;
 	ktime_t expires, *to = NULL;
 
+	lockdep_assert_irqs_enabled();
+
 	if (timeout > 0) {
 		struct timespec64 end_time = ep_set_mstimeout(timeout);
 
@@ -1763,7 +1764,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 		 * caller specified a non blocking operation.
 		 */
 		timed_out = 1;
-		spin_lock_irqsave(&ep->wq.lock, flags);
+		spin_lock_irq(&ep->wq.lock);
 		goto check_events;
 	}
 
@@ -1772,7 +1773,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 	if (!ep_events_available(ep))
 		ep_busy_loop(ep, timed_out);
 
-	spin_lock_irqsave(&ep->wq.lock, flags);
+	spin_lock_irq(&ep->wq.lock);
 
 	if (!ep_events_available(ep)) {
 		/*
@@ -1814,11 +1815,11 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 				break;
 			}
 
-			spin_unlock_irqrestore(&ep->wq.lock, flags);
+			spin_unlock_irq(&ep->wq.lock);
 			if (!schedule_hrtimeout_range(to, slack, HRTIMER_MODE_ABS))
 				timed_out = 1;
 
-			spin_lock_irqsave(&ep->wq.lock, flags);
+			spin_lock_irq(&ep->wq.lock);
 		}
 
 		__remove_wait_queue(&ep->wq, &wait);
@@ -1828,7 +1829,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 	/* Is it worth to try to dig for events ? */
 	eavail = ep_events_available(ep);
 
-	spin_unlock_irqrestore(&ep->wq.lock, flags);
+	spin_unlock_irq(&ep->wq.lock);
 
 	/*
 	 * Try to transfer events to user space. In case we get 0 events and
-- 
2.16.4