Date:	Tue, 1 Jan 2013 23:56:05 +0000
From:	Eric Wong <>
To:	Linus Torvalds <>
Cc:	Eric Dumazet <>,
	Linux Kernel Mailing List <>,
	Hans Verkuil <>,
	Jiri Olsa <>, Jonathan Corbet <>,
	Al Viro <>,
	Davide Libenzi <>,
	Hans de Goede <>,
	Mauro Carvalho Chehab <>,
	David Miller <>,
	Andrew Morton <>,
	Andreas Voellmy <>,
	"Junchang(Jason) Wang" <>,
	Network Development <>,
	linux-fsdevel <>
Subject: [PATCH] epoll: prevent missed events on EPOLL_CTL_MOD

Linus Torvalds <> wrote:
> Please document the barrier that this mb() pairs with, and then give
> an explanation for the fix in the commit message, and I'll happily
> take it. Even if it's just duplicating the comments above the
> wq_has_sleeper() function, except modified for the ep_modify() case.

Hopefully my explanation below is correct and makes sense.
I think both effects of the barrier are needed.
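
As a rough illustration of why a full barrier is wanted on both sides,
here is a userspace sketch of the same "store, barrier, load" pairing
using C11 atomics. It is not kernel code: interest_mask, event_pending,
modify_side(), wakeup_side() and run_once() are hypothetical names, and
atomic_thread_fence(memory_order_seq_cst) is only assumed to be a
reasonable stand-in for smp_mb(). It also collapses both effects of the
barrier into one two-sided pattern.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these are kernel symbols. */
static atomic_uint interest_mask;  /* models the mask stored by ep_modify() */
static atomic_bool event_pending;  /* models the state f_op->poll() reads */
static bool poll_saw_event;        /* what the modify side observed */
static bool callback_saw_mask;     /* what the wakeup side observed */

/* Models ep_modify(): publish the new mask, full barrier, then poll. */
static void *modify_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&interest_mask, 1u, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* assumed analogue of smp_mb() */
	poll_saw_event = atomic_load_explicit(&event_pending,
					      memory_order_relaxed);
	return NULL;
}

/* Models the wakeup path: record the event, full barrier, check the mask. */
static void *wakeup_side(void *arg)
{
	(void)arg;
	atomic_store_explicit(&event_pending, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* the pairing barrier */
	callback_saw_mask = atomic_load_explicit(&interest_mask,
						 memory_order_relaxed) != 0;
	return NULL;
}

/* Runs one race; returns true if at least one side noticed the other. */
static bool run_once(void)
{
	pthread_t modifier, waker;
	int rc;

	atomic_store(&interest_mask, 0u);
	atomic_store(&event_pending, false);
	poll_saw_event = false;
	callback_saw_mask = false;

	rc = pthread_create(&modifier, NULL, modify_side, NULL);
	assert(rc == 0);
	rc = pthread_create(&waker, NULL, wakeup_side, NULL);
	assert(rc == 0);
	(void)rc;
	pthread_join(modifier, NULL);
	pthread_join(waker, NULL);

	return poll_saw_event || callback_saw_mask;
}
```

With both fences in place, run_once() should never return false. If
either fence is dropped, the C11 model allows both loads to read the
old values, which corresponds to the missed event.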

> Of course, it would be good to get verification from Jason and Andreas
> that the alternate patch also works for them.

Jason just confirmed it.
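
For reference, the userspace-visible behaviour that ep_modify() has to
preserve can be sketched as below. This is a single-threaded example
with a hypothetical helper name (rearm_and_wait), so it only exercises
the "past events" side and cannot reproduce the race itself. It assumes
a pipe read end registered with an empty interest mask and then
re-armed with EPOLL_CTL_MOD after data has already arrived.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the number of ready events reported
 * after re-arming, or -1 on error.
 */
static int rearm_and_wait(void)
{
	struct epoll_event ev = { .events = 0 };
	struct epoll_event ready;
	int pfd[2];
	int epfd;
	int n = -1;

	if (pipe(pfd) < 0)
		return -1;
	epfd = epoll_create1(0);
	if (epfd < 0)
		goto out_pipe;

	/* Register the read end with an empty interest mask. */
	ev.data.fd = pfd[0];
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, pfd[0], &ev) < 0)
		goto out;

	/* The event happens while we are not interested in EPOLLIN. */
	if (write(pfd[1], "x", 1) != 1)
		goto out;

	/* Re-arm: the MOD path must notice the already-pending data. */
	ev.events = EPOLLIN;
	if (epoll_ctl(epfd, EPOLL_CTL_MOD, pfd[0], &ev) < 0)
		goto out;

	/* Zero timeout: report only what is ready right now. */
	n = epoll_wait(epfd, &ready, 1, 0);
out:
	close(epfd);
out_pipe:
	close(pfd[0]);
	close(pfd[1]);
	return n;
}
```

The reported problem is the multi-threaded variant of this, where the
write races with the EPOLL_CTL_MOD call from another CPU.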

------------------------------- 8< ----------------------------
From 02f43757d04bb6f2786e79eecf1cfa82e6574379 Mon Sep 17 00:00:00 2001
From: Eric Wong <>
Date: Tue, 1 Jan 2013 21:20:27 +0000
Subject: [PATCH] epoll: prevent missed events on EPOLL_CTL_MOD

EPOLL_CTL_MOD sets the interest mask before calling f_op->poll() to
ensure events are not missed.  Since the modifications to the interest
mask are not protected by the same lock as ep_poll_callback, we need to
ensure the change is visible to other CPUs calling ep_poll_callback.

We also need to ensure f_op->poll() has an up-to-date view of past
events which occurred before we modified the interest mask.  So this
barrier also pairs with the barrier in wq_has_sleeper().

This should guarantee either ep_poll_callback or f_op->poll() (or both)
will notice the readiness of a recently-ready/modified item.

This issue was encountered by Andreas Voellmy and Junchang(Jason) Wang in:

Signed-off-by: Eric Wong <>
Cc: Hans Verkuil <>
Cc: Jiri Olsa <>
Cc: Jonathan Corbet <>
Cc: Al Viro <>
Cc: Davide Libenzi <>
Cc: Hans de Goede <>
Cc: Mauro Carvalho Chehab <>
Cc: David Miller <>
Cc: Eric Dumazet <>
Cc: Andrew Morton <>
Cc: Linus Torvalds <>
Cc: Andreas Voellmy <>
Tested-by: "Junchang(Jason) Wang" <>
---
 fs/eventpoll.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index cd96649..39573ee 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -1285,7 +1285,7 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
 	 * otherwise we might miss an event that happens between the
 	 * f_op->poll() call and the new event set registering.
 	 */
-	epi->event.events = event->events;
+	epi->event.events = event->events; /* need barrier below */
 	pt._key = event->events;
 	epi->event.data = event->data; /* protected by mtx */
 	if (epi->event.events & EPOLLWAKEUP) {
@@ -1296,6 +1296,26 @@ static int ep_modify(struct eventpoll *ep, struct epitem *epi, struct epoll_even
+	/*
+	 * The following barrier has two effects:
+	 *
+	 * 1) Flush epi changes above to other CPUs.  This ensures
+	 *    we do not miss events from ep_poll_callback if an
+	 *    event occurs immediately after we call f_op->poll().
+	 *    We need this because we did not take ep->lock while
+	 *    changing epi above (but ep_poll_callback does take
+	 *    ep->lock).
+	 *
+	 * 2) We also need to ensure we do not miss _past_ events
+	 *    when calling f_op->poll().  This barrier also
+	 *    pairs with the barrier in wq_has_sleeper (see
+	 *    comments for wq_has_sleeper).
+	 *
+	 * This barrier will now guarantee ep_poll_callback or f_op->poll
+	 * (or both) will notice the readiness of an item.
+	 */
+	smp_mb();
+	/*
 	 * Get current event bits. We can safely use the file* here because
 	 * its usage count has been increased by the caller of this function.
Eric Wong