Message-ID: <20190427093319.sgicqik2oqkez3wk@dcvr>
Date: Sat, 27 Apr 2019 09:33:19 +0000
From: Eric Wong <e@...24.org>
To: Deepa Dinamani <deepa.kernel@...il.com>,
Arnd Bergmann <arnd@...db.de>,
Davidlohr Bueso <dave@...olabs.net>,
Al Viro <viro@...iv.linux.org.uk>,
Jason Baron <jbaron@...mai.com>
Cc: linux-kernel@...r.kernel.org, Omar Kilani <omar.kilani@...il.com>,
linux-fsdevel@...r.kernel.org
Subject: Re: Strange issues with epoll since 5.0
Eric Wong <e@...24.org> wrote:
> Omar Kilani <omar.kilani@...il.com> wrote:
> > Hi there,
> >
> > I’m still trying to piece together a reproducible test that triggers
> > this, but I wanted to post in case someone goes “hmmm... change X
> > might have done this”.
>
> Maybe Davidlohr knows, since he's responsible for most of the
> epoll changes in 5.0.

Well, I'm not sure whether I'm hitting the same problem Omar is
hitting, but I did find an epoll_pwait regression in 5.0:

epoll_pwait seems unresponsive to SIGURG in my
heavily-parallelized use case[1] on 5.0.9. I bisected it to
commit 854a6ed56839a40f6b5d02a2962f48841482eec4
("signal: Add restore_user_sigmask()")

Just reverting the fs/eventpoll.c change in 854a6ed56 seems
enough to fix the non-responsive epoll_pwait for me. I have not
looked deeply into this, but perhaps the signal_pending check in
restore_user_sigmask is racy w.r.t. epoll. It's been a while
since I have looked at kernel stuff, myself.

Anyways, this revert works, but I'm not 100% sure why...

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index a5d219d920e7..151739d76801 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2247,7 +2247,20 @@ SYSCALL_DEFINE6(epoll_pwait, int, epfd, struct epoll_event __user *, events,
 
 	error = do_epoll_wait(epfd, events, maxevents, timeout);
 
-	restore_user_sigmask(sigmask, &sigsaved);
+	/*
+	 * If we changed the signal mask, we need to restore the original one.
+	 * In case we've got a signal while waiting, we do not restore the
+	 * signal mask yet, and we allow do_signal() to deliver the signal on
+	 * the way back to userspace, before the signal mask is restored.
+	 */
+	if (sigmask) {
+		if (error == -EINTR) {
+			memcpy(&current->saved_sigmask, &sigsaved,
+			       sizeof(sigsaved));
+			set_restore_sigmask();
+		} else
+			set_current_blocked(&sigsaved);
+	}
 
 	return error;
 }
@@ -2272,7 +2285,20 @@ COMPAT_SYSCALL_DEFINE6(epoll_pwait, int, epfd,
 
 	err = do_epoll_wait(epfd, events, maxevents, timeout);
 
-	restore_user_sigmask(sigmask, &sigsaved);
+	/*
+	 * If we changed the signal mask, we need to restore the original one.
+	 * In case we've got a signal while waiting, we do not restore the
+	 * signal mask yet, and we allow do_signal() to deliver the signal on
+	 * the way back to userspace, before the signal mask is restored.
+	 */
+	if (sigmask) {
+		if (err == -EINTR) {
+			memcpy(&current->saved_sigmask, &sigsaved,
+			       sizeof(sigsaved));
+			set_restore_sigmask();
+		} else
+			set_current_blocked(&sigsaved);
+	}
 
 	return err;
 }
Comments and/or a proper fix would be greatly appreciated.
[1] my test case is running the cmogstored 1.7.0 test suite
in an amd64 Debian stable environment.
test/mgmt_auto_adjust would get stuck and time out after 60s
on vanilla v5.0.9
tgz: https://bogomips.org/cmogstored/files/cmogstored-1.7.0.tar.gz
# Standard autotools install, N=32 or some high-ish number
./configure
make -j$N
make check -j$N
# OR git clone https://bogomips.org/cmogstored.git
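
For anyone who wants a smaller starting point than the full test suite,
the behaviour I expect from epoll_pwait can be sketched roughly as below.
This is only an illustration under my own assumptions, not code from
cmogstored: the function name epoll_pwait_sigurg_probe and the 100ms
delay are made up for the example. It is single-threaded, so it most
likely will NOT trigger the regression described above (which needed a
heavily-parallelized workload); it only shows the contract that a signal
unblocked by the epoll_pwait mask should interrupt the wait with EINTR.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigurg;

static void on_sigurg(int sig)
{
	(void)sig;
	got_sigurg = 1;
}

/*
 * Hypothetical probe (name and structure are assumptions for illustration).
 * Returns 0 if epoll_pwait was interrupted by SIGURG (EINTR and the handler
 * ran), 1 if it timed out without noticing the signal, -1 on setup errors.
 */
int epoll_pwait_sigurg_probe(int timeout_ms)
{
	struct timespec delay = { 0, 100 * 1000 * 1000 }; /* 100ms, arbitrary */
	struct sigaction sa;
	struct epoll_event ev;
	sigset_t block, orig, waitmask;
	pid_t parent = getpid(), child;
	int epfd, rc, saved_errno;

	got_sigurg = 0;
	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigurg;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGURG, &sa, NULL) < 0)
		return -1;

	/* Keep SIGURG blocked everywhere except inside epoll_pwait. */
	sigemptyset(&block);
	sigaddset(&block, SIGURG);
	if (sigprocmask(SIG_BLOCK, &block, &orig) < 0)
		return -1;
	waitmask = orig;
	sigdelset(&waitmask, SIGURG);

	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;

	child = fork();
	if (child < 0) {
		close(epfd);
		return -1;
	}
	if (child == 0) {
		/* Child: give the parent time to start waiting, then signal it. */
		nanosleep(&delay, NULL);
		kill(parent, SIGURG);
		_exit(0);
	}

	/* No fds are registered, so only the signal or the timeout can end this. */
	rc = epoll_pwait(epfd, &ev, 1, timeout_ms, &waitmask);
	saved_errno = errno;

	waitpid(child, NULL, 0);
	close(epfd);
	sigprocmask(SIG_SETMASK, &orig, NULL);

	if (rc < 0 && saved_errno == EINTR && got_sigurg)
		return 0;
	return rc == 0 ? 1 : -1;
}
```

Calling epoll_pwait_sigurg_probe(2000) from a small main() and checking
for a 0 return is the intended use. A return of 1 would mean the wait ran
to its timeout despite the signal, which is roughly the symptom I see in
the real test suite.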
So, I'm requoting the rest of Omar's original report here, since
I'm not sure whether his use case involves epoll_pwait like mine does:
> Omar Kilani <omar.kilani@...il.com> wrote:
> > Basically, something’s broken (or at least, has changed enough to
> > cause problems in user space) in epoll since 5.0. It’s still broken in
> > 5.1-rc5.
> >
> > It doesn’t happen 100% of the time. It’s sort of hard to pin down but
> > I’ve observed the following:
> >
> > * nginx not accepting connections under load
> > * A java app which uses netty / NIO having strange writability
> > semantics on channels, which confuses netty / java enough to not
> > properly flush written data on the socket.
> >
> > I went and tested these Linux kernels:
> >
> > 4.20.17
> > 4.19.32
> > 4.14.111
> >
> > And the issue(s) do not show up there.
> >
> > I’m still actively chasing this up, and will report back — I haven’t
> > touched kernel code in 15 years so I’m a little rusty. :)
> >
> > Regards,
> > Omar