Message-Id: <BC680BE9-F6EB-4278-819C-C552009CA884@yale.edu>
Date:	Tue, 11 Dec 2012 17:23:08 -0500
From:	Andreas Voellmy <andreas.voellmy@...e.edu>
To:	viro@...iv.linux.org.uk, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Cc:	Andreas Voellmy <andreas.voellmy@...e.edu>
Subject: epoll with ONESHOT possibly fails to deliver events

Hi list,

I am using epoll for the Linux (version 3.4.0) implementation of the event notification subsystem of GHC's (Glasgow Haskell Compiler) RTS (runtime system). I am running into a bug that only shows up when using many cores (more than 16) and under a particular kind of load. I've been debugging for a couple of days now, and I can't find the error in the way that I am using epoll. I'm starting to wonder whether I am misunderstanding the semantics of epoll and TCP sockets (likely) or whether there is a bug in epoll itself (less likely).

Here is a simplified version of my epoll usage: My program is a multithreaded web server. I have one thread per TCP socket and each socket is marked non-blocking. Each thread serving a client socket repeats the following: 

1. receive a single http request's worth of bytes. 
2. send an http response.

For both steps, the thread performs a non-blocking operation (either recv or send). If and only if the call fails with EWOULDBLOCK or EAGAIN, it calls epoll_ctl to register the socket and then blocks on a condition variable. When the condition variable is signaled, it continues where it left off (either about to recv or about to send). The epoll_ctl call uses EPOLL_CTL_ADD the first time the socket is registered and EPOLL_CTL_MOD otherwise. The events field is EPOLLIN | EPOLLET | EPOLLONESHOT.

Another thread, distinct from all of the threads serving particular sockets, is performing epoll_wait calls. When an epoll_wait call returns sockets as ready, this thread signals the condition variable for each such socket. Since I am using EPOLLONESHOT, I assume that there is no need to also perform epoll_ctl with EPOLL_CTL_DEL here.

This guarantees that I wait for epoll to signal a socket's readiness only if (a) a recv or send has hit EAGAIN or EWOULDBLOCK, and (b) I have called epoll_ctl to arm the socket (the first time) or re-arm it (every later time).

The problem I am encountering is that sometimes a thread will block waiting for the readiness signal and will never get notified, even though there is data to be read. This behavior seems to go away when I remove the EPOLLONESHOT flag when registering the event.

Is my use of epoll (as I described here) OK? Is the following sequence possible? 

1. epoll reports activity on socket previously registered with ONESHOT; now socket is deactivated in epoll.
2. call to recv on socket returns EAGAIN or EWOULDBLOCK
3. data arrives on socket
4. epoll_ctl call rearms socket with epoll (with ONESHOT flag).
5. epoll_wait never returns the socket as being ready.
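
The five steps above can be replayed in a single thread with a socket pair. This is only a sketch: replay_sequence is a hypothetical helper, it uses an AF_UNIX socketpair rather than TCP, and because it is single-threaded it cannot reproduce a multi-core race. It only checks whether a re-arm issued after the data has already arrived still produces an event.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Returns 1 if epoll_wait reports the socket after the re-arm, 0 if it
 * times out instead, and -1 if any setup step fails.
 */
static int replay_sequence(void)
{
    int sv[2], n, result = -1;
    char byte = 'x';
    struct epoll_event ev = {0}, out;

    if (socketpair(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0, sv) == -1)
        return -1;
    int epfd = epoll_create1(0);
    if (epfd == -1)
        goto close_socks;

    ev.events = EPOLLIN | EPOLLET | EPOLLONESHOT;
    ev.data.fd = sv[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == -1)
        goto done;

    /* Step 1: epoll reports activity; ONESHOT now disables the socket. */
    if (send(sv[1], &byte, 1, 0) != 1 || epoll_wait(epfd, &out, 1, 1000) != 1)
        goto done;

    /* Step 2: drain the socket until recv fails with EAGAIN/EWOULDBLOCK. */
    while (recv(sv[0], &byte, 1, 0) > 0)
        ;
    if (errno != EAGAIN && errno != EWOULDBLOCK)
        goto done;

    /* Step 3: data arrives while the socket is still disabled in epoll. */
    if (send(sv[1], &byte, 1, 0) != 1)
        goto done;

    /* Step 4: re-arm with the same ONESHOT event mask. */
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, sv[0], &ev) == -1)
        goto done;

    /* Step 5: does epoll_wait report the socket, or time out after 1 s? */
    n = epoll_wait(epfd, &out, 1, 1000);
    if (n >= 0)
        result = (n == 1 && out.data.fd == sv[0]);

done:
    close(epfd);
close_socks:
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

A result of 0 from this function would mean the sequence in question can occur even without concurrency; a result of 1 says nothing about what happens when steps 3 and 4 run at the same time on different cores.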

Do I need to first call epoll_ctl and then call recv until I get to EAGAIN, or is it correct to call epoll_ctl for the file only after I've hit EAGAIN on a recv? 

I have looked over the epoll source here: http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=blob;f=fs/eventpoll.c;h=c0b3c70ee87a2b8e0e46c01a87d63ac692aecc71;hb=refs/heads/linux-3.4.y and I don't see how EPOLLONESHOT could result in the event sequence above, but I'm not that familiar with the code, so it would be great if others can confirm as well. 

I am not subscribed to the kernel list, so please include my email on replies.

Cheers,
Andi