Message-Id: <2F8919C4-395F-4029-8D4F-AD16A3815FC5@yale.edu>
Date:	Thu, 13 Dec 2012 10:29:17 -0500
From:	Andreas Voellmy <andreas.voellmy@...e.edu>
To:	Eric Wong <normalperson@...t.net>
Cc:	viro@...iv.linux.org.uk, linux-fsdevel@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: epoll with ONESHOT possibly fails to deliver events

Hi Eric, 

On Dec 13, 2012, at 4:32 AM, Eric Wong <normalperson@...t.net> wrote:

> Andreas Voellmy <andreas.voellmy@...e.edu> wrote:
> 
>>> Another thread, distinct from all of the threads serving particular
>>> sockets, is performing epoll_wait calls. When sockets are returned as
>>> being ready from an epoll_wait call, the thread signals to the
>>> condition variable for the socket.
> 
> Perhaps there is a bug in the way your epoll_wait thread
> uses the condition variable to notify other threads?
> 

This is possible; I've tried very hard (e.g., by adding assertions that check various error conditions) to ensure that there is no problem in signaling the other threads. From everything I can tell, it is working properly.
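
For concreteness, the kind of handoff under discussion has roughly the following shape. This is only an illustrative sketch in C, not the runtime's actual code: the names ready_slot, slot_init, slot_signal, slot_wait and check are made up, and it assumes one mutex, one condition variable and one flag per socket. The flag is the predicate; because it is set and tested under the mutex, a signal that arrives before the waiter blocks is not lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical per-socket readiness slot (all names are illustrative). */
struct ready_slot {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            ready;   /* predicate, only touched with lock held */
};

/* Assert that a pthread call succeeded (pthread calls return 0 on success). */
static void check(int rc)
{
    assert(rc == 0);
    (void)rc;                /* silence unused warning when NDEBUG is set */
}

void slot_init(struct ready_slot *s)
{
    check(pthread_mutex_init(&s->lock, NULL));
    check(pthread_cond_init(&s->cond, NULL));
    s->ready = false;
}

/* Called by the epoll_wait thread when the socket is reported ready. */
void slot_signal(struct ready_slot *s)
{
    check(pthread_mutex_lock(&s->lock));
    s->ready = true;                        /* record the event first */
    check(pthread_cond_signal(&s->cond));   /* then wake the waiter, if any */
    check(pthread_mutex_unlock(&s->lock));
}

/* Called by the thread serving the socket; returns once it is ready. */
void slot_wait(struct ready_slot *s)
{
    check(pthread_mutex_lock(&s->lock));
    while (!s->ready)                       /* loop guards against spurious wakeups */
        check(pthread_cond_wait(&s->cond, &s->lock));
    s->ready = false;                       /* consume the notification */
    check(pthread_mutex_unlock(&s->lock));
}
```

With this shape, calling slot_signal before slot_wait still lets slot_wait return immediately, which is the property the assertions are meant to protect.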

> 
>>> The problem I am encountering is that sometimes a thread will block
>>> waiting for the readiness signal and will never get notified, even
>>> though there is data to be read. This behavior seems to go away when
>>> I remove EPOLLONESHOT flag when registering the event. 
> 
> Is the thread the one waiting on the condition variable or epoll_wait?
> In your situation (stream I/O via multiple threads, single epoll
> descriptor), I think EPOLLONESHOT is the /only/ sane thing to do.

The one waiting on the condition variable.

I think I've narrowed down the problem a bit more. My program has multiple epoll instances. Most of them monitor sockets; one monitors an eventfd that is written to by other threads. The problem only occurs when I write to the eventfd after servicing each HTTP request on a socket, i.e. when the epoll instance monitoring the eventfd is returning from a blocking epoll_wait call very frequently. If I don't do that write, or if I use a different notification facility (for example poll) to monitor the eventfd, the problem goes away. So it looks like there may be some way in which different epoll instances can interfere with each other.
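
To make the eventfd part of the setup concrete, here is a minimal single-threaded sketch in C. It is not the runtime's code and it does not reproduce the failure; it only walks through the documented EPOLLONESHOT behaviour on an eventfd (one report, then silence until the registration is re-armed with EPOLL_CTL_MOD). The helper names wait_once, notify and oneshot_eventfd_demo are made up for illustration, and the 100 ms timeout is an arbitrary choice.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for one event on epfd.
 * Returns the number of ready descriptors (0 on timeout, -1 on error). */
static int wait_once(int epfd, int timeout_ms)
{
    struct epoll_event out;
    return epoll_wait(epfd, &out, 1, timeout_ms);
}

/* Hypothetical helper: add 1 to the eventfd counter, as a worker thread would. */
static void notify(int efd)
{
    uint64_t one = 1;
    ssize_t n = write(efd, &one, sizeof one);
    assert(n == (ssize_t)sizeof one);
    (void)n;
}

/* Returns 0 if every step behaves as documented, else the failing step number.
 * Descriptors are not closed on the error paths to keep the sketch short. */
int oneshot_eventfd_demo(void)
{
    int efd = eventfd(0, EFD_NONBLOCK);
    int epfd = epoll_create1(0);
    if (efd < 0 || epfd < 0)
        return 1;

    struct epoll_event ev = { .events = EPOLLIN | EPOLLONESHOT };
    ev.data.fd = efd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) < 0)
        return 2;

    notify(efd);
    if (wait_once(epfd, 100) != 1)      /* first write is reported */
        return 3;

    notify(efd);
    if (wait_once(epfd, 100) != 0)      /* one-shot fired: now disabled */
        return 4;

    uint64_t count;
    if (read(efd, &count, sizeof count) != (ssize_t)sizeof count)
        return 5;                       /* drain the counter (resets it to 0) */
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, efd, &ev) < 0)
        return 6;                       /* re-arm the registration */

    notify(efd);
    if (wait_once(epfd, 100) != 1)      /* reported again after re-arming */
        return 7;

    close(epfd);
    close(efd);
    return 0;
}
```

In my program the writes come from other threads and each epoll instance has its own thread blocked in epoll_wait, so the real interleavings are much less tidy than this.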

This setup probably sounds weird to you, but I'm trying to spare you from having to understand my whole application. It is part of a multicore runtime system for a programming language with user-level threads, and explaining the full story would probably take more time than you want to spend. But I can provide more detail if you like.

-Andi