Message-ID: <Pine.LNX.4.64.0810261452221.19212@alien.or.mcafeemobile.com>
Date:	Sun, 26 Oct 2008 15:07:24 -0700 (PDT)
From:	Davide Libenzi <davidel@...ilserver.org>
To:	Paul P <ppak_98@...oo.com>
cc:	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: unexpected extra pollout events from epoll

On Sun, 26 Oct 2008, Paul P wrote:

> I am programming a server using the epoll interface and have the receive portion of the server working fine, but for some reason as I implement the send portion, I noticed a few things that seem like strange behaviors in the implementation of epoll in the kernel.
> 
> I'm running Opensuse 11 and it has a 2.6.25 kernel.
> 
> The behavior that I am seeing is that when I do a full read on an 
> edge-triggered fd, it seems, for some reason, to trigger an epollout 
> event after each loop of the read events on a socket (before I've done 
> any writes at all to the socket).
> 
> This is very strange behavior as I would expect that the epollout event 
> would only be triggered if I did a write and the socket received an ack 
> which cleared out the send buffer.
> 
> The documentation on epollout is really sparse, so any help at all from 
> the list would be very much appreciated.  Do I need to manually arm the 
> epollout flag after a write?  I thought this was only necessary for 
> level triggered epoll.

Epoll works by hooking into the existing kernel poll subsystem. It hooks 
into the poll wakeups via a callback, and in that way it knows that 
"something" has changed. Then it reads the status of the file via 
f_op->poll() to find out what is actually ready.
What happens is that, if you listen for EPOLLIN|EPOLLOUT, when a packet 
arrives the callback hook is hit, and the file is put into a maybe-ready 
list. Maybe-ready because at the time of the callback, epoll has no clue 
of what happened.
After that, via epoll_wait(), f_op->poll() is called to get the status of 
the file, and since POLLIN|POLLOUT is returned (and since you're listening 
for EPOLLIN|EPOLLOUT), that gets reported back to you.
The POLLOUT event, in the sense of a buffer-full->buffer-avail transition, 
did not really happen, but since POLLOUT is true at that moment, it gets 
reported back too. Again, this is because epoll has no clue of what 
happened at the time the callback was hit.
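
To make the sequence above concrete, here is a small sketch that registers 
one end of a socket pair for EPOLLIN|EPOLLOUT|EPOLLET and records what 
epoll_wait() reports before and after the peer sends a byte. The AF_UNIX 
socketpair, the 100 ms timeout and the helper names wait_mask() and 
observe_masks() are assumptions made for illustration; they are not from 
the original report, which involved a network socket. On a kernel that 
behaves as described, the second mask would be expected to carry EPOLLOUT 
next to EPOLLIN even though nothing was written on the registered end, but 
that depends on the kernel in use.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: wait once (up to 100 ms) and return the reported
 * event mask, or 0 if nothing was reported or the wait failed. */
static uint32_t wait_mask(int epfd)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, 100);
    return n == 1 ? ev.events : 0;
}

/* Hypothetical helper: store the mask seen right after registration in
 * *first and the mask seen after the peer sent one byte in *second.
 * Returns 0 on success, -1 if the setup failed. */
static int observe_masks(uint32_t *first, uint32_t *second)
{
    int sv[2], epfd, rc = -1;
    struct epoll_event ev;

    assert(first != NULL && second != NULL);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    epfd = epoll_create1(0);
    if (epfd >= 0) {
        ev.events = EPOLLIN | EPOLLOUT | EPOLLET;
        ev.data.fd = sv[0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) == 0) {
            /* Freshly registered socket: writable, nothing to read. */
            *first = wait_mask(epfd);
            /* The peer sends data; this is the "packet arrives" wakeup. */
            if (write(sv[1], "x", 1) == 1) {
                /* epoll re-polls the file and reports the current state. */
                *second = wait_mask(epfd);
                rc = 0;
            }
        }
        close(epfd);
    }
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

Calling observe_masks(&a, &b) and printing both masks in hex shows 
whether the kernel at hand reports EPOLLOUT together with the EPOLLIN 
caused by the incoming byte.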
I'm working on changes that will make epoll aware (by using the existing 
support for the "key" parameter of wakeups) of events at callback time, 
but this is something that is still up for discussion and definitely won't 
be in .28.
The best way to do it at the moment is to wait for POLLOUT only when 
really needed.
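
One way to read "only when really needed" is sketched below: keep the fd 
registered for EPOLLIN|EPOLLET only, try the write directly, and add 
EPOLLOUT to the interest set only once the kernel answers EAGAIN, 
removing it again when the pending data has been flushed. This is a 
minimal sketch under assumptions, not code from this thread: struct conn, 
set_nonblocking(), conn_set_events() and conn_flush() are hypothetical 
names, the fd is assumed to be a non-blocking socket already added to 
epfd with data.ptr pointing at the conn, and send() with MSG_NOSIGNAL is 
used in place of write() to avoid SIGPIPE.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical per-connection state (not from the original message). */
struct conn {
    int fd;
    const char *out;   /* pending output buffer */
    size_t out_len;    /* total bytes in out */
    size_t out_off;    /* bytes already sent */
    int want_out;      /* 1 while EPOLLOUT is in the interest set */
};

static int set_nonblocking(int fd)
{
    int fl = fcntl(fd, F_GETFL, 0);
    return fl < 0 ? -1 : fcntl(fd, F_SETFL, fl | O_NONBLOCK);
}

static int conn_set_events(int epfd, struct conn *c, uint32_t events)
{
    struct epoll_event ev = { .events = events, .data.ptr = c };
    return epoll_ctl(epfd, EPOLL_CTL_MOD, c->fd, &ev);
}

/* Send as much pending data as possible. Call it after queuing output
 * and again whenever EPOLLOUT is reported for this connection.
 * Returns 0 if everything was sent or the rest was deferred, -1 on error. */
static int conn_flush(int epfd, struct conn *c)
{
    assert(c != NULL && c->out_off <= c->out_len);
    while (c->out_off < c->out_len) {
        ssize_t n = send(c->fd, c->out + c->out_off,
                         c->out_len - c->out_off, MSG_NOSIGNAL);
        if (n > 0) {
            c->out_off += (size_t)n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;
        if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
            /* Send buffer is full: only now ask epoll for EPOLLOUT. */
            if (!c->want_out) {
                if (conn_set_events(epfd, c,
                                    EPOLLIN | EPOLLOUT | EPOLLET) < 0)
                    return -1;
                c->want_out = 1;
            }
            return 0;
        }
        return -1;   /* hard error, or an unexpected zero-byte send */
    }
    /* Nothing left to send: stop listening for EPOLLOUT. */
    if (c->want_out) {
        if (conn_set_events(epfd, c, EPOLLIN | EPOLLET) < 0)
            return -1;
        c->want_out = 0;
    }
    return 0;
}
```

With this arrangement, a read-only wakeup is never accompanied by 
EPOLLOUT unless a previous write actually hit a full send buffer.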




- Davide

