Date:	Mon, 14 Mar 2016 18:35:07 -0400
From:	Jason Baron <jbaron@...mai.com>
To:	"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Cc:	mingo@...nel.org, peterz@...radead.org, viro@....linux.org.uk,
	normalperson@...t.net, m@...odev.com, corbet@....net,
	luto@...capital.net, torvalds@...ux-foundation.org, hagen@...u.net,
	linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
	linux-api@...r.kernel.org
Subject: Re: [PATCH] epoll: add exclusive wakeups flag

Hi Michael,

On 03/14/2016 05:03 PM, Michael Kerrisk (man-pages) wrote:
> Hi Jason,
> 
> On 03/15/2016 09:01 AM, Michael Kerrisk (man-pages) wrote:
>> Hi Jason,
>>
>> On 03/15/2016 08:32 AM, Jason Baron wrote:
>>>
>>>
>>> On 03/14/2016 01:47 PM, Michael Kerrisk (man-pages) wrote:
>>>> [Restoring CC, which I see I accidentally dropped, one iteration back.]
> 
> [...]
> 
>>>> Returning to the second sentence in this description:
>>>>
>>>>               When a wakeup event occurs and multiple epoll file
>>>>               descriptors are attached to the same target file using
>>>>               EPOLLEXCLUSIVE, one or more of the epoll file descriptors
>>>>               will receive an event with epoll_wait(2).
>>>>
>>>> There is a point that is unclear to me: what does "target file" refer to?
>>>> Is it an open file description (aka open file table entry) or an inode?
>>>> I suspect the former, but it was not clear in your original text.
>>>>
>>>
>>> So from epoll's perspective, the wakeups are associated with a 'wait
>>> queue'. So if the open() and subsequent EPOLL_CTL_ADD (which is done via
>>> file->poll()) results in adding to the same 'wait queue' then we will
>>> get 'exclusive' wakeup behavior.
>>>
>>> So in general, I think the answer here is that it's associated with the
>>> inode (I couldn't say with 100% certainty without really looking at all
>>> file->poll() implementations). Certainly, with the 'FIFO' example below,
>>> the two scenarios will have the same behavior with respect to
>>> EPOLLEXCLUSIVE.
> 
> So, I was actually a little surprised by this, and went away and tested
> this point. It appears to me that the two scenarios described below
> do NOT have the same behavior with respect to EPOLLEXCLUSIVE. See below.
> 
>> So, in both scenarios, *one or more* processes will get a wakeup?
>> (I'll try to add something to the text to clarify the detail we're 
>> discussing.)
>>
>>> Also, the 'non-exclusive' mode would be subject to the same question of
>>> which wait queue the epfd is associated with...
>>
>> I'm not sure of the point you are trying to make here?
>>
>> Cheers,
>>
>> Michael
>>
>>
>>>> To make this point even clearer, here are two scenarios I'm thinking of.
>>>> In each case, we're talking of monitoring the read end of a FIFO.
>>>>
>>>> ===
>>>>
>>>> Scenario 1:
>>>>
>>>> We have three processes each of which
>>>> 1. Creates an epoll instance
>>>> 2. Opens the read end of the FIFO
>>>> 3. Adds the read end of the FIFO to the epoll instance, specifying
>>>>    EPOLLEXCLUSIVE
>>>>
>>>> When input becomes available on the FIFO, how many processes
>>>> get a wakeup?
> 
> When I test this scenario, all three processes get a wakeup.
> 
>>>> ===
>>>>
>>>> Scenario 2:
>>>>
>>>> A parent process opens the read end of a FIFO and then calls
>>>> fork() three times to create three children. Each child then:
>>>>
>>>> 1. Creates an epoll instance
>>>> 2. Adds the read end of the FIFO to the epoll instance, specifying
>>>>    EPOLLEXCLUSIVE
>>>>
>>>> When input becomes available on the FIFO, how many processes
>>>> get a wakeup?
> 
> When I test this scenario, one process gets a wakeup.
> 
> In other words, "target file" appears to mean open file description
> (aka open file table entry), not inode.
> 
> This is actually what I suspected might be the case, but now I am
> puzzled. Given what I've discovered and what you suggest are the
> semantics, is the implementation correct? (I suspect that it is,
> but it is at odds with your statement above.) My test programs are
> inline below.
> 
> Cheers,
> 
> Michael
> 

Thanks for the test cases. In your first test case, each process exits
immediately after epoll_wait() returns, and that exit is what causes the
next wakeup. The 2nd process then returns from epoll_wait() and exits,
which causes the 3rd wakeup.

So the wakeups are not all coming directly from the write; the later ones
come from the readers doing a close(). If you add some sort of sleep
after the epoll_wait(), you can confirm the behavior. So I believe this
is working as expected.
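
For what it's worth, the sleep experiment could be sketched roughly as
below. This is a self-contained illustration, not one of the test programs
from this thread: the function name count_write_wakeups(), the temporary
FIFO under /tmp, the O_RDWR open in the parent, and the timing constants
are all assumptions made for the example. Each waiter does its own open()
as in scenario 1, but stays alive for a second after epoll_wait() returns,
so that only wakeups caused by the write itself get counted.

```c
#define _GNU_SOURCE
#include <sys/epoll.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1u << 28)
#endif

#define WAIT_MS   500   /* assumed: how long each waiter blocks */
#define HOLD_SECS 1     /* assumed: delay before a waiter exits */

/* One waiter, as in scenario 1: its own open() and its own epoll instance. */
static void
waiter(const char *path, int report_fd)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLEXCLUSIVE }, rev;
    int epfd = epoll_create1(0);
    int fd = open(path, O_RDONLY);

    if (epfd == -1 || fd == -1 ||
            epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
        _exit(EXIT_FAILURE);

    /* Report a real wakeup only; a timeout reports nothing. */
    if (epoll_wait(epfd, &rev, 1, WAIT_MS) == 1 &&
            write(report_fd, "w", 1) != 1)
        _exit(EXIT_FAILURE);

    sleep(HOLD_SECS);   /* keep fd open so our close() comes too late to
                           wake the other waiters */
    _exit(EXIT_SUCCESS);
}

/* Returns how many of nwaiters woke up after one write, or -1 on error. */
int
count_write_wakeups(int nwaiters)
{
    char dir[] = "/tmp/epollexcl.XXXXXX", path[64], c;
    int report[2], wfd, woken = 0;

    assert(nwaiters > 0);
    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(path, sizeof(path), "%s/fifo", dir);
    if (mkfifo(path, 0600) == -1 || pipe(report) == -1)
        return -1;

    /* On Linux, O_RDWR on a FIFO does not block; it keeps a writer
       present so the waiters' open(O_RDONLY) returns at once. */
    wfd = open(path, O_RDWR);
    if (wfd == -1)
        return -1;

    for (int i = 0; i < nwaiters; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0) {
            close(report[0]);
            close(wfd);
            waiter(path, report[1]);    /* does not return */
        }
    }
    close(report[1]);

    usleep(200000);     /* crude: give the waiters time to block */
    if (write(wfd, "x", 1) != 1)
        return -1;

    /* One byte per woken waiter; EOF once every waiter has exited. */
    while (read(report[0], &c, 1) == 1)
        woken++;
    while (wait(NULL) > 0)
        continue;

    close(report[0]);
    close(wfd);
    unlink(path);
    rmdir(dir);
    return woken;
}
```

Calling count_write_wakeups(3) from a small main() and printing the result
shows how many waiters the single write woke. The epoll_wait(2) text quoted
above only promises "one or more", and the timing here is crude, so the
number should be treated as indicative. Whether a reader's close() produces
a further wakeup may also depend on the kernel version.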

Thanks,

-Jason


> ============
> 
> /* t_EPOLLEXCLUSIVE_multiopen.c
> 
>    Licensed under GNU GPLv2 or later.
> */
> #include <sys/epoll.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <sys/types.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <string.h>
> 
> #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
>                         } while (0)
> 
> #define usageErr(msg, progName) \
>                         do { fprintf(stderr, "Usage: "); \
>                              fprintf(stderr, msg, progName); \
>                              exit(EXIT_FAILURE); } while (0)
> 
> #ifndef EPOLLEXCLUSIVE
> #define EPOLLEXCLUSIVE (1 << 28)
> #endif
> 
> int
> main(int argc, char *argv[])
> {
>     int fd, epfd, nready;
>     struct epoll_event ev, rev;
> 
>     if (argc != 2 || strcmp(argv[1], "--help") == 0)
>         usageErr("%s <FIFO>\n", argv[0]);
> 
>     epfd = epoll_create(2);
>     if (epfd == -1)
>         errExit("epoll_create");
> 
>     fd = open(argv[1], O_RDONLY);
>     if (fd == -1)
>         errExit("open");
>     printf("Opened %s\n", argv[1]);
> 
>     ev.events = EPOLLIN | EPOLLEXCLUSIVE;
>     if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
>         errExit("epoll_ctl");
> 
>     nready = epoll_wait(epfd, &rev, 1, -1);
>     if (nready == -1)
>         errExit("epoll_wait");
>     printf("epoll_wait() returned %d\n", nready);
> 
>     exit(EXIT_SUCCESS);
> }
> 
> ===============
> 
> /* t_EPOLLEXCLUSIVE_fork.c 
> 
>    Licensed under GNU GPLv2 or later.
> */
> 
> #include <sys/epoll.h>
> #include <sys/stat.h>
> #include <fcntl.h>
> #include <sys/types.h>
> #include <sys/wait.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <string.h>
> 
> #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
>                         } while (0)
> 
> #define usageErr(msg, progName) \
>                         do { fprintf(stderr, "Usage: "); \
>                              fprintf(stderr, msg, progName); \
>                              exit(EXIT_FAILURE); } while (0)
> 
> #ifndef EPOLLEXCLUSIVE
> #define EPOLLEXCLUSIVE (1 << 28)
> #endif
> 
> int
> main(int argc, char *argv[])
> {
>     int fd, epfd, nready;
>     struct epoll_event ev, rev;
>     int cnum;
> 
>     if (argc != 2 || strcmp(argv[1], "--help") == 0)
>         usageErr("%s <FIFO>\n", argv[0]);
> 
>     fd = open(argv[1], O_RDONLY);
>     if (fd == -1)
>         errExit("open");
>     printf("Opened %s\n", argv[1]);
> 
>     for (cnum = 0; cnum < 3; cnum++) {
>         switch (fork()) {
>         case -1:
>             errExit("fork");
> 
>         case 0: /* Child */
>             epfd = epoll_create(2);
>             if (epfd == -1)
>                 errExit("epoll_create");
> 
>             ev.events = EPOLLIN | EPOLLEXCLUSIVE;
>             if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
>                 errExit("epoll_ctl");
> 
>             nready = epoll_wait(epfd, &rev, 1, -1);
>             if (nready == -1)
>                 errExit("epoll_wait");
>             printf("Child %d: epoll_wait() returned %d\n", cnum, nready);
>             exit(EXIT_SUCCESS);
> 
>         default:
>             break;
>         }
>     }
> 
>     wait(NULL);
>     wait(NULL);
>     wait(NULL);
> 
>     exit(EXIT_SUCCESS);
> }
> 
