Message-ID: <de62d81aa9c8865a2af30c10158462f0@silodev.com>
Date:	Tue, 15 Mar 2016 01:09:21 +0200
From:	Madars Vitolins <m@...odev.com>
To:	Jason Baron <jbaron@...mai.com>,
	"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>, mingo@...nel.org,
	peterz@...radead.org, viro@....linux.org.uk, normalperson@...t.net,
	corbet@....net, luto@...capital.net, torvalds@...ux-foundation.org,
	hagen@...u.net, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, linux-api@...r.kernel.org
Subject: Re: [PATCH] epoll: add exclusive wakeups flag

Hi Jason and Michael,

Hmm... I tried the pipe samples below, but even with a sleep added, all 
of the processes were woken up (maybe I am missing something too). I 
also tried with EPOLLIN.

On the same basis I created a sample using POSIX message queues with 
EPOLLIN | EPOLLEXCLUSIVE, and the good news is that it works correctly.

file q.c:
==================
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/epoll.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <errno.h>
#include <unistd.h>
#include <mqueue.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                         } while (0)

#define usageErr(msg, progName) \
                         do { fprintf(stderr, "Usage: "); \
                              fprintf(stderr, msg, progName); \
                              exit(EXIT_FAILURE); } while (0)

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1 << 28)
#endif

#define MAX_SIZE 10

int
main (int argc, char *argv[])
{
   int epfd, nready;
   struct epoll_event ev, rev;
   mqd_t fd;
   struct mq_attr attr;
   char buffer[MAX_SIZE + 1];
   int cnum;

   /* initialize the queue attributes */
   attr.mq_flags = 0;
   attr.mq_maxmsg = 5;
   attr.mq_msgsize = MAX_SIZE;
   attr.mq_curmsgs = 0;

   /* cleanup for multiple runs... */
   mq_unlink ("/TESTQ");

   /* create the message queue */
   fd =
     mq_open ("/TESTQ", O_CREAT | O_RDWR | O_NONBLOCK, S_IWUSR | S_IRUSR,
	     &attr);
   if (fd == -1)
     errExit ("open");

   for (cnum = 0; cnum < 3; cnum++)
     {
       switch (fork ())
	{
	case -1:
	  errExit ("fork");

	case 0:		/* Child */
	  epfd = epoll_create (2);
	  if (epfd == -1)
	    errExit ("epoll_create");

	  ev.events = EPOLLIN | EPOLLEXCLUSIVE;
	  if (epoll_ctl (epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
	    errExit ("epoll_ctl");

	  printf ("About to wait...\n");
	  nready = epoll_wait (epfd, &rev, 1, -1);
	  if (nready == -1)
	    errExit ("epoll-wait");

	  printf ("Child %d: epoll_wait() returned %d\n", cnum, nready);
	  exit (EXIT_SUCCESS);

	default:
	  break;
	}
     }
   sleep (1);
   /* send a message to the queue */
   memset (buffer, 0, MAX_SIZE);
   if (0 > mq_send (fd, buffer, MAX_SIZE, 0))
     errExit ("mq_send");
   printf ("msg sent ok...\n");

   wait (NULL);
   wait (NULL);
   wait (NULL);

   exit (EXIT_SUCCESS);
}
==================

$ gcc q.c -lrt
$ ./a.out
About to wait...
About to wait...
About to wait...
msg sent ok...
Child 2: epoll_wait() returned 1
^C
$



Best regards,
Madars


Jason Baron @ 2016-03-15 00:35 wrote:
> Hi Michael,
> 
> On 03/14/2016 05:03 PM, Michael Kerrisk (man-pages) wrote:
>> Hi Jason,
>> 
>> On 03/15/2016 09:01 AM, Michael Kerrisk (man-pages) wrote:
>>> Hi Jason,
>>> 
>>> On 03/15/2016 08:32 AM, Jason Baron wrote:
>>>> 
>>>> 
>>>> On 03/14/2016 01:47 PM, Michael Kerrisk (man-pages) wrote:
>>>>> [Restoring CC, which I see I accidentally dropped, one iteration
>>>>> back.]
>> 
>> [...]
>> 
>>>>> Returning to the second sentence in this description:
>>>>> 
>>>>>               When a wakeup event occurs and multiple epoll file
>>>>>               descriptors are attached to the same target file using
>>>>>               EPOLLEXCLUSIVE, one or more of the epoll file
>>>>>               descriptors will receive an event with epoll_wait(2).
>>>>> 
>>>>> There is a point that is unclear to me: what does "target file"
>>>>> refer to? Is it an open file description (aka open file table
>>>>> entry) or an inode? I suspect the former, but it was not clear in
>>>>> your original text.
>>>>> 
>>>> 
>>>> So from epoll's perspective, the wakeups are associated with a 'wait
>>>> queue'. So if the open() and subsequent EPOLL_CTL_ADD (which is done
>>>> via file->poll()) results in adding to the same 'wait queue' then we
>>>> will get 'exclusive' wakeup behavior.
>>>> 
>>>> So in general, I think the answer here is that it's associated with
>>>> the inode (I couldn't say with 100% certainty without really looking
>>>> at all file->poll() implementations). Certainly, with the 'FIFO'
>>>> example below, the two scenarios will have the same behavior with
>>>> respect to EPOLLEXCLUSIVE.
>> 
>> So, I was actually a little surprised by this, and went away and
>> tested this point. It appears to me that the two scenarios described
>> below do NOT have the same behavior with respect to EPOLLEXCLUSIVE.
>> See below.
>> 
>>> So, in both scenarios, *one or more* processes will get a wakeup?
>>> (I'll try to add something to the text to clarify the detail we're
>>> discussing.)
>>> 
>>>> Also, the 'non-exclusive' mode would be subject to the same question
>>>> of which wait queue the epfd is associated with...
>>> 
>>> I'm not sure of the point you are trying to make here?
>>> 
>>> Cheers,
>>> 
>>> Michael
>>> 
>>> 
>>>>> To make this point even clearer, here are two scenarios I'm
>>>>> thinking of.
>>>>> In each case, we're talking of monitoring the read end of a FIFO.
>>>>> 
>>>>> ===
>>>>> 
>>>>> Scenario 1:
>>>>> 
>>>>> We have three processes each of which
>>>>> 1. Creates an epoll instance
>>>>> 2. Opens the read end of the FIFO
>>>>> 3. Adds the read end of the FIFO to the epoll instance, specifying
>>>>>    EPOLLEXCLUSIVE
>>>>> 
>>>>> When input becomes available on the FIFO, how many processes
>>>>> get a wakeup?
>> 
>> When I test this scenario, all three processes get a wakeup.
>> 
>>>>> ===
>>>>> 
>>>>> Scenario 2:
>>>>> 
>>>>> A parent process opens the read end of a FIFO and then calls
>>>>> fork() three times to create three children. Each child then:
>>>>> 
>>>>> 1. Creates an epoll instance
>>>>> 2. Adds the read end of the FIFO to the epoll instance, specifying
>>>>> EPOLLEXCLUSIVE
>>>>> 
>>>>> When input becomes available on the FIFO, how many processes
>>>>> get a wakeup?
>> 
>> When I test this scenario, one process gets a wakeup.
>> 
>> In other words, "target file" appears to mean open file description
>> (aka open file table entry), not inode.
>> 
>> This is actually what I suspected might be the case, but now I am
>> puzzled. Given what I've discovered and what you suggest are the
>> semantics, is the implementation correct? (I suspect that it is,
>> but it is at odds with your statement above.) My test programs are
>> inline below.
>> 
>> Cheers,
>> 
>> Michael
>> 
> 
> Thanks for the test cases. So in your first test case, you are exiting
> immediately after the epoll_wait() returns. So this is actually causing
> the next wakeup. And then the 2nd thread returns from epoll_wait() and
> this causes the 3rd wakeup.
> 
> So the wakeups are actually not happening from the write directly, but
> instead from the readers doing a close(). If you do some sort of sleep
> after the epoll_wait() you can confirm the behavior. So I believe this
> is working as expected.
> 
> Thanks,
> 
> -Jason
> 
> 
>> ============
>> 
>> /* t_EPOLLEXCLUSIVE_multipen.c
>> 
>>    Licensed under GNU GPLv2 or later.
>> */
>> #include <sys/epoll.h>
>> #include <sys/stat.h>
>> #include <fcntl.h>
>> #include <sys/types.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <unistd.h>
>> #include <string.h>
>> 
>> #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
>>                         } while (0)
>> 
>> #define usageErr(msg, progName) \
>>                         do { fprintf(stderr, "Usage: "); \
>>                              fprintf(stderr, msg, progName); \
>>                              exit(EXIT_FAILURE); } while (0)
>> 
>> #ifndef EPOLLEXCLUSIVE
>> #define EPOLLEXCLUSIVE (1 << 28)
>> #endif
>> 
>> int
>> main(int argc, char *argv[])
>> {
>>     int fd, epfd, nready;
>>     struct epoll_event ev, rev;
>> 
>>     if (argc != 2 || strcmp(argv[1], "--help") == 0)
>>         usageErr("%s <FIFO>\n", argv[0]);
>> 
>>     epfd = epoll_create(2);
>>     if (epfd == -1)
>>         errExit("epoll_create");
>> 
>>     fd = open(argv[1], O_RDONLY);
>>     if (fd == -1)
>>         errExit("open");
>>     printf("Opened %s\n", argv[1]);
>> 
>>     ev.events = EPOLLIN | EPOLLEXCLUSIVE;
>>     if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
>>         errExit("epoll_ctl");
>> 
>>     nready = epoll_wait(epfd, &rev, 1, -1);
>>     if (nready == -1)
>>         errExit("epoll-wait");
>>     printf("epoll_wait() returned %d\n", nready);
>> 
>>     exit(EXIT_SUCCESS);
>> }
>> 
>> ===============
>> 
>> /* t_EPOLLEXCLUSIVE_fork.c
>> 
>>    Licensed under GNU GPLv2 or later.
>> */
>> 
>> #include <sys/epoll.h>
>> #include <sys/stat.h>
>> #include <fcntl.h>
>> #include <sys/types.h>
>> #include <sys/wait.h>
>> #include <stdio.h>
>> #include <stdlib.h>
>> #include <unistd.h>
>> #include <string.h>
>> 
>> #define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
>>                         } while (0)
>> 
>> #define usageErr(msg, progName) \
>>                         do { fprintf(stderr, "Usage: "); \
>>                              fprintf(stderr, msg, progName); \
>>                              exit(EXIT_FAILURE); } while (0)
>> 
>> #ifndef EPOLLEXCLUSIVE
>> #define EPOLLEXCLUSIVE (1 << 28)
>> #endif
>> 
>> int
>> main(int argc, char *argv[])
>> {
>>     int fd, epfd, nready;
>>     struct epoll_event ev, rev;
>>     int cnum;
>> 
>>     if (argc != 2 || strcmp(argv[1], "--help") == 0)
>>         usageErr("%s <FIFO>\n", argv[0]);
>> 
>>     fd = open(argv[1], O_RDONLY);
>>     if (fd == -1)
>>         errExit("open");
>>     printf("Opened %s\n", argv[1]);
>> 
>>     for (cnum = 0; cnum < 3; cnum++) {
>>         switch (fork()) {
>>         case -1:
>>             errExit("fork");
>> 
>>         case 0: /* Child */
>>             epfd = epoll_create(2);
>>             if (epfd == -1)
>>                 errExit("epoll_create");
>> 
>>             ev.events = EPOLLIN | EPOLLEXCLUSIVE;
>>             if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
>>                 errExit("epoll_ctl");
>> 
>>             nready = epoll_wait(epfd, &rev, 1, -1);
>>             if (nready == -1)
>>                 errExit("epoll-wait");
>>             printf("Child %d: epoll_wait() returned %d\n", cnum, nready);
>>             exit(EXIT_SUCCESS);
>> 
>>         default:
>>             break;
>>         }
>>     }
>> 
>>     wait(NULL);
>>     wait(NULL);
>>     wait(NULL);
>> 
>>     exit(EXIT_SUCCESS);
>> }
>> 
