Message-ID: <56E7273D.3010403@gmail.com>
Date:	Tue, 15 Mar 2016 10:03:57 +1300
From:	"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To:	Jason Baron <jbaron@...mai.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Cc:	mtk.manpages@...il.com, mingo@...nel.org, peterz@...radead.org,
	viro@....linux.org.uk, normalperson@...t.net, m@...odev.com,
	corbet@....net, luto@...capital.net, torvalds@...ux-foundation.org,
	hagen@...u.net, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, linux-api@...r.kernel.org
Subject: Re: [PATCH] epoll: add exclusive wakeups flag

Hi Jason,

On 03/15/2016 09:01 AM, Michael Kerrisk (man-pages) wrote:
> Hi Jason,
> 
> On 03/15/2016 08:32 AM, Jason Baron wrote:
>>
>>
>> On 03/14/2016 01:47 PM, Michael Kerrisk (man-pages) wrote:
>>> [Restoring CC, which I see I accidentally dropped, one iteration back.]

[...]

>>> Returning to the second sentence in this description:
>>>
>>>               When a wakeup event occurs and multiple epoll file descrip‐
>>>               tors are attached to the same target file using EPOLLEXCLU‐
>>>               SIVE, one or  more  of  the  epoll  file  descriptors  will
>>>               receive  an  event with epoll_wait(2).
>>>
>>> There is a point that is unclear to me: what does "target file" refer to?
>>> Is it an open file description (aka open file table entry) or an inode?
>>> I suspect the former, but it was not clear in your original text.
>>>
>>
>> So from epoll's perspective, the wakeups are associated with a 'wait
>> queue'. So if the open() and subsequent EPOLL_CTL_ADD (which is done via
>> file->poll()) results in adding to the same 'wait queue' then we will
>> get 'exclusive' wakeup behavior.
>>
>> So in general, I think the answer here is that it's associated with the
>> inode (I couldn't say with 100% certainty without really looking at all
>> file->poll() implementations). Certainly, with the 'FIFO' example below,
>> the two scenarios will have the same behavior with respect to
>> EPOLLEXCLUSIVE.

So, I was actually a little surprised by this, and went away and tested
this point. It appears to me that the two scenarios described below
do NOT have the same behavior with respect to EPOLLEXCLUSIVE. See below.

> So, in both scenarios, *one or more* processes will get a wakeup?
> (I'll try to add something to the text to clarify the detail we're 
> discussing.)
> 
>> Also, the 'non-exclusive' mode would be subject to the same question of
>> which wait queue the epfd is associated with...
> 
> I'm not sure of the point you are trying to make here?
> 
> Cheers,
> 
> Michael
> 
> 
>>> To make this point even clearer, here are two scenarios I'm thinking of.
>>> In each case, we're talking of monitoring the read end of a FIFO.
>>>
>>> ===
>>>
>>> Scenario 1:
>>>
>>> We have three processes each of which
>>> 1. Creates an epoll instance
>>> 2. Opens the read end of the FIFO
>>> 3. Adds the read end of the FIFO to the epoll instance, specifying
>>>    EPOLLEXCLUSIVE
>>>
>>> When input becomes available on the FIFO, how many processes
>>> get a wakeup?

When I test this scenario, all three processes get a wakeup.

>>> ===
>>>
>>> Scenario 2:
>>>
>>> A parent process opens the read end of a FIFO and then calls
>>> fork() three times to create three children. Each child then:
>>>
>>> 1. Creates an epoll instance
>>> 2. Adds the read end of the FIFO to the epoll instance, specifying
>>>    EPOLLEXCLUSIVE
>>>
>>> When input becomes available on the FIFO, how many processes
>>> get a wakeup?

When I test this scenario, one process gets a wakeup.

In other words, "target file" appears to mean open file description
(aka open file table entry), not inode.

This is actually what I suspected might be the case, but now I am
puzzled. Given what I've discovered and what you suggest are the
semantics, is the implementation correct? (I suspect that it is,
but it is at odds with your statement above.) My test programs are
inline below.

Cheers,

Michael

============

/* t_EPOLLEXCLUSIVE_multiopen.c

   Licensed under GNU GPLv2 or later.
*/
#include <sys/epoll.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/types.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

#define usageErr(msg, progName) \
                        do { fprintf(stderr, "Usage: "); \
                             fprintf(stderr, msg, progName); \
                             exit(EXIT_FAILURE); } while (0)

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1 << 28)
#endif

int
main(int argc, char *argv[])
{
    int fd, epfd, nready;
    struct epoll_event ev, rev;

    if (argc != 2 || strcmp(argv[1], "--help") == 0)
        usageErr("%s <FIFO>\n", argv[0]);

    epfd = epoll_create(2);
    if (epfd == -1)
        errExit("epoll_create");

    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        errExit("open");
    printf("Opened %s\n", argv[1]);

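    /* Monitor the FIFO for input, requesting exclusive wakeups */

    ev.events = EPOLLIN | EPOLLEXCLUSIVE;
    ev.data.fd = fd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
        errExit("epoll_ctl");

    /* Block until the FIFO becomes readable */

    nready = epoll_wait(epfd, &rev, 1, -1);
    if (nready == -1)
        errExit("epoll_wait");
    printf("epoll_wait() returned %d\n", nready);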

    exit(EXIT_SUCCESS);
}

===============

/* t_EPOLLEXCLUSIVE_fork.c 

   Licensed under GNU GPLv2 or later.
*/

#include <sys/epoll.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>

#define errExit(msg)    do { perror(msg); exit(EXIT_FAILURE); \
                        } while (0)

#define usageErr(msg, progName) \
                        do { fprintf(stderr, "Usage: "); \
                             fprintf(stderr, msg, progName); \
                             exit(EXIT_FAILURE); } while (0)

#ifndef EPOLLEXCLUSIVE
#define EPOLLEXCLUSIVE (1 << 28)
#endif

int
main(int argc, char *argv[])
{
    int fd, epfd, nready;
    struct epoll_event ev, rev;
    int cnum;

    if (argc != 2 || strcmp(argv[1], "--help") == 0)
        usageErr("%s <FIFO>\n", argv[0]);

    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        errExit("open");
    printf("Opened %s\n", argv[1]);

    for (cnum = 0; cnum < 3; cnum++) {
        switch (fork()) {
        case -1:
            errExit("fork");

        case 0: /* Child */
            epfd = epoll_create(2);
            if (epfd == -1)
                errExit("epoll_create");

            /* Each child adds the inherited descriptor (same open
               file description) to its own epoll instance */

            ev.events = EPOLLIN | EPOLLEXCLUSIVE;
            ev.data.fd = fd;
            if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == -1)
                errExit("epoll_ctl");

            nready = epoll_wait(epfd, &rev, 1, -1);
            if (nready == -1)
                errExit("epoll_wait");
            printf("Child %d: epoll_wait() returned %d\n", cnum, nready);
            exit(EXIT_SUCCESS);

        default:
            break;
        }
    }

    wait(NULL);
    wait(NULL);
    wait(NULL);

    exit(EXIT_SUCCESS);
}

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
