Message-ID: <56E6F941.9040307@gmail.com>
Date:	Tue, 15 Mar 2016 06:47:45 +1300
From:	"Michael Kerrisk (man-pages)" <mtk.manpages@...il.com>
To:	Jason Baron <jbaron@...mai.com>,
	Andrew Morton <akpm@...ux-foundation.org>
Cc:	mtk.manpages@...il.com, mingo@...nel.org, peterz@...radead.org,
	viro@....linux.org.uk, normalperson@...t.net, m@...odev.com,
	corbet@....net, luto@...capital.net, torvalds@...ux-foundation.org,
	hagen@...u.net, linux-kernel@...r.kernel.org,
	linux-fsdevel@...r.kernel.org, linux-api@...r.kernel.org
Subject: Re: [PATCH] epoll: add exclusive wakeups flag

[Restoring CC, which I see I accidentally dropped, one iteration back.]

Hi Jason,

Thanks for the review. I've tweaked one piece to respond to your
feedback. But I also have another new question below.

On 03/15/2016 03:55 AM, Jason Baron wrote:
> On 03/11/2016 06:25 PM, Michael Kerrisk (man-pages) wrote:
>> On 03/11/2016 09:51 PM, Jason Baron wrote:
>>> On 03/11/2016 03:30 PM, Michael Kerrisk (man-pages) wrote:

[...]

> Hi Michael,
> 
> Looks good. One comment below.
> 
> Thanks,
> 
>>        EPOLLEXCLUSIVE (since Linux 4.5)
>>               Sets  an  exclusive  wakeup  mode  for  the  epoll  file
>>               descriptor  that  is  being  attached to the target file
>>               descriptor, fd.  When a wakeup event occurs and multiple
>>               epoll  file  descriptors are attached to the same target
>>               file using EPOLLEXCLUSIVE, one or more of the epoll file
>>               descriptors  will  receive  an event with epoll_wait(2).
>>               The default in this scenario (when EPOLLEXCLUSIVE is not
>>               set)  is  for  all  epoll file descriptors to receive an
>>               event.  EPOLLEXCLUSIVE is thus useful for avoiding thun‐
>>               dering herd problems in certain scenarios.
>>
>>               If  the  same  file  descriptor  is  in  multiple  epoll
>>               instances, some with the EPOLLEXCLUSIVE flag, and others
>>               without,  then  events  will  be provided to all epoll
>>               instances that did not specify  EPOLLEXCLUSIVE,  and  at
>>               least  one  of  the  epoll  instances  that  did specify
>>               EPOLLEXCLUSIVE.
>>
>>               The following values may  be  specified  in  conjunction
>>               with EPOLLEXCLUSIVE: EPOLLIN, EPOLLOUT, EPOLLWAKEUP, and
>>               EPOLLET.  EPOLLHUP and EPOLLERR can also  be  specified,
>>               but  are  ignored (as usual).  Attempts to specify other
> 
> I'm not sure 'ignored' is the right wording here. 'EPOLLHUP' and
> 'EPOLLERR' are always included in the set of events when something is
> added as EPOLLEXCLUSIVE. This is consistent with the non-EPOLLEXCLUSIVE
> add case. 

Yes.

> So 'EPOLLHUP' and 'EPOLLERR' may be specified but will be
> included in the set of events on an add, whether they are specified or not.

Yes. I understand your discomfort with the word "ignored", but the 
problem was that, because it made special mention of EPOLLHUP and EPOLLERR,
your proposed text made it sound as though EPOLLEXCLUSIVE somehow was
special with respect to these two flags. I wanted to clarify that it is not.
How about this:

              The following values may  be  specified  in  conjunction
              with EPOLLEXCLUSIVE: EPOLLIN, EPOLLOUT, EPOLLWAKEUP, and
              EPOLLET.  EPOLLHUP and EPOLLERR can also  be  specified,
              but  this  is  not  required: as usual, these events are
              always reported if they  occur,  regardless  of  whether
              they are specified in events.
?

>>               values in events yield an error.  EPOLLEXCLUSIVE may  be
>>               used  only  in  an  EPOLL_CTL_ADD operation; attempts to
>>               employ  it  with  EPOLL_CTL_MOD  yield  an  error.    If
>>               EPOLLEXCLUSIVE has been set using epoll_ctl(2), then a subse‐
>>               quent EPOLL_CTL_MOD on the same epfd, fd pair yields  an
>>               error.  An epoll_ctl(2) that specifies EPOLLEXCLUSIVE in
>>               events and specifies the target file descriptor fd as an
>>               epoll  instance will likewise fail.  The error in all of
>>               these cases is EINVAL.
>>
>>    ERRORS
>>        EINVAL An invalid event type was specified along with  EPOLLEX‐
>>               CLUSIVE in events.
>>
>>        EINVAL op was EPOLL_CTL_MOD and events included EPOLLEXCLUSIVE.
>>
>>        EINVAL op  was  EPOLL_CTL_MOD  and  the EPOLLEXCLUSIVE flag has
>>               previously been applied to this epfd, fd pair.
>>
>>        EINVAL EPOLLEXCLUSIVE was specified in event and fd  refers
>>               to an epoll instance.

Returning to the second sentence in this description:

              When a wakeup event occurs and multiple epoll file descrip‐
              tors are attached to the same target file using EPOLLEXCLU‐
              SIVE, one or  more  of  the  epoll  file  descriptors  will
              receive  an  event with epoll_wait(2).

There is a point that is unclear to me: what does "target file" refer to?
Is it an open file description (aka open file table entry) or an inode?
I suspect the former, but it was not clear in your original text.

To make this point even clearer, here are two scenarios I'm thinking of.
In each case, we're talking of monitoring the read end of a FIFO.

===

Scenario 1:

We have three processes, each of which:
1. Creates an epoll instance
2. Opens the read end of the FIFO
3. Adds the read end of the FIFO to the epoll instance, specifying
   EPOLLEXCLUSIVE

When input becomes available on the FIFO, how many processes
get a wakeup?

===

Scenario 2:

A parent process opens the read end of a FIFO and then calls
fork() three times to create three children. Each child then:

1. Creates an epoll instance
2. Adds the read end of the FIFO to the epoll instance, specifying
   EPOLLEXCLUSIVE

When input becomes available on the FIFO, how many processes
get a wakeup?

===

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/
